SlideShare a Scribd company logo
1 of 29
Download to read offline
3/18/2010ISPD 20101
Accurate Clock Mesh Sizing via
Sequential Quadratic Programming
Venkata Rajesh Mekala,Yifang Liu, XiaojiYe, Jiang Hu, Peng Li
Department of ECE
Texas A&M University
OUTLINE
3/18/2010ISPD 20102
 Introduction
 PreviousWorks
 Problem Formulation
 Algorithm Overview
 Results
 Conclusions
Clock source
Flip flops
Local trees
Clock Architectures
Clock Tree
• low cost (wiring, power, cap)
• higher skew, jitter than mesh
• widely used in ASIC designs
• clock gating easy to incorporate
Flip-flops
Flip flops
tree
crosslink
crosslink
flip flops
Clock source
Hybrid: tree + cross-links
• low cost (wiring, power, cap)
• smaller skew, jitter than tree*
• difficult to analyze
Hybrid: mesh + local trees
Clock Mesh
• excellent for low skew, jitter
• high power, area, capacitance
• difficult to analyze
• clock gating not easy
• used in modern processors
3/18/20103 ISPD 2010
Clock Mesh
3/18/2010ISPD 20104
 Clock mesh architecture is very effective in reducing
skew variation.
 Clock mesh is difficult in analyzing with sufficient
accuracy.
 It dissipates higher power compared to other
architectures.
 The challenge is to design the mesh with less power
meeting the skew constraints.
3/18/2010ISPD 20105
Clock Distribution
Networks
ClockTrees Crosslinks Clock Mesh
Pullela, Menezes and Pileggi
Moment-sensitivity-based wire sizing
for skew reduction
1997
Guthaus, Sylvester and Brown
Clock buffer and wire sizing using
sequential programming
2006
Wang, Ran, Jiang and Sadowska
General skew constrained
clock network sizing based on
sequential linear programming
2005
Rajaram, Hu and Mahapatra
Reducing clock skew variability via
crosslinks
2006
Samanta, Hu and Li
Discrete buffer and wire sizing for
link-based
non-tree clock networks
2008
Desai, Cvijetic and Jensen
Sizing of clock distribution networks
for high performance CPU chips
1996
Rajaram and Pan
MeshWorks: An efficient framework
for planning, synthesis and
optimization of clock mesh networks
2008
Venkataraman, Feng, Hu and Li
Combinatorial algorithms for fast
clock mesh optimization
2006
Motivation & Our Contributions
3/18/2010ISPD 20106
 Current-source based gate modeling approach to
speedup the accurate analysis of clock mesh.
 Efficient adjoint sensitivity analysis to provide desirable
sensitivities.
 Algorithm based on rigorous SQP.
 First clock mesh sizing method that does systematic
solution search and is based on accurate delay model
Problem Formulation
3/18/2010Texas A&M University7
I is the set of interconnects
in the clock mesh
xi ; i Є I is the width of
element i in the
interconnect set
wi ; i Є I is the area of
element i in the
interconnect set
S is the set of sinks
or local trees
dj ; j Є S the propagation
delay of the signal from
the root of the
clock tree to sink j D is the coefficient
vector reflecting the
linear size-area
relation
µ is the average value of
the sink delays and δ is
the given maximum variance
Lx and Ux represent the
lower bound and upper
bound vectors of the wires
Problem Formulation
3/18/2010ISPD 20108
total clock mesh area
skew constraint in the variance
form
lower bound, upper bound vectors
of the wire widths
Higher wire area leads to a higher load capacitance for the clock buffers which in
turn implies a higher power dissipation.
Constraint in the quadratic form is a differentiable function
Solving the Problem
3/18/2010ISPD 20109
 Lagrangian of the original
problem:
 Gradient vector of the
Lagrangian function
is be obtained by circuit
simulation and adjoint sensitivity
analysis
Solving the Problem
3/18/2010ISPD 201010
 Lagrangian of the original
problem:
 Gradient vector of the
Lagrangian function
The adjoint sensitivity analysis
gives us the values of
Solving the Problem
3/18/2010ISPD 201011
 Lagrangian of the original
problem:
 Gradient vector of the
Lagrangian function
The sensitivities with respect to
wire widths are calculated with the
help of chain rule:
Solving the Problem
3/18/2010ISPD 201012
 Lagrangian of the original
problem:
 Gradient vector of the
Lagrangian function
 Necessary conditions for
any optimal point of the
problem – KKT
conditions
Common way to solve this
equation is by Newton’s
method.
Solving the Problem
3/18/2010ISPD 201013
 Let the Newton step in
iteration k of solving the
equation be:
x, λ are variables in the
equation.
px,k and pλ,k are the vectors
representing change in width
of wires and Lagrangian
multiplier.
Solving the Problem
3/18/2010ISPD 201014
 Let the Newton step in
iteration k of solving the
equation be:
 Jacobian of the equation
is:
 Hessian of the Lagrangian
function:
 Newton step calculation
implies that px,k and pλ,k
satisfy the following
system:
Solving the Problem
3/18/2010ISPD 201015
 Newton step calculation
implies that px,k and pλ,k
satisfy the following
system:
 Adjusting the above
equation gives us:
 This equation is solved by:
 Minimize:
 Subject to:
Solving the QP sub-problem
3/18/2010ISPD 201016
 The QP sub-problem to
be solved as a part of SQP
is:
Minimize:
Subject to:
and
Solving the QP sub-problem
3/18/2010ISPD 201017
 The QP sub-problem to
be solved as a part of SQP
is:
Minimize:
Subject to:
and
through sensitivity analysis we
obtain the gradient.
the sensitivities with respect to
wire widths are calculated with the
help of chain rule:
Solving the QP sub-problem
3/18/2010ISPD 201018
 The QP sub-problem to
be solved as a part of SQP
is:
Minimize:
Subject to:
and
we use quasi-newton (BFGS )
method to approximate the
hessian in each iteration
Sensitivity Analysis
3/18/2010ISPD 201019
 Sensitivity information of the original circuit obtained by
convolution-like computation between transient
waveforms of the original and the adjoint circuit.
 Compact gate model provides up to two orders of
magnitude speedup over SPICE simulation while
maintaining the same level of accuracy.
P. Li, Z. Feng and E.Acar.“Characterizing multistage nonlinear drivers and
variability for accurate timing and noise analysis". In IEEE Trans.Very Large
Scale Integration, pp 205 - 214, November 2007.
X.Ye and P. Li. “An application-specic adjoint sensitivity analysis framework
for clock mesh sensitivity computation". In Proc. of IEEE International
Symposium on Quality Electronic Design, pp 634 - 640, 2009.
CMSSQP Framework
3/18/2010ISPD 201020
Initialization of the design
(No. of buffers, benchmark and clock mesh)
Generate spice netlist
Sensitivity Analysis
(Sensitivities of the 𝜎2 with respect to wire widths)
Quasi-Newton approximation of Hessian
Optimization
Formulate and Solve the Quadratic Programming
sub-problem
Update the widths of the clock mesh
Transient Simulation
(Compute the delays, slew to every sink node)
Convergence
criterion met?STOP
C++
MOSEK
SPICE
YES NO
SPICE
Results
3/18/2010ISPD 201021
 Experimental Setup
 65nm technology transistor
models for the buffers
 (m rows X n columns) mesh
 Max skew
 Linux platform having two
Intel Xeon E5410 quad-cores
 ISCAS, ISPD benchmarks
 Widths limited
Initial clock mesh design
3/18/2010ISPD 201022
Results after executing CMSSQP
3/18/2010ISPD 201023
Summary: Reduction in area
3/18/2010ISPD 201024
Area-skew tradeoff by varying δ
3/18/2010ISPD 201025
ISPD:
ispd09f11
Case(a): (σ2 < δ), σ2 , total clock mesh area in
each iteration
3/18/2010ISPD 201026
Case(b): (σ2 > δ), σ2 , total clock mesh area in
each iteration
3/18/2010ISPD 201027
Conclusions & Future work
3/18/2010ISPD 201028
 Presented an algorithm for reduction of clock mesh area
satisfying specified skew constraints in a clock mesh.
 Robust in dealing with any complex clock mesh network.
 First clock mesh sizing method that does systematic
solution search and is based on accurate delay model.
 Experimental results achieved about 33% reduction in
clock mesh area.
 Can be extended to size interconnects, mesh buffers
simultaneously.
3/18/2010ISPD 201029
Thanks

More Related Content

What's hot

Embedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual ReviewEmbedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual ReviewSudhanshu Janwadkar
 
Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...
Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...
Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...VLSI SYSTEM Design
 
A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...
A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...
A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...Arun Joseph
 
Physical design-complete
Physical design-completePhysical design-complete
Physical design-completeMurali Rai
 
Study of inter and intra chip variations
Study of inter and intra chip variationsStudy of inter and intra chip variations
Study of inter and intra chip variationsRajesh M
 
Lecture20 asic back_end_design
Lecture20 asic back_end_designLecture20 asic back_end_design
Lecture20 asic back_end_designHung Nguyen
 
tau 2015 spyrou fpga timing
tau 2015 spyrou fpga timingtau 2015 spyrou fpga timing
tau 2015 spyrou fpga timingTom Spyrou
 
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGALOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGAVLSICS Design
 
Implementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew GroupsImplementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew GroupsM Mei
 
Aruna Ravi - M.S Thesis
Aruna Ravi - M.S ThesisAruna Ravi - M.S Thesis
Aruna Ravi - M.S ThesisArunaRavi
 
IRJET- An Efficient and Low Power Sram Testing using Clock Gating
IRJET-  	  An Efficient and Low Power Sram Testing using Clock GatingIRJET-  	  An Efficient and Low Power Sram Testing using Clock Gating
IRJET- An Efficient and Low Power Sram Testing using Clock GatingIRJET Journal
 
Understanding cts log_messages
Understanding cts log_messagesUnderstanding cts log_messages
Understanding cts log_messagesMujahid Mohammed
 
An fpga based efficient fruit recognition system using minimum
An fpga based efficient fruit recognition system using minimumAn fpga based efficient fruit recognition system using minimum
An fpga based efficient fruit recognition system using minimumAlexander Decker
 
Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...
Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...
Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...IRJET Journal
 

What's hot (20)

Embedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual ReviewEmbedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual Review
 
Floor plan & Power Plan
Floor plan & Power Plan Floor plan & Power Plan
Floor plan & Power Plan
 
Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...
Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...
Define Width and Height of Core and Die (http://www.vlsisystemdesign.com/PD-F...
 
A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...
A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...
A Hybrid Approach to Standard Cell Power Characterization based on PVT Indepe...
 
Physical design-complete
Physical design-completePhysical design-complete
Physical design-complete
 
Chapter1.slides
Chapter1.slidesChapter1.slides
Chapter1.slides
 
Study of inter and intra chip variations
Study of inter and intra chip variationsStudy of inter and intra chip variations
Study of inter and intra chip variations
 
Lecture20 asic back_end_design
Lecture20 asic back_end_designLecture20 asic back_end_design
Lecture20 asic back_end_design
 
Dr.s.shiyamala fpga ppt
Dr.s.shiyamala  fpga pptDr.s.shiyamala  fpga ppt
Dr.s.shiyamala fpga ppt
 
tau 2015 spyrou fpga timing
tau 2015 spyrou fpga timingtau 2015 spyrou fpga timing
tau 2015 spyrou fpga timing
 
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGALOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA
 
Implementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew GroupsImplementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew Groups
 
Aruna Ravi - M.S Thesis
Aruna Ravi - M.S ThesisAruna Ravi - M.S Thesis
Aruna Ravi - M.S Thesis
 
IRJET- An Efficient and Low Power Sram Testing using Clock Gating
IRJET-  	  An Efficient and Low Power Sram Testing using Clock GatingIRJET-  	  An Efficient and Low Power Sram Testing using Clock Gating
IRJET- An Efficient and Low Power Sram Testing using Clock Gating
 
Understanding cts log_messages
Understanding cts log_messagesUnderstanding cts log_messages
Understanding cts log_messages
 
An fpga based efficient fruit recognition system using minimum
An fpga based efficient fruit recognition system using minimumAn fpga based efficient fruit recognition system using minimum
An fpga based efficient fruit recognition system using minimum
 
Gv2512441247
Gv2512441247Gv2512441247
Gv2512441247
 
Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...
Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...
Implementation of FPGA Based Image Processing Algorithm using Xilinx System G...
 
Major project iii 3
Major project  iii  3Major project  iii  3
Major project iii 3
 
Vlsi titles
Vlsi titlesVlsi titles
Vlsi titles
 

Similar to Clock mesh sizing slides

Silicon Photonics and Photonic NoCs: A Survey
Silicon Photonics and Photonic NoCs: A SurveySilicon Photonics and Photonic NoCs: A Survey
Silicon Photonics and Photonic NoCs: A Surveydrv11291
 
Low Power Threshold Logic Designing Approach for High Energy Efficient Flip-Flop
Low Power Threshold Logic Designing Approach for High Energy Efficient Flip-FlopLow Power Threshold Logic Designing Approach for High Energy Efficient Flip-Flop
Low Power Threshold Logic Designing Approach for High Energy Efficient Flip-FlopAssociate Professor in VSB Coimbatore
 
Samtec whitepaper
Samtec whitepaperSamtec whitepaper
Samtec whitepaperjohn_111
 
Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...
Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...
Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...Piero Belforte
 
DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)
DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)
DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)Piero Belforte
 
Three Dimensional Modelling of MISFET
Three Dimensional Modelling of MISFETThree Dimensional Modelling of MISFET
Three Dimensional Modelling of MISFETIJERA Editor
 
Precision based data aggregation to extend life of wsn
Precision based data aggregation to extend life of wsnPrecision based data aggregation to extend life of wsn
Precision based data aggregation to extend life of wsnGaurang Rathod
 
Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...
Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...
Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...idescitation
 
01255490crosstalk noise
01255490crosstalk noise01255490crosstalk noise
01255490crosstalk noisesandeep patil
 
01255490crosstalk noise
01255490crosstalk noise01255490crosstalk noise
01255490crosstalk noisesandeep patil
 
An improved ant colony optimization algorithm for wire optimization
An improved ant colony optimization algorithm for wire optimizationAn improved ant colony optimization algorithm for wire optimization
An improved ant colony optimization algorithm for wire optimizationjournalBEEI
 
Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...
Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...
Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...Achintya Kumar
 
Design of Three-Input XOR/XNOR using Systematic Cell Design Methodology
Design of Three-Input XOR/XNOR using Systematic Cell Design MethodologyDesign of Three-Input XOR/XNOR using Systematic Cell Design Methodology
Design of Three-Input XOR/XNOR using Systematic Cell Design MethodologyAssociate Professor in VSB Coimbatore
 
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...TrungKienVu3
 
An approach to Measure Transition Density of Binary Sequences for X-filling b...
An approach to Measure Transition Density of Binary Sequences for X-filling b...An approach to Measure Transition Density of Binary Sequences for X-filling b...
An approach to Measure Transition Density of Binary Sequences for X-filling b...IJECEIAES
 

Similar to Clock mesh sizing slides (20)

Silicon Photonics and Photonic NoCs: A Survey
Silicon Photonics and Photonic NoCs: A SurveySilicon Photonics and Photonic NoCs: A Survey
Silicon Photonics and Photonic NoCs: A Survey
 
Low Power Threshold Logic Designing Approach for High Energy Efficient Flip-Flop
Low Power Threshold Logic Designing Approach for High Energy Efficient Flip-FlopLow Power Threshold Logic Designing Approach for High Energy Efficient Flip-Flop
Low Power Threshold Logic Designing Approach for High Energy Efficient Flip-Flop
 
Samtec whitepaper
Samtec whitepaperSamtec whitepaper
Samtec whitepaper
 
Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...
Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...
Digital Wave Formulation of Quasi-Static Partial Element Equivalent Circuit M...
 
DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)
DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)
DIGITAL WAVE FORMULATION OF PEEC METHOD (SLIDES)
 
Three Dimensional Modelling of MISFET
Three Dimensional Modelling of MISFETThree Dimensional Modelling of MISFET
Three Dimensional Modelling of MISFET
 
Precision based data aggregation to extend life of wsn
Precision based data aggregation to extend life of wsnPrecision based data aggregation to extend life of wsn
Precision based data aggregation to extend life of wsn
 
Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...
Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...
Energy Efficient Power Failure Diagonisis For Wireless Network Using Random G...
 
WCNC
WCNCWCNC
WCNC
 
01255490crosstalk noise
01255490crosstalk noise01255490crosstalk noise
01255490crosstalk noise
 
01255490crosstalk noise
01255490crosstalk noise01255490crosstalk noise
01255490crosstalk noise
 
An improved ant colony optimization algorithm for wire optimization
An improved ant colony optimization algorithm for wire optimizationAn improved ant colony optimization algorithm for wire optimization
An improved ant colony optimization algorithm for wire optimization
 
Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...
Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...
Performance Analysis of Encoder in Different Logic Techniques for High-Speed ...
 
Design of Three-Input XOR/XNOR using Systematic Cell Design Methodology
Design of Three-Input XOR/XNOR using Systematic Cell Design MethodologyDesign of Three-Input XOR/XNOR using Systematic Cell Design Methodology
Design of Three-Input XOR/XNOR using Systematic Cell Design Methodology
 
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
 
wireless power transfer
wireless power transferwireless power transfer
wireless power transfer
 
Bl34395398
Bl34395398Bl34395398
Bl34395398
 
E04503052056
E04503052056E04503052056
E04503052056
 
An approach to Measure Transition Density of Binary Sequences for X-filling b...
An approach to Measure Transition Density of Binary Sequences for X-filling b...An approach to Measure Transition Density of Binary Sequences for X-filling b...
An approach to Measure Transition Density of Binary Sequences for X-filling b...
 
International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI),International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI),
 

More from Rajesh M

Daily Habits.pdf
Daily Habits.pdfDaily Habits.pdf
Daily Habits.pdfRajesh M
 
Clock relationships
Clock relationshipsClock relationships
Clock relationshipsRajesh M
 
Node Scaling Objectives
Node Scaling ObjectivesNode Scaling Objectives
Node Scaling ObjectivesRajesh M
 
Technology scaling introduction
Technology scaling introductionTechnology scaling introduction
Technology scaling introductionRajesh M
 
Problems between Synthesis and preCTS
Problems between Synthesis and preCTSProblems between Synthesis and preCTS
Problems between Synthesis and preCTSRajesh M
 
Setup fixing
Setup fixingSetup fixing
Setup fixingRajesh M
 
Vlsi best notes google docs
Vlsi best notes   google docsVlsi best notes   google docs
Vlsi best notes google docsRajesh M
 
#50 ethics
#50 ethics#50 ethics
#50 ethicsRajesh M
 
Power Reduction Techniques
Power Reduction TechniquesPower Reduction Techniques
Power Reduction TechniquesRajesh M
 
680report final
680report final680report final
680report finalRajesh M
 
Vlsi interview questions compilation
Vlsi interview questions compilationVlsi interview questions compilation
Vlsi interview questions compilationRajesh M
 

More from Rajesh M (12)

Daily Habits.pdf
Daily Habits.pdfDaily Habits.pdf
Daily Habits.pdf
 
Clock relationships
Clock relationshipsClock relationships
Clock relationships
 
Node Scaling Objectives
Node Scaling ObjectivesNode Scaling Objectives
Node Scaling Objectives
 
Technology scaling introduction
Technology scaling introductionTechnology scaling introduction
Technology scaling introduction
 
Problems between Synthesis and preCTS
Problems between Synthesis and preCTSProblems between Synthesis and preCTS
Problems between Synthesis and preCTS
 
Setup fixing
Setup fixingSetup fixing
Setup fixing
 
Vlsi best notes google docs
Vlsi best notes   google docsVlsi best notes   google docs
Vlsi best notes google docs
 
#50 ethics
#50 ethics#50 ethics
#50 ethics
 
Power Reduction Techniques
Power Reduction TechniquesPower Reduction Techniques
Power Reduction Techniques
 
680report final
680report final680report final
680report final
 
Vlsi interview questions compilation
Vlsi interview questions compilationVlsi interview questions compilation
Vlsi interview questions compilation
 
Eco
EcoEco
Eco
 

Recently uploaded

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 

Recently uploaded (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

Clock mesh sizing slides

  • 1. 3/18/2010ISPD 20101 Accurate Clock Mesh Sizing via Sequential Quadratic Programming Venkata Rajesh Mekala,Yifang Liu, XiaojiYe, Jiang Hu, Peng Li Department of ECE Texas A&M University
  • 2. OUTLINE 3/18/2010ISPD 20102  Introduction  PreviousWorks  Problem Formulation  Algorithm Overview  Results  Conclusions
  • 3. Clock source Flip flops Local trees Clock Architectures Clock Tree • low cost (wiring, power, cap) • higher skew, jitter than mesh • widely used in ASIC designs • clock gating easy to incorporate Flip-flops Flip flops tree crosslink crosslink flip flops Clock source Hybrid: tree + cross-links • low cost (wiring, power, cap) • smaller skew, jitter than tree* • difficult to analyze Hybrid: mesh + local trees Clock Mesh • excellent for low skew, jitter • high power, area, capacitance • difficult to analyze • clock gating not easy • used in modern processors 3/18/20103 ISPD 2010
  • 4. Clock Mesh 3/18/2010ISPD 20104  Clock mesh architecture is very effective in reducing skew variation.  Clock mesh is difficult in analyzing with sufficient accuracy.  It dissipates higher power compared to other architectures.  The challenge is to design the mesh with less power meeting the skew constraints.
  • 5. 3/18/2010ISPD 20105 Clock Distribution Networks ClockTrees Crosslinks Clock Mesh Pullela, Menezes and Pileggi Moment-sensitivity-based wire sizing for skew reduction 1997 Guthaus, Sylvester and Brown Clock buffer and wire sizing using sequential programming 2006 Wang, Ran, Jiang and Sadowska General skew constrained clock network sizing based on sequential linear programming 2005 Rajaram, Hu and Mahapatra Reducing clock skew variability via crosslinks 2006 Samanta, Hu and Li Discrete buffer and wire sizing for link-based non-tree clock networks 2008 Desai, Cvijetic and Jensen Sizing of clock distribution networks for high performance CPU chips 1996 Rajaram and Pan MeshWorks: An efficient framework for planning, synthesis and optimization of clock mesh networks 2008 Venkataraman, Feng, Hu and Li Combinatorial algorithms for fast clock mesh optimization 2006
  • 6. Motivation & Our Contributions 3/18/2010ISPD 20106  Current-source based gate modeling approach to speedup the accurate analysis of clock mesh.  Efficient adjoint sensitivity analysis to provide desirable sensitivities.  Algorithm based on rigorous SQP.  First clock mesh sizing method that does systematic solution search and is based on accurate delay model
  • 7. Problem Formulation 3/18/2010Texas A&M University7 I is the set of interconnects in the clock mesh xi ; i Є I is the width of element i in the interconnect set wi ; i Є I is the area of element i in the interconnect set S is the set of sinks or local trees dj ; j Є S the propagation delay of the signal from the root of the clock tree to sink j D is the coefficient vector reflecting the linear size-area relation µ is the average value of the sink delays and δ is the given maximum variance Lx and Ux represent the lower bound and upper bound vectors of the wires
  • 8. Problem Formulation 3/18/2010ISPD 20108 total clock mesh area skew constraint in the variance form lower bound, upper bound vectors of the wire widths Higher wire area leads to a higher load capacitance for the clock buffers which in turn implies a higher power dissipation. Constraint in the quadratic form is a differentiable function
  • 9. Solving the Problem 3/18/2010ISPD 20109  Lagrangian of the original problem:  Gradient vector of the Lagrangian function is be obtained by circuit simulation and adjoint sensitivity analysis
  • 10. Solving the Problem 3/18/2010ISPD 201010  Lagrangian of the original problem:  Gradient vector of the Lagrangian function The adjoint sensitivity analysis gives us the values of
  • 11. Solving the Problem 3/18/2010ISPD 201011  Lagrangian of the original problem:  Gradient vector of the Lagrangian function The sensitivities with respect to wire widths are calculated with the help of chain rule:
  • 12. Solving the Problem 3/18/2010ISPD 201012  Lagrangian of the original problem:  Gradient vector of the Lagrangian function  Necessary conditions for any optimal point of the problem – KKT conditions Common way to solve this equation is by Newton’s method.
  • 13. Solving the Problem 3/18/2010ISPD 201013  Let the Newton step in iteration k of solving the equation be: x, λ are variables in the equation. px,k and pλ,k are the vectors representing change in width of wires and Lagrangian multiplier.
  • 14. Solving the Problem 3/18/2010ISPD 201014  Let the Newton step in iteration k of solving the equation be:  Jacobian of the equation is:  Hessian of the Lagrangian function:  Newton step calculation implies that px,k and pλ,k satisfy the following system:
  • 15. Solving the Problem 3/18/2010ISPD 201015  Newton step calculation implies that px,k and pλ,k satisfy the following system:  Adjusting the above equation gives us:  This equation is solved by:  Minimize:  Subject to:
  • 16. Solving the QP sub-problem 3/18/2010ISPD 201016  The QP sub-problem to be solved as a part of SQP is: Minimize: Subject to: and
  • 17. Solving the QP sub-problem 3/18/2010ISPD 201017  The QP sub-problem to be solved as a part of SQP is: Minimize: Subject to: and through sensitivity analysis we obtain the gradient. the sensitivities with respect to wire widths are calculated with the help of chain rule:
  • 18. Solving the QP sub-problem 3/18/2010ISPD 201018  The QP sub-problem to be solved as a part of SQP is: Minimize: Subject to: and we use quasi-newton (BFGS ) method to approximate the hessian in each iteration
  • 19. Sensitivity Analysis 3/18/2010ISPD 201019  Sensitivity information of the original circuit obtained by convolution-like computation between transient waveforms of the original and the adjoint circuit.  Compact gate model provides up to two orders of magnitude speedup over SPICE simulation while maintaining the same level of accuracy. P. Li, Z. Feng and E.Acar.“Characterizing multistage nonlinear drivers and variability for accurate timing and noise analysis". In IEEE Trans.Very Large Scale Integration, pp 205 - 214, November 2007. X.Ye and P. Li. “An application-specic adjoint sensitivity analysis framework for clock mesh sensitivity computation". In Proc. of IEEE International Symposium on Quality Electronic Design, pp 634 - 640, 2009.
  • 20. CMSSQP Framework 3/18/2010ISPD 201020 Initialization of the design (No. of buffers, benchmark and clock mesh) Generate spice netlist Sensitivity Analysis (Sensitivities of the 𝜎2 with respect to wire widths) Quasi-Newton approximation of Hessian Optimization Formulate and Solve the Quadratic Programming sub-problem Update the widths of the clock mesh Transient Simulation (Compute the delays, slew to every sink node) Convergence criterion met?STOP C++ MOSEK SPICE YES NO SPICE
  • 21. Results 3/18/2010ISPD 201021  Experimental Setup  65nm technology transistor models for the buffers  (m rows X n columns) mesh  Max skew  Linux platform having two Intel Xeon E5410 quad-cores  ISCAS, ISPD benchmarks  Widths limited
  • 22. Initial clock mesh design 3/18/2010ISPD 201022
  • 23. Results after executing CMSSQP 3/18/2010ISPD 201023
  • 24. Summary: Reduction in area 3/18/2010ISPD 201024
  • 25. Area-skew tradeoff by varying δ 3/18/2010ISPD 201025 ISPD: ispd09f11
  • 26. Case(a): (σ2 < δ), σ2 , total clock mesh area in each iteration 3/18/2010ISPD 201026
  • 27. Case(b): (σ2 > δ), σ2 , total clock mesh area in each iteration 3/18/2010ISPD 201027
  • 28. Conclusions & Future work 3/18/2010ISPD 201028  Presented an algorithm for reduction of clock mesh area satisfying specified skew constraints in a clock mesh.  Robust in dealing with any complex clock mesh network.  First clock mesh sizing method that does systematic solution search and is based on accurate delay model.  Experimental results achieved about 33% reduction in clock mesh area.  Can be extended to size interconnects, mesh buffers simultaneously.