Advanced Floorplanning and Clock Tree Techniques for Handling Large Regular Structures
Paul Dudek, Sr. Physical Design Engr.
J. Bhasker, Architect
eSilicon Corporation
Introduction
Some complex, timing-critical chip layouts require manual user guidance to work around tool limitations. This presentation gives a few examples of chip layout challenges and the types of solutions that have been used at eSilicon.
Buffer/FF Stages Between Hierarchical Blocks
[Figure label: center block]
Hierarchical block pins were distributed and manually adjusted so that they could later drive the custom routing and the placement of the repeater buffer and FF cells inside the channels between the blocks.
[Figure labels: input_port; outport]
Repeater buffers drove hundreds of signals left-to-right, right-to-left, top-to-bottom, and bottom-to-top, and needed to be placed at exact locations.
The buffer/flop repeater stages were grouped and made into temporary hierarchical blocks.
[Figure labels: hierarchical blocks; temporary hierarchical block (flattened after the floorplanning stage); custom routing to/from the repeater stages]
[Figure: actual view of the repeater cells within a single Xbar column]
Main benefits of creating temporary hierarchical blocks vs. creating fenced cell regions:
1. Ease of adding custom routing, which was based on the
hierarchical and temporary block pin locations.
It would have been difficult to add routing based on cells
“randomly” clustered in a fenced region.
2. Ease of placing cells at an exact location inside the
temporary blocks. A separate script was written to place
the cells in a stepping pattern to prevent overlaps, with
placement based on the block pin locations.
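
The stepping placement described in item 2 can be sketched in a few lines. This is a hypothetical Python illustration, not the authors' script, which is not shown in the slides and would have run inside the layout tool. The function name `place_stepped`, the coordinate convention (cell origins aligned in y with pins along a vertical block edge), and all dimensions are assumptions made for the example.

```python
import math

def place_stepped(pin_ys, cell_w, cell_h, x0=0.0):
    """Return (x, y) cell origins, one per pin, stepped in x to avoid overlaps.

    Hypothetical sketch: each cell is aligned in y with its block pin. When
    pins sit closer together than one cell height, consecutive cells are
    shifted sideways by one cell width, cycling through `steps` columns.
    """
    pins = sorted(pin_ys)
    if len(pins) < 2:
        return [(x0, y) for y in pins]
    # The smallest gap between neighboring pins decides how many columns we need.
    pitch = min(b - a for a, b in zip(pins, pins[1:]))
    if pitch <= 0:
        raise ValueError("pin locations must be distinct")
    steps = max(1, math.ceil(cell_h / pitch))
    return [(x0 + (i % steps) * cell_w, y) for i, y in enumerate(pins)]

def overlaps(a, b, cell_w, cell_h):
    """True if two equally sized cells with origins a and b share any area."""
    return abs(a[0] - b[0]) < cell_w and abs(a[1] - b[1]) < cell_h

# Made-up example: pins every 2 units, cells 3 wide and 5 tall,
# which needs 3 stepping columns.
spots = place_stepped([0, 2, 4, 6, 8, 10], cell_w=3, cell_h=5)
```

Cells that land in the same column are `steps` pins apart, so their vertical separation is at least one cell height; cells in different columns are separated by at least one cell width.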
[Figure: final view of the repeater stages after the temporary blocks were flattened. Labels: repeater cells for vertical routing; repeater cells for horizontal routing; hierarchical blocks]
Critical Routing - ‘thick’ M8/M9
All custom critical routing was done in thick M8 and M9 above the hierarchical blocks. The blocks had an M7 PG mesh with max, 70% util.
[Figure labels: M9 custom vertical signal routes; M8 custom horizontal signal routes; ‘Center’ hierarchical blocks]
Critical Routing
[Figure: Xbar custom routing, corner view. Labels: M9 custom vertical signal routes; M8 custom horizontal signal routes]
Critical Routing - shielding
Note that the M8/M9 PG mesh was built in such a way that only one signal route could be routed between neighboring PG wires, resulting in automatic, pre-built shielding for all critical routes.
[Figure labels: M8/M9 ‘thick metal’ VSS/VDD mesh; M8/M9 routing tracks (or routing grids)]
The resistance of the thick M8/M9 layers was about 6x lower
than that of other layers, allowing “strong” buffers to drive
them over long distances without slew degradation.
With the M8/M9 PG mesh in every other track, the custom
routes were automatically shielded by construction.
Small hierarchical blocks were easily built with six
signal layers plus maximum M7 PG mesh coverage, thus
eliminating any crosstalk with the M8/M9 custom routes above.
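
The "shielded by construction" idea can be illustrated with a small track-assignment sketch. It is a hypothetical Python model, not the flow used on the design: the names `interleave_tracks` and `is_shielded`, the signal names, and the choice to alternate VSS and VDD on successive PG tracks are assumptions for illustration only.

```python
PG_NETS = ("VSS", "VDD")

def interleave_tracks(signals):
    """Return a track list of the form PG, signal, PG, signal, ..., PG.

    Hypothetical model: a PG wire occupies every other track, so each
    signal dropped into a free track has a PG neighbor on both sides.
    """
    tracks = []
    for i, sig in enumerate(signals):
        tracks.append(PG_NETS[i % 2])          # PG wire on every other track
        tracks.append(sig)                     # one signal in the gap
    tracks.append(PG_NETS[len(signals) % 2])   # closing PG wire
    return tracks

def is_shielded(tracks):
    """True if every signal track has a PG wire on both neighboring tracks."""
    for i, net in enumerate(tracks):
        if net in PG_NETS:
            continue
        if i == 0 or i == len(tracks) - 1:
            return False  # a signal on the outermost track has one open side
        if tracks[i - 1] not in PG_NETS or tracks[i + 1] not in PG_NETS:
            return False
    return True

# Made-up three-bit bus routed on one thick-metal layer.
layout = interleave_tracks(["bus[0]", "bus[1]", "bus[2]"])
```

Because only one free track exists between PG wires, no router decision is needed to obtain shielding; the check function simply confirms what the mesh geometry already guarantees.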
Clock Tree
Clock “sync” cells, which attempt to balance the clock tree skew, made the skew worse. This was due to the narrow channels between the xbar hierarchical blocks.
[Figure labels: clock tree skew balance cells; existing repeater cells]
A custom clock tree was built using a script that placed clock inverter cells down to the level of 64 quadrants, after which automatic CTS was run so that the tool built the remaining tree.
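
Since 64 = 4^3, a tree that reaches 64 quadrants needs three levels of four-way splitting. The sketch below computes candidate driver locations for such a tree. It is a hypothetical Python illustration that assumes an idealized, perfectly symmetric split of a rectangular core; the function name `htree_levels` and the coordinates are invented, and the actual script would also have had to respect the floorplan (blocks, channels, legal placement sites).

```python
def htree_levels(x0, y0, x1, y1, levels):
    """Return one list of (x, y) region centers per tree level.

    Level 0 is the root at the center of the core; level k holds the
    centers of the 4**k sub-regions produced by k four-way splits.
    """
    regions = [(x0, y0, x1, y1)]
    out = []
    for _ in range(levels + 1):
        out.append([((a + c) / 2, (b + d) / 2) for a, b, c, d in regions])
        nxt = []
        for a, b, c, d in regions:
            mx, my = (a + c) / 2, (b + d) / 2
            # Split each region into four equal quadrants.
            nxt += [(a, b, mx, my), (mx, b, c, my),
                    (a, my, mx, d), (mx, my, c, d)]
        regions = nxt
    return out

# Made-up 8 x 8 core: three levels of splitting give 64 leaf quadrants.
tree = htree_levels(0, 0, 8, 8, levels=3)
leaf_drivers = tree[3]
```

In this idealized form every leaf is the same Manhattan distance from the root, which is what makes the hand-built upper levels easy to balance before automatic CTS takes over below the leaves.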
[Figure: same view as the previous picture, but showing all the hierarchical blocks]
[Figure: example of one of the 64 quadrants. Labels: hierarchical blocks; “flylines” showing target connections; clock driver cell for the blocks’ clock pins]
[Figure: example of one of the 64 quadrants. Labels: hierarchical blocks; “flylines” showing target connections; clock driver cell for FF cells]
Clock tree, 1st attempt (figure shows flylines to the final targets). Each quadrant was divided equally: the channels were “split” in the center, assigning half of the FFs to one quadrant and the other half to the other. The tool was unable to balance the skew between the quadrants.
Clock tree, 2nd attempt. The center “split” of the channel FF cells was removed and “whole” channels were assigned to a given quadrant. Skew improved, but the tool was still unable to balance the skew between the quadrants.
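
The difference between the first two attempts can be sketched as two assignment rules. This is a hypothetical Python illustration: the slides do not say how the owning quadrant of a channel was chosen, so the majority-vote rule (with ties going to the lowest-numbered quadrant), the function names, and the FF data are all assumptions.

```python
from collections import Counter, defaultdict

def quadrant_of(x, y, qsize):
    """Grid index of the square quadrant (side qsize) containing (x, y)."""
    return (int(x // qsize), int(y // qsize))

def assign_split(ffs, qsize):
    """1st-attempt style: each FF goes to the quadrant it physically sits in,
    so a channel crossing a quadrant boundary is split between quadrants."""
    return {name: quadrant_of(x, y, qsize) for name, _chan, x, y in ffs}

def assign_whole_channel(ffs, qsize):
    """2nd-attempt style: all FFs of a channel go to one quadrant.
    Assumed rule: the quadrant holding most of the channel's FFs wins."""
    votes = defaultdict(Counter)
    for _name, chan, x, y in ffs:
        votes[chan][quadrant_of(x, y, qsize)] += 1
    owner = {chan: min(c, key=lambda q: (-c[q], q)) for chan, c in votes.items()}
    return {name: owner[chan] for name, chan, _x, _y in ffs}

# Made-up vertical channel at x = 5 crossing the quadrant boundary at y = 10.
ffs = [("f0", "chA", 5, 2), ("f1", "chA", 5, 6),
       ("f2", "chA", 5, 12), ("f3", "chA", 5, 16), ("f4", "chA", 5, 18)]
split = assign_split(ffs, qsize=10)
whole = assign_whole_channel(ffs, qsize=10)
```

With the split rule the channel's FFs are clocked from two different quadrant drivers; with the whole-channel rule they share one driver, which removes the quadrant boundary from the middle of the channel.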
Clock tree, 3rd attempt. The custom H-tree was moved to minimize wire length to the final targets. The tool was then able to achieve a local skew of 120 ps and a global skew of 180 ps, against a target skew of 200 ps.
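
For readers less familiar with the two skew figures, the sketch below shows one common way to compute them from clock arrival times. It is a hypothetical Python illustration under the assumption that global skew is the spread over all clock sinks and local skew is the worst difference between sinks that share a timing path; the slides do not define the terms. The sink names and arrival times are made up and chosen only so that the results match the reported 120 ps and 180 ps.

```python
def global_skew(arrival_ps):
    """Spread between the latest and earliest clock arrival over all sinks."""
    return max(arrival_ps.values()) - min(arrival_ps.values())

def local_skew(arrival_ps, related_pairs):
    """Worst arrival difference between sinks that share a timing path."""
    return max(abs(arrival_ps[a] - arrival_ps[b]) for a, b in related_pairs)

# Made-up clock arrival times in picoseconds.
arrival_ps = {"ff_a": 1000, "ff_b": 1120, "ff_c": 1180}
# Made-up launch/capture pairs: ff_a talks to ff_b, ff_b talks to ff_c.
related_pairs = [("ff_a", "ff_b"), ("ff_b", "ff_c")]

TARGET_PS = 200
meets_target = (local_skew(arrival_ps, related_pairs) <= TARGET_PS
                and global_skew(arrival_ps) <= TARGET_PS)
```

Under this definition local skew can never exceed global skew, which is consistent with the reported numbers.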
The Magma tool set made it possible to complete this complex design in about 8 months, from initial netlist handoff to tapeout.
Magma's straightforward Tcl database access greatly simplified custom script implementation.
One can guide the tool with critical cell placement and routing to work around tool limitations.
Conclusion
Complex design structures such as a crossbar require user guidance and the ability to manipulate the layout database structure. Scripting allows the layout database to be updated quickly for each incremental revision of the netlist.