PowerCentric

Overview

PowerCentric™ is a clock tree synthesis (CTS) and gate-level clock gating tool. It builds skew-balanced clock trees that are up to 30% smaller and consume up to 20% less power than those built by competing CTS tools. PowerCentric clock trees also have up to 30% less clock insertion delay and up to 30% less clock skew. PowerCentric achieves these benefits through a core technology innovation called Directed Balancing.

Flow

PowerCentric functions as a plug-in point tool within a digital ASIC design flow, fully replacing the CTS step inside a Place and Route tool. PowerCentric integrates with any Place and Route tool, using industry-standard file formats for its inputs and outputs. PowerCentric and Azuro's other product, Rubix™, share exactly the same flow interface and scripting language: once one tool is integrated into a design flow, the other is integrated as well.

Technology

The traditional role of clock tree synthesis is to distribute a set of source clock signals to thousands of data registers on a chip such that these signals arrive at almost exactly the same time, within some tight skew margin. Along the way, clock signals may be muxed, gated, divided, forked, merged, or stopped and restarted in many other ways.

Clock tree synthesis tools typically operate under the hood in two phases, usually referred to as clustering and skew balancing. During the clustering phase, the CTS tool performs recursive grouping from the registers upwards to the clock source, recursive subdivision from the clock source downwards to the registers, or a combination of the two. Clustering creates a tree structure that distributes clock signals within the physical floorplan and clock slew constraints of a design, but it does not guarantee that clock signals arrive at all registers at the same time. After clustering, the skew balancing phase makes several additional top-down or bottom-up passes over the clustered clock tree to either delay or speed up paths to certain registers. The goal of this phase is to make sure that the clock signals arrive at approximately the same time at every register, to within the target skew margin.
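The two phases can be illustrated with a deliberately small Python sketch. It is not how PowerCentric or any production CTS tool works: the function names are hypothetical, clustering is reduced to recursive top-down subdivision of register coordinates, wire delay is assumed to be proportional to Manhattan distance from the clock source to each cluster centre, and skew balancing only ever adds delay to the faster paths.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cluster(registers: List[Point], max_fanout: int) -> List[List[Point]]:
    """Clustering phase (toy): recursively halve the register set along its
    wider axis until every group fits within max_fanout."""
    if max_fanout < 1:
        raise ValueError("max_fanout must be at least 1")
    if len(registers) <= max_fanout:
        return [registers]
    xs = [p[0] for p in registers]
    ys = [p[1] for p in registers]
    axis = 0 if max(xs) - min(xs) >= max(ys) - min(ys) else 1
    ordered = sorted(registers, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return cluster(ordered[:mid], max_fanout) + cluster(ordered[mid:], max_fanout)


def centroid(group: List[Point]) -> Point:
    """Centre of a non-empty cluster, used as its buffer location."""
    n = len(group)
    return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)


def wire_delay(source: Point, target: Point, delay_per_unit: float) -> float:
    """Assumed delay model: proportional to Manhattan distance."""
    return (abs(source[0] - target[0]) + abs(source[1] - target[1])) * delay_per_unit


def balance(source: Point, groups: List[List[Point]],
            delay_per_unit: float = 1.0) -> List[float]:
    """Skew balancing phase (toy): pad every cluster up to the slowest one."""
    delays = [wire_delay(source, centroid(g), delay_per_unit) for g in groups]
    slowest = max(delays)
    return [slowest - d for d in delays]


# Eight registers in two corners of a hypothetical floorplan.
registers = [(0, 0), (1, 0), (0, 1), (1, 1),
             (10, 10), (11, 10), (10, 11), (11, 11)]
groups = cluster(registers, max_fanout=4)
padding = balance((0.0, 0.0), groups)
```

In this sketch the padding for each cluster is chosen by looking only at that cluster's own path delay and the current worst case, with no view of the rest of the tree.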

Clustering and skew balancing both suffer from the same fundamental problem: they are local, not global. Making a series of local changes in the hope that the right global solution will emerge is like trying to find the fastest route between two towns simply by following a compass bearing rather than looking at a map. As the road layout gets more complex, with more one-way streets, highways, and restricted intersections, the map becomes invaluable if you want to get to your destination as fast as possible. Similarly, as chips get more complex, with more aggressive clock gating and more advanced process nodes, it becomes increasingly important for the CTS tool to have global knowledge of the problem it is trying to solve.

PowerCentric exploits a fundamentally new approach to clock tree synthesis called Directed Balancing. Directed Balancing represents the skew balancing problem as a massive system of mathematical equations. The variables in these equations are the delays between all the different functional points in the clock network, such as clock muxes, clock gates, clock dividers, and registers. Physical constraints such as the chip floorplan, available buffer and inverter cell sizes, and clock slew targets are encoded as one set of equations. For each chip operating mode, timing constraints are translated into another set of equations that require certain sets of variables be matched with other sets of variables. Trade-offs between clock power, area, and insertion delay can also be represented in the system by yet more equations.
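As a rough illustration of what such a system might look like, the description above can be written as a small optimization problem. This is only a sketch under stated assumptions, not Azuro's actual formulation: every symbol below is introduced here for illustration, and the constraints are simplified to inequalities on delays.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{e \in E} w_e \, d_e \\
\text{subject to}\quad & d_e \ge d_e^{\min}
                         && \forall e \in E \\
                       & \Bigl|\, \sum_{e \in P_m(i)} d_e \;-\; \sum_{e \in P_m(j)} d_e \Bigr| \le s_m
                         && \forall m \in M,\ \forall (i, j) \in B_m
\end{aligned}
```

Here $E$ is the assumed set of connections between functional points (clock muxes, clock gates, clock dividers, registers), $d_e$ is the delay variable on connection $e$, and $d_e^{\min}$ stands in for the smallest delay the floorplan, cell library, and slew targets allow. $M$ is the set of chip operating modes, $B_m$ is the set of register pairs that must be balanced in mode $m$, $P_m(i)$ is the clock path to register $i$ in that mode, and $s_m$ is the target skew margin. The weights $w_e$ are a placeholder for the trade-off between clock power, area, and insertion delay.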

Once the system of equations is constructed, it can be solved globally, and this global solution is then used as the "map" for traditional clustering and skew balancing methods. In the same way that a map helps you avoid a wrong turn on the highway, Directed Balancing gives PowerCentric global knowledge, so it can avoid the mistakes that a purely local solution would make. The clock trees PowerCentric builds are therefore smaller, lower power, and faster by a significant margin. The more complex a chip, the more advanced the process node, and the more aggressively clock-gated a design is, the bigger the benefits become.

Downloads

To learn more about Azuro's products and technology, please visit our Downloads page.