clockgate: centralize clock enables out of FFs #4583

Open
wants to merge 8 commits into
base: main
from

Conversation

widlarizer
Collaborator

@widlarizer widlarizer commented Sep 5, 2024

What are the reasons/motivation for this change?

According to several users, large power savings in ASICs can be achieved by centralizing clock gating per clock domain rather than routing CLK and EN into many FFs.

Explain how this is achieved.

The new clock_gate command lets the user specify a rising-edge clock gating cell ("ICG") with -pos and a falling-edge one with -neg. Within each selected module, flip-flops driven by the same clock and enable signals with the same polarities are grouped. If a group has more flops than the threshold set by -min_net_size N, its flops are converted via kernel/ff.h to versions without clock enable, and correspondingly, clock gating cells are emitted to create one GCLK net per flop group.

Since clock gates often have DFT ports, those are assumed to need tying low. They can be listed with repeated -tie_lo pass arguments.

It's also assumed that ICGs have active-high CE pins. This is because there doesn't seem to be a provision in the Liberty file format to describe that polarity either way. This pass may in the future be adapted to search for usable cells in a provided .lib file.

clock_gate -pos pdk_icg ce:clkin:clkout -tie_lo scanen
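
The grouping and threshold step described above can be sketched roughly as follows. This is a simplified, standalone illustration and not the code in this PR: the `Flop` struct, `GroupKey`, and `gateable_groups` are hypothetical names, flops are modelled as plain structs rather than Yosys cells handled through kernel/ff.h, and the strict "more than N" comparison is taken from the wording of the description.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for a flip-flop with clock enable.
// The real pass works on Yosys cells; none of these names come from the PR.
struct Flop {
    std::string name;
    std::string clk;   // clock net name
    std::string en;    // enable net name
    bool clk_rising;   // true: rising-edge clocked
    bool en_active_hi; // true: enable is active-high
};

// Flops may share a gated clock only if all four properties match.
using GroupKey = std::tuple<std::string, bool, std::string, bool>;

// Group flops by (clock, clock polarity, enable, enable polarity) and keep
// only the groups with more members than min_net_size.
std::map<GroupKey, std::vector<std::string>>
gateable_groups(const std::vector<Flop> &flops, std::size_t min_net_size)
{
    std::map<GroupKey, std::vector<std::string>> groups;
    for (const Flop &ff : flops)
        groups[GroupKey{ff.clk, ff.clk_rising, ff.en, ff.en_active_hi}].push_back(ff.name);

    for (auto it = groups.begin(); it != groups.end();) {
        if (it->second.size() > min_net_size)
            ++it;                  // large enough: would get one ICG and one GCLK net
        else
            it = groups.erase(it); // too small: flops keep their enable inputs
    }
    return groups;
}
```

Each surviving map entry corresponds to one clock gating cell to emit; every flop listed under it would be rewritten to an enable-less flop clocked from that cell's output.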

If applicable, please suggest to reviewers how they can test the change.

Run the included test, which doesn't test anything but shows off the design. Use -min_net_size to show that small groups can be excluded.

  • help string
  • proper test

@widlarizer widlarizer changed the title clock_gate: prototype with demo clock_gate: centralize clock enables out of FFs Sep 5, 2024
@mole99

mole99 commented Sep 6, 2024

Hi @widlarizer, this looks very interesting! Thanks for your work 🙌
Does this also work with FFs that don't have a CE input, but rather a MUX2 on the D input?
That would be similar in functionality to what Lighter currently does and could potentially replace it: https://github.com/AUCOHL/Lighter

@povik
Member

povik commented Sep 6, 2024

Does this also work with FFs that don't have a CE input, but rather a MUX2 on the D input?

Those should be converted to an FF with a clock enable input by the application of opt_dff earlier in the flow.
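
As a rough illustration of why the two forms are interchangeable, here is a small behavioural model. It is a toy sketch, not Yosys code: `MuxedFlop`, `EnableFlop`, and `models_agree` are hypothetical names, and the model assumes a 1-bit flop with an active-high enable.

```cpp
#include <cassert>

// Behavioural toy models, not Yosys code; all names here are hypothetical.

// Plain D flip-flop with a 2:1 mux in front of D that feeds Q back
// whenever `en` is low.
struct MuxedFlop {
    bool q = false;
    void clock_edge(bool d, bool en) {
        bool d_muxed = en ? d : q; // the MUX2 on the D input
        q = d_muxed;               // unconditional capture on every edge
    }
};

// Flip-flop with a dedicated clock-enable input.
struct EnableFlop {
    bool q = false;
    void clock_edge(bool d, bool en) {
        if (en)
            q = d; // capture only when enabled
    }
};

// Drive both models with the same stimulus; returns true if Q always matches.
bool models_agree(const bool *d, const bool *en, int n) {
    MuxedFlop a;
    EnableFlop b;
    for (int i = 0; i < n; ++i) {
        a.clock_edge(d[i], en[i]);
        b.clock_edge(d[i], en[i]);
        if (a.q != b.q)
            return false;
    }
    return true;
}
```

Because both models hold Q whenever the enable is low, recognizing the mux feedback as a clock enable does not change behaviour, which is what makes such flops candidates for this pass afterwards.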

@mole99

mole99 commented Sep 6, 2024

That's great, thanks!

@povik
Member

povik commented Sep 6, 2024

Sure thing!

@widlarizer widlarizer marked this pull request as ready for review September 9, 2024 13:03
@widlarizer widlarizer changed the title clock_gate: centralize clock enables out of FFs clockgate: centralize clock enables out of FFs Sep 9, 2024
@widlarizer
Collaborator Author

widlarizer commented Sep 12, 2024

Related work:

  • Lighter - Lighter is very simple. Using it for a different PDK requires rewriting the Verilog techmapping file. It indiscriminately extracts clock gating out of every clock-enabled flop. I assume it then deduplicates the gates with some opt command, but this can worsen area if you have many enable and clk pairs that are each used by only one FF cell with clock enable. This pass currently needs PDK-specific flags, which is annoying. It can be extended in the future to find suitable ICGs in a Liberty file.
  • IMMS clock gating with yosys - adds clock gating to ungated cells

Member

@povik povik left a comment

Looks good. There's a nit about setting the tie-low port, plus some rephrasing suggestions.

Comment on lines 47 to 50
log("Creates gated clock nets for sets of FFs with clock enable\n");
log("sharing a clock and replaces the FFs with versions without\n");
log("clock enable inputs. Intended to reduce power consumption\n");
log("in ASIC designs.\n");
Member

Suggested change
- log("Creates gated clock nets for sets of FFs with clock enable\n");
- log("sharing a clock and replaces the FFs with versions without\n");
- log("clock enable inputs. Intended to reduce power consumption\n");
- log("in ASIC designs.\n");
+ log("This pass transforms a set of FFs sharing the same clock and\n");
+ log("enable signal into a set of enable-less FFs and a clock gating\n");
+ log("cell. This is primarily a power-saving transformation on ASIC\n");
+ log("designs.\n");

Collaborator Author

It does many sets at once, typically. Hence the plural, even if it's awkward, is more correct. It's good that you found a way to add mentioning the ICG, I'll def adapt that

Member

In that case let me suggest "This pass transforms each set of FFs"

Collaborator Author

Adapted


3 participants