Intern Width #4242

Merged
1 commit merged
Jul 8, 2024

jackkoenig
Contributor

API modification to improve performance. This could break code, so I'm marking it for 7.0; fixing uses is easy and the gain is worth it.

Applying the same technique as used in Scala's BigInt (previously applied in SFC to great effect).
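
To illustrate the technique, here is a minimal sketch of interning small values behind a factory method, loosely modeled on how Scala's BigInt caches small values. It is not the code from this PR: the `KnownWidth` class below is a self-contained stand-in, and `MaxCached` and the lazy-fill strategy are assumptions chosen for illustration.

```scala
// Hypothetical stand-in for illustration; not the actual Chisel source.
// The private constructor forces every caller through KnownWidth.apply.
final class KnownWidth private (val value: Int) {
  override def equals(other: Any): Boolean = other match {
    case that: KnownWidth => that.value == value
    case _                => false
  }
  override def hashCode: Int = value
  override def toString: String = s"KnownWidth($value)"
}

object KnownWidth {
  // Assumed upper bound of the interned range (0 to 1024 inclusive).
  private val MaxCached = 1024

  // The array itself is allocated up front; entries are filled on first use.
  private val cache = new Array[KnownWidth](MaxCached + 1)

  def apply(value: Int): KnownWidth = {
    require(value >= 0, s"Width must be non-negative, got $value")
    if (value <= MaxCached) {
      var w = cache(value)
      if (w eq null) {
        // Benign race: two threads may both allocate, but either instance is valid.
        w = new KnownWidth(value)
        cache(value) = w
      }
      w
    } else {
      // Outside the interned range: allocate a fresh instance on every call.
      new KnownWidth(value)
    }
  }
}
```

With this shape, values inside the range come back as the same object (`KnownWidth(8) eq KnownWidth(8)` holds), while larger values are allocated on each call but still compare equal with `==`.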

Using same benchmarking approach as in,

import chisel3._
// _root_ disambiguates from package chisel3.util.circt if user imports chisel3.util._
import _root_.circt.stage.ChiselStage

class MyBundle extends Bundle {
  val a, b, c, d, e, f, g = UInt(8.W)
}

class Foo(n: Int) extends Module {
  val in = IO(Input(Vec(n, new MyBundle)))
  val out = IO(Output(Vec(n, new MyBundle)))

  out :#= in
}

object Main extends App {
  val n = args(0).toInt
  val phase = new chisel3.stage.phases.Elaborate
  val annos = Seq(
    chisel3.stage.ChiselGeneratorAnnotation(() => new Foo(n))
  )
  println(phase.transform(annos).size)
}

Running:

./firrtl/benchmark/scripts/find_heap_bound.py -vvv --start-size 4G --min-step 10M --context 5 -- -cp assembly.jar Main 200000

Master:

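| Xmx  | Max RSS (MiB) | Wall Clock (s) | User Time (s) |
| ---- | ------------- | -------------- | ------------- |
| 1G   | 1317          | 6.73           | 46.25         |
| 880M | 1155          | 7.26           | 78.42         |
| 870M | 1151          | 8.39           | 79.62         |
| 860M | 1177          | 11.45          | 129.8         |
| 850M | -             | -              | -             |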

This branch:

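| Xmx  | Max RSS (MiB) | Wall Clock (s) | User Time (s) |
| ---- | ------------- | -------------- | ------------- |
| 1G   | 1321          | 6.4            | 60.91         |
| 830M | 1126          | 6.87           | 59.81         |
| 820M | 1117          | 9.38           | 96.9          |
| 810M | 1154          | 20.24          | 308.01        |
| 800M | -             | -              | -             |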

This isn't necessarily the best benchmark because all widths are shared (I only use 8.W). However, the full width cache Array is still allocated, so we also pay something close to the "worst cost". The only way this change wouldn't be a benefit is if a very high percentage of widths fall outside the memoized range.
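
As a rough way to reason about that caveat, the sketch below computes what fraction of requested widths would fall inside a cache covering 0 to 1024. `CacheHitRate` and its `of` method are hypothetical helpers for illustration, not part of Chisel.

```scala
// Hypothetical helper for illustration; not part of Chisel.
object CacheHitRate {
  // Fraction of requested widths that fall inside the cached range [0, maxCached].
  // Returns 0.0 for an empty workload.
  def of(widths: Seq[Int], maxCached: Int = 1024): Double =
    if (widths.isEmpty) 0.0
    else widths.count(w => w >= 0 && w <= maxCached).toDouble / widths.size
}
```

For a workload where every width is 8, as in this benchmark, `CacheHitRate.of(Seq.fill(1000)(8))` is `1.0`. For `Seq(8, 16, 32, 4096)` it is `0.75`, since only the 4096-bit width falls outside the range.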

Measured a 3% memory reduction (on this benchmark) and probably a very slight speedup too.

Contributor Checklist

  • Did you add Scaladoc to every public function/method?
  • Did you add at least one test demonstrating the PR?
  • Did you delete any extraneous printlns/debugging code?
  • Did you specify the type of improvement?
  • Did you add appropriate documentation in docs/src?
  • Did you request a desired merge strategy?
  • Did you add text to be included in the Release Notes for this change?

Type of Improvement

  • API modification
  • Performance improvement

Desired Merge Strategy

  • Squash

Release Notes

  • UnknownWidth becomes a case object (drop the () when using it).
  • KnownWidths 0-1024 are interned.
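
For downstream code, the migration amounts to dropping the parentheses. The sketch below uses local stand-in types (not the real Chisel classes) purely to show the shape of the change in a pattern match; `WidthDemo` and `describe` are hypothetical names.

```scala
// Hypothetical stand-ins mirroring the shape described in the release notes.
sealed trait Width
case object UnknownWidth extends Width
final case class KnownWidth(value: Int) extends Width

object WidthDemo {
  // Before this change, code wrote `UnknownWidth()` and `case UnknownWidth() => ...`.
  // With a case object, refer to the singleton directly, without parentheses.
  def describe(w: Width): String = w match {
    case UnknownWidth  => "unknown"
    case KnownWidth(n) => s"$n bits"
  }
}
```

Here `WidthDemo.describe(UnknownWidth)` yields `"unknown"` and `WidthDemo.describe(KnownWidth(8))` yields `"8 bits"`.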

Reviewer Checklist (only modified by reviewer)

  • Did you add the appropriate labels? (Select the most appropriate one based on the "Type of Improvement")
  • Did you mark the proper milestone (Bug fix: 3.6.x, 5.x, or 6.x depending on impact, API modification or big change: 7.0)?
  • Did you review?
  • Did you check whether all relevant Contributor checkboxes have been checked?
  • Did you do one of the following when ready to merge:
    • Squash: You or the contributor Enable auto-merge (squash), clean up the commit message, and label with Please Merge.
    • Merge: Ensure that contributor has cleaned up their commit history, then merge with Create a merge commit.

jackkoenig added this to the 7.0 milestone Jul 2, 2024
* UnknownWidth becomes a case object
* KnownWidths 0-1024 are interned
jackkoenig merged commit d1e9cb9 into main Jul 8, 2024
15 checks passed
jackkoenig deleted the intern-width branch July 8, 2024 18:17