This repository has been archived by the owner on Aug 19, 2024. It is now read-only.

Confused by fork.withRegion(Monitor) #630

Open
mvanbeirendonck opened this issue Apr 25, 2023 · 1 comment
@mvanbeirendonck

Hi,

I've been experimenting with the fork/join capabilities to test pipelined modules. I am somewhat confused about the synchronization between threads.

To illustrate, I created a simple pipelined multiplier:

[image: diagram of the pipelined multiplier]

The pipeline balancing of input b (the delay - 1 input registers) can also be disabled, so that b is delayed externally by the testbench instead.

import chisel3._
import chisel3.util._

import chiseltest._
import org.scalatest.flatspec.AnyFlatSpec

class Mul(val delay: Int = 64, val shouldDelayInputB: Boolean = true)
    extends Module() {

  val io = IO(new Bundle {
    val in = Input(new Bundle {
      val a = UInt(32.W)
      val b = UInt(32.W)
    })
    val out = Output(UInt(32.W))
  })

  val a_d = ShiftRegister(io.in.a, delay - 1)
  val b_d =
    if (shouldDelayInputB) ShiftRegister(io.in.b, delay - 1) else io.in.b

  io.out := RegNext(a_d * b_d)
}
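
To make the intended latency concrete, here is a minimal plain-Scala sketch of the same pipeline. It is not Chisel and not part of the original post: `MulModel` and its `step` method are hypothetical names, registers are assumed to start at 0, and the product is masked to 32 bits to mirror the truncation onto the 32-bit output.

```scala
// Plain-Scala software model of the Mul pipeline (illustrative, not Chisel).
// Assumption: all registers start at 0.
final class MulModel(delay: Int, shouldDelayInputB: Boolean) {
  require(delay >= 1, "delay must be at least 1")

  // delay - 1 input registers per operand; index 0 is the newest entry.
  private var aRegs = Vector.fill(delay - 1)(0L)
  private var bRegs = Vector.fill(delay - 1)(0L)
  private var outReg = 0L

  def out: Long = outReg

  // One rising clock edge with a and b held on the input pins.
  def step(a: Long, b: Long): Unit = {
    // Values seen by the multiplier just before the edge.
    val aD = if (aRegs.isEmpty) a else aRegs.last
    val bD = if (shouldDelayInputB && bRegs.nonEmpty) bRegs.last else b
    outReg = (aD * bD) & 0xFFFFFFFFL // keep the low 32 bits, like io.out
    if (aRegs.nonEmpty) aRegs = a +: aRegs.init
    if (bRegs.nonEmpty) bRegs = b +: bRegs.init
  }
}
```

With `delay = 4` and internal balancing enabled, driving (3, 4) and then (7, 8) on consecutive edges should give 12 after the fourth edge and 56 after the fifth, which is the same relationship that test1 checks with `delay = 64`.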

I've then created three tests with fork. I've found that test2 and test3 fail. The forked pokes on input b are not showing up correctly.

class MulTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Mul")

  val test1 = it should "Mul with Input B delayed inside the module" in {
    test(new Mul(shouldDelayInputB = true))
      .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
        fork {
          dut.io.in.a.poke(3.U)
          dut.io.in.b.poke(4.U)
          dut.clock.step(1)
          dut.io.in.a.poke(7.U)
          dut.io.in.b.poke(8.U)
          dut.clock.step(1)
        }
        dut.clock.step(dut.delay)
        dut.io.out.expect(3 * 4)
        dut.clock.step(1)
        dut.io.out.expect(7 * 8)
      }
  }

  // This fails both expects
  val test2 = it should "Mul without Input B delayed inside the testbench" in {
    test(new Mul(shouldDelayInputB = false))
      .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
        val f = fork {
          dut.io.in.a.poke(3.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(4.U)
          }
          dut.clock.step(1)
          dut.io.in.a.poke(7.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(8.U)
          }
          dut.clock.step(1)
        }
        // f.fork.withRegion(Monitor) {
        dut.clock.step(dut.delay)
        dut.io.out.expect(3 * 4)
        dut.clock.step(1)
        dut.io.out.expect(7 * 8)
      // }
      }
  }

  // This fails the second expect
  val test3 =
    it should "Mul without Input B delayed inside the testbench and separate forks" in {
      test(new Mul(shouldDelayInputB = false))
        .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
          val f1 = fork {
            dut.io.in.a.poke(3.U)
            dut.clock.step(1)
            dut.io.in.a.poke(7.U)
            dut.clock.step(1)
          }
          val f2 = fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(4.U)
            dut.clock.step(1)
            dut.io.in.b.poke(8.U)
          }
          // f2.fork.withRegion(Monitor) {
          dut.clock.step(dut.delay)
          dut.io.out.expect(3 * 4)
          dut.clock.step(1)
          dut.io.out.expect(7 * 8)
        // }
        }
    }
}

In both cases, uncommenting the fork.withRegion(Monitor) calls makes the test pass. However, I don't understand why that is necessary here.

From RegionsTests.scala, I had the impression that fork.withRegion(Monitor) resolves combinational read-after-write dependencies between threads. Here, even with the pipeline balancing of b done in the testbench, the output register io.out := RegNext(a_d * b_d) should still rule out such dependencies. I must have misunderstood what fork.withRegion(Monitor) does precisely. Are there any pointers you could give here?

@mvanbeirendonck

Update: I forgot the joinAndStep(dut.clock). The fork.withRegion was not fixing anything; it just ensured the testbench finished without executing the expect calls.

I have also found the actual issue: I was missing the extra dut.clock.step(1) after the poke on input b inside each forked thread. It seems that forks have some default timescope behavior? Without stepping the clock before the thread ends, the poke is reverted?
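
The effect can be worked through on paper for test3. The sketch below is plain Scala, not chiseltest, and rests on assumptions: `PokeRevertSketch`, `trace`, and `out` are hypothetical names; a poke that is not followed by a clock step is assumed to be reverted to 0 before the next edge (the behavior suspected above); and with shouldDelayInputB = false the output in cycle n is taken to be a(n - delay) * b(n - 1).

```scala
// Cycle-level arithmetic for Mul(shouldDelayInputB = false); illustrative only.
object PokeRevertSketch {
  // Value present on an input pin at each clock edge; unlisted cycles read 0.
  type Trace = Map[Int, Long]

  def trace(entries: (Int, Long)*): Trace =
    entries.toMap.withDefaultValue(0L)

  // Assumed relation: a passes delay - 1 input registers plus the output
  // register, while b only passes the output register.
  def out(a: Trace, b: Trace, delay: Int, n: Int): Long =
    (a(n - delay) * b(n - 1)) & 0xFFFFFFFFL

  val delay = 64

  // a = 3 in cycle 0 and a = 7 in cycle 1, as poked by thread f1.
  val a: Trace = trace(0 -> 3L, 1 -> 7L)

  // Original f2: b = 4 is held across the edge by step(1), but b = 8 is
  // poked as the last statement, so it is assumed to be reverted to 0.
  val bOriginal: Trace = trace((delay - 1) -> 4L)

  // f2 with a trailing step(1): b = 8 is still present at the next edge.
  val bFixed: Trace = trace((delay - 1) -> 4L, delay -> 8L)
}
```

Under these assumptions `out(a, bOriginal, delay, delay)` is 12 while `out(a, bOriginal, delay, delay + 1)` is 0, which matches test3 failing only its second expect. With `bFixed` the second value becomes 56.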

The fix for test3 is here. Trying something similar for test2:

  val test2 = it should "Mul without Input B delayed inside the testbench" in {
    test(new Mul(shouldDelayInputB = false))
      .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
        val f = fork {
          dut.io.in.a.poke(3.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(4.U)
            dut.clock.step(1)
          }
          dut.clock.step(1)
          dut.io.in.a.poke(7.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(8.U)
            dut.clock.step(1)
          }
          dut.clock.step(1)
        }

        dut.clock.step(dut.delay)
        dut.io.out.expect(3 * 4)
        dut.clock.step(1)
        dut.io.out.expect(7 * 8)
      }
  }

results in chiseltest.ThreadOrderDependentException: Mix of ending timescopes and old timescopes. Is there any solution for that?
