Confused by fork.withRegion(Monitor) #630

mvanbeirendonck · 2023-04-25T10:24:35Z

Hi,

I've been experimenting with the fork/join capabilities to test pipelined modules. I am somewhat confused about the synchronization between threads.

To illustrate, I created a simple pipelined multiplier. The pipeline balancing of input b (the delay - 1 input registers) can optionally be disabled, so that it is done externally in the testbench instead:

import chisel3._
import chisel3.util._

import chiseltest._
import org.scalatest.flatspec.AnyFlatSpec

class Mul(val delay: Int = 64, val shouldDelayInputB: Boolean = true)
    extends Module() {

  val io = IO(new Bundle {
    val in = Input(new Bundle {
      val a = UInt(32.W)
      val b = UInt(32.W)
    })
    val out = Output(UInt(32.W))
  })

  val a_d = ShiftRegister(io.in.a, delay - 1)
  val b_d =
    if (shouldDelayInputB) ShiftRegister(io.in.b, delay - 1) else io.in.b

  io.out := RegNext(a_d * b_d)
}
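
As a reading aid, the latency this module implies can be written out. This is derived only from the code above, writing $d$ for `delay` and $t$ for the cycle index, and assuming the product is truncated to the 32-bit output:

$$
\mathrm{out}[t] =
\begin{cases}
a[t-d] \cdot b[t-d] \bmod 2^{32} & \text{if shouldDelayInputB is true} \\
a[t-d] \cdot b[t-1] \bmod 2^{32} & \text{if shouldDelayInputB is false}
\end{cases}
$$

So with the internal delay disabled, b has to be valid on the clock edge that comes $d-1$ cycles after the edge that sampled the matching a, which is why the testbench steps `dut.delay - 1` before poking b.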

I then wrote three tests using fork. test2 and test3 fail: the forked pokes on input b do not show up correctly.

class MulTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Mul")

  val test1 = it should "Mul without Input B delayed inside the module" in {
    test(new Mul(shouldDelayInputB = true))
      .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
        fork {
          dut.io.in.a.poke(3.U)
          dut.io.in.b.poke(4.U)
          dut.clock.step(1)
          dut.io.in.a.poke(7.U)
          dut.io.in.b.poke(8.U)
          dut.clock.step(1)
        }
        dut.clock.step(dut.delay)
        dut.io.out.expect(3 * 4)
        dut.clock.step(1)
        dut.io.out.expect(7 * 8)
      }
  }

  // This fails both expects
  val test2 = it should "Mul without Input B delayed inside the testbench" in {
    test(new Mul(shouldDelayInputB = false))
      .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
        val f = fork {
          dut.io.in.a.poke(3.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(4.U)
          }
          dut.clock.step(1)
          dut.io.in.a.poke(7.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(8.U)
          }
          dut.clock.step(1)
        }
        // f.fork.withRegion(Monitor) {
        dut.clock.step(dut.delay)
        dut.io.out.expect(3 * 4)
        dut.clock.step(1)
        dut.io.out.expect(7 * 8)
      // }
      }
  }

  // This fails the second expect
  val test3 =
    it should "Mul without Input B delayed inside the testbench and separate forks" in {
      test(new Mul(shouldDelayInputB = false))
        .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
          val f1 = fork {
            dut.io.in.a.poke(3.U)
            dut.clock.step(1)
            dut.io.in.a.poke(7.U)
            dut.clock.step(1)
          }
          val f2 = fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(4.U)
            dut.clock.step(1)
            dut.io.in.b.poke(8.U)
          }
          // f2.fork.withRegion(Monitor) {
          dut.clock.step(dut.delay)
          dut.io.out.expect(3 * 4)
          dut.clock.step(1)
          dut.io.out.expect(7 * 8)
        // }
        }
    }
}

In both cases, the solution appears to be to enable the commented-out fork.withRegion(Monitor) calls. However, I don't understand why that is necessary here.

From RegionsTests.scala, I had the impression that fork.withRegion(Monitor) resolved combinatorial read-after-write dependencies for the threads. Here, even with the pipeline balancing of b done within the testbench, the register io.out := RegNext(a_d * b_d) still prevents such dependencies. I must have misunderstood what fork.withRegion(Monitor) is doing precisely. Are there any pointers you could give here?


mvanbeirendonck · 2023-04-25T12:18:56Z

Update: I forgot the joinAndStep(dut.clock). The fork.withRegion was not fixing anything; it just ensured the testbench finished without executing the expect calls.

I have also found the actual issue. I was simply not adding the extra dut.clock.step(1) within the forked pokes for input b. It seems that forks have some default timescope behavior? Without stepping the clock, the poke is reverted?
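
To make the timing concrete, here is a minimal plain-Scala sketch of the pipeline with shouldDelayInputB = false. It does not use chisel3 or chiseltest. The names MulTimingSketch, outputs, aAt, bHeld and bReverted are made up for illustration, and two things are assumptions rather than confirmed behavior: all registers start at 0, and a poke issued as the last statement of a forked thread is reverted to 0 before the next clock edge samples it.

```scala
object MulTimingSketch {
  // Cycle-level model of Mul with shouldDelayInputB = false.
  // aAt(t) / bAt(t) are the input values sampled by the clock edge ending cycle t.
  // Element t of the result is the value of io.out after t clock edges.
  def outputs(delay: Int, aAt: Int => BigInt, bAt: Int => BigInt, cycles: Int): Vector[BigInt] = {
    val mask = (BigInt(1) << 32) - 1
    var aPipe = Vector.fill(delay - 1)(BigInt(0)) // ShiftRegister(io.in.a, delay - 1)
    var outReg = BigInt(0)                        // RegNext(a_d * b_d)
    val seen = Vector.newBuilder[BigInt]
    for (t <- 0 until cycles) {
      seen += outReg
      val aD = aPipe.lastOption.getOrElse(aAt(t)) // a_d as seen by this edge
      outReg = (aD * bAt(t)) & mask               // b is sampled undelayed
      aPipe = (aAt(t) +: aPipe).take(delay - 1)
    }
    seen += outReg
    seen.result()
  }

  // Input a as driven in the tests: 3 in cycle 0, 7 in cycle 1, then back to 0.
  val aAt: Int => BigInt = {
    case 0 => BigInt(3)
    case 1 => BigInt(7)
    case _ => BigInt(0)
  }

  // b held across the edges of cycles delay - 1 and delay (poke followed by step(1)).
  def bHeld(delay: Int): Int => BigInt = t =>
    if (t == delay - 1) BigInt(4) else if (t == delay) BigInt(8) else BigInt(0)

  // Assumed model of a poke that is reverted before any edge samples it.
  val bReverted: Int => BigInt = _ => BigInt(0)

  def main(args: Array[String]): Unit = {
    val delay = 64
    val held = outputs(delay, aAt, bHeld(delay), delay + 1)
    val reverted = outputs(delay, aAt, bReverted, delay + 1)
    println(s"held:     ${held(delay)}, ${held(delay + 1)}")         // held:     12, 56
    println(s"reverted: ${reverted(delay)}, ${reverted(delay + 1)}") // reverted: 0, 0
  }
}
```

In this model the expected products only appear when b is still driven at the edge that samples it, which matches the observation that the forked pokes need a trailing dut.clock.step(1).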

The fix for test3 is here. Trying something similar for test2:

  val test2 = it should "Mul without Input B delayed inside the testbench" in {
    test(new Mul(shouldDelayInputB = false))
      .withAnnotations(Seq(WriteVcdAnnotation)) { dut =>
        val f = fork {
          dut.io.in.a.poke(3.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(4.U)
            dut.clock.step(1)
          }
          dut.clock.step(1)
          dut.io.in.a.poke(7.U)
          fork {
            dut.clock.step(dut.delay - 1)
            dut.io.in.b.poke(8.U)
            dut.clock.step(1)
          }
          dut.clock.step(1)
        }

        dut.clock.step(dut.delay)
        dut.io.out.expect(3 * 4)
        dut.clock.step(1)
        dut.io.out.expect(7 * 8)
      }
  }

results in chiseltest.ThreadOrderDependentException: Mix of ending timescopes and old timescopes. Is there any solution for that?

ekiwi assigned ducky64 Apr 25, 2023
