Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to reset the DM on the first connection on both 0.13 and 1.0 spec versions #1021

Closed
en-sc opened this issue Apr 26, 2024 · 7 comments
Closed

Comments

@en-sc
Copy link
Contributor

en-sc commented Apr 26, 2024

  • Assume an external debugger (ED) is connecting to a hart for the first time.
  • ED wants to reset the DM in order to get it into a predictable state.
  • To reset the DM the ED needs to set dmcontrol.dmactive = 0.
  • However, the spec states [3.7. Abstract Commands]:

While an abstract command is executing ({abstractcs-busy} in {dm-abstractcs} is high), a debugger must not change {hartsel}...

The ED does not know whether an abstract command is executing, so it needs to read dmcontrol to know the value of dmcontrol.hartsel before writing dmcontrol = dmactive=0 | <the old hartsel value> (requesting DM reset without changing hartsel).

  • Assume the read of dmcontrol results in an error.
  • Since the ED does not know anything about the state of the DM yet, there are two possibilities:
    1. dmcontrol.dmactive is low and [3.14.2. Debug Module Control ]

    Any accesses to the module may fail.

    1. dmcontrol.dmactive is high and it's a legitimate DMI error (e.g. some issue with the connection).
  • There are two possible routes:
    1. ED assumes dmcontrol.dmactive is low and writes dmcontrol = <dmcontrol.dmactive=1> | <hartsel=0> -- may result in the change of hartel, so ED can't do that.
    2. ED reads dmcontrol again.

Now let's take into account RISC-V Debug Spec v0.13, since the ED does not yet know which version it is (dtmcs only distinguishes between 0.11 and (0.13 or 1.0)).
In versions less then 0.13.3, according to [1.2.1.2. Incompatible Changes from 0.13 to 1.0]:

  1. Bump version to 3. Adding version encoding for 0.14 spec. #512 , Require debugger to poll dmactive after lowering it. require polling of dmactive low as well as dmactive high transitions #566

The HW is not required to eventually return something meaningful when dmcontroll.dmactive was set to low. So the ED may wait forever in the loop, reading dmcontrol and getting an error on RISC-V Debug Spec v0.13 - compliant HW.

Can this somehow be fixed?

@JanMatCodasip
Copy link
Contributor

JanMatCodasip commented May 3, 2024

My proposal would be to specify the external debugger is permitted to perform dmactive <-- 0 at any time, without the need to retain any other dmcontrol bits, and regardless of the DM state -- so that the external debugger can bring the DM safely and reliably to a known state (but what happens to in-flight operations like abstract commands would be undefined).

I am inclined to believe that this was the intention of the debug spec, however strict interpretation of the spec text While an abstract command is executing (...), a debugger must not change {hartsel} (as @en-sc pointed out above) seem to contradict that.

@en-sc
Copy link
Contributor Author

en-sc commented May 3, 2024

My proposal would be to specify the external debugger is permitted to perform dmactive <-- 0 at any time, without the need to retain any other dmcontrol bits, and regardless of the DM state...

The thing is, there may be a RISC-V Debug Spec v0.13 implemetation that complies to this strict interpretation of While an abstract command is executing (...), a debugger must not change {hartsel}. Since the ED does not know whether the target is 0.13 or 1.0 compliant at this poin, it should go with the strictest spec of the two and still read dmcontrol.hartsel in order not to change it while writing dmactive to zero.

I would suggest specifying that while dmactive is low, read of dmcontrol should not fail (may result in a busy response). However, It maybe to strict of a requirement for the HW.

@JanMatCodasip
Copy link
Contributor

JanMatCodasip commented May 3, 2024

I maintain the opinion that

  • if a DM that is reset (via dmcontrol.dmactive <-- 0), and
  • at the same time change of hartsel bits by the external debugger disrupts the operation of the DM,

then the reset implementation in the DM can be considered defective, because in that case dmactive=0 does not clear the state of the debug module, as required by any version of the debug spec.

Since the ED does not know whether the target is 0.13 or 1.0 compliant at this poin, it should go with the strictest spec of the two (...)

My recommendation for the EDs would be to go with the most robust approach in this case - that is to perform the DM reset always. If the ED detects the case when there would be concern about the hartsel bits, the ED can warn the user and possibly provide a configuration option to the user to alter the behavior.

@en-sc
Copy link
Contributor Author

en-sc commented May 14, 2024

I'we created a flowchart that would hopefully help to better describe the issue.
dm-reset drawio

@rtwfroody
Copy link
Collaborator

  1. Asserting dmactive should always be allowed, and reset hartsel. dmactive is a reset signal.
  2. Even in 0.13, dmactive is listed as R/W, so writing 0 should result in the bit becoming 0 eventually. I don't see an issue with polling it and the bit never becoming 0. But maybe I misunderstood your point.
  3. If there is no DM at the bus address, then the value will always read 0 and the debugger will perceive it as dmactive never goes high. That's not ideal, but I think it's good enough.

en-sc added a commit to en-sc/riscv-debug-spec that referenced this issue May 17, 2024
… is high

With the clarification, it's obvious that the DM can be reset without
knowing the `hartsel`.

Fixes riscv#1021
@en-sc
Copy link
Contributor Author

en-sc commented May 17, 2024

  1. Asserting dmactive should always be allowed, and reset hartsel. dmactive is a reset signal.

I would to clarify this. Please, take a look at #1040.

  1. Even in 0.13, dmactive is listed as R/W, so writing 0 should result in the bit becoming 0 eventually. I don't see an issue with polling it and the bit never becoming 0. But maybe I misunderstood your point.

Oh, I see. I didn't consider this. Thanks!

  1. If there is no DM at the bus address, then the value will always read 0 and the debugger will perceive it as dmactive never goes high. That's not ideal, but I think it's good enough.

Initially, this issue was the least concerning to me. I think it can be mitigated by a descriptive error message on the ED side.

en-sc added a commit to en-sc/riscv-debug-spec that referenced this issue May 23, 2024
… is high

With the clarification, it's obvious that the DM can be reset without
knowing the `hartsel`.

Fixes riscv#1021
rtwfroody added a commit that referenced this issue Jul 11, 2024
This is a clean way to let a debugger reset a DM without having to know
what hartsel is first. It can simply write 0 then 1 to dmactive.

These new rules resolve the issue raised in #1021, that if the DM is
executing an abstract command when the debugger wants to reset it, the
debugger should not be expected to preserve hartsel.
@rtwfroody
Copy link
Collaborator

Asserting dmactive should always be allowed, and reset hartsel. dmactive is a reset signal.

That would have been a better way to write the spec, but not what the spec says. Fixing this is not backwards compatible, and it's not worth it to make a backwards incompatible change to address the extreme corner case where the DM is executing an abstract command and also reading dmcontrol results in an error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants