USB Device getting stuck in .waitDma loop after rapid transfers #482

Open
LukeBorowy opened this issue Mar 22, 2024 · 3 comments

Comments

@LukeBorowy

When sending and receiving a lot of transfers sequentially, usb_HandleEvents can freeze and never return. This can be demonstrated with the attached project, a slightly modified version of the link_library example that sends a large number of transfers. To test it you will need 2 calculators and the appropriate cable. The freeze only ever happens on the "device" calculator, never on the host.

link_library.zip

I added some debugging logs to usbdrvce and recompiled the toolchain to figure out where it was freezing. It freezes in the _ExecuteDma function, specifically in the .waitDma loop. The loop does exit with an error if you exit the program on the host calculator, which seems odd to me since the device is the one that is frozen.

I discovered this bug while adding multiplayer support to my game. Normally I am not sending nearly this much data, yet the freeze still happens after a few minutes (anywhere from 1 to 20). I believe it has something to do with the exact timing of send and receive transfers finishing, and it just takes a while to get unlucky. It happens very quickly when I spam transfers as in this example, since that makes it much more likely to hit the bad moment. Occasionally I get bad/corrupted data on a read instead of a freeze, but that is harder to reproduce.

Video of the issue (note the moment the device stops blinking). The host seems to think that the transfer of "H" completed and that it is now sending "Q". However, the device froze before it even returned from reading "H".

IMG_1315.MOV

Hopefully this is just a coding error on my part, but as of now it seems to be in the library.

@acagliano

Do you know what status code you get? In my issue, which I thought I had fixed but apparently hadn't, after a lot of sequential transfers something suddenly happens (on either the device or the host, I'm not sure which), and any subsequent transfers queued on the endpoint that is handling the heavy traffic start failing immediately with error code 80 (10100000 binary). For the record, that error is USB_TRANSFER_CANCELED | USB_TRANSFER_BUS_ERROR.

@LukeBorowy
Author

LukeBorowy commented Jul 7, 2024

I don’t have access to calculators to test right now, but I’m pretty sure neither the host nor the device got errors when queuing a transfer. To the host it just looked like the transfer was still in progress, with no error at all. The only error (I think) occurred when the cable was unplugged, at which point the device, which was trying to read, (understandably) got 003 = USB_TRANSFER_STALLED | USB_TRANSFER_NO_DEVICE.
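Since both status values quoted in this thread are bitmasks, a small decoder makes them easier to compare. The following is only a minimal sketch: the `XFER_*` names and their bit positions are assumptions inferred from the two values quoted above (003 and 10100000 binary), not copied from the usbdrvce header, and `describe_status` is a hypothetical helper, not a library function. The real flag values should be checked against usbdrvce.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* ASSUMED bit layout, inferred from the values quoted in this thread.
 * These are illustrative stand-ins, not the usbdrvce definitions. */
enum {
    XFER_STALLED   = 1 << 0,
    XFER_NO_DEVICE = 1 << 1,
    XFER_BUS_ERROR = 1 << 5,
    XFER_CANCELED  = 1 << 7
};

static const struct {
    unsigned mask;
    const char *name;
} flag_names[] = {
    { XFER_STALLED,   "STALLED"   },
    { XFER_NO_DEVICE, "NO_DEVICE" },
    { XFER_BUS_ERROR, "BUS_ERROR" },
    { XFER_CANCELED,  "CANCELED"  },
};

/* Hypothetical helper: writes the names of all recognized flags in
 * `status` to `out`, separated by " | ", and returns the bits that
 * did not match any entry in the table. */
unsigned describe_status(unsigned status, char *out, size_t out_size)
{
    unsigned unknown = status;
    int count = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++) {
        if (status & flag_names[i].mask) {
            size_t used = strlen(out);
            snprintf(out + used, out_size - used, "%s%s",
                     count ? " | " : "", flag_names[i].name);
            unknown &= ~flag_names[i].mask;
            count++;
        }
    }
    return unknown;
}
```

Calling `describe_status(0x03, buf, sizeof buf)` fills `buf` with the two flags I saw on unplug, and a nonzero return value means the status contained bits the table does not know about, which is worth logging on its own.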

That’s what makes this so annoying. If the code got any sort of indication of an error when the issue actually happened, I could try to reset the connection to make it respond again. However, the host thinks everything is fine and I can’t do anything on the device since it is frozen, so there’s no way to recover without physically disconnecting them. (I’m pretty sure I checked for all the statuses on the host, but I can’t confirm that.)
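To illustrate the kind of error indication I mean, here is a generic sketch of a bounded wait: a poll loop that gives up after a fixed number of attempts and reports a timeout instead of spinning forever. This is not the usbdrvce code and not a proposed patch; `wait_bounded`, `poll_fn`, `wait_result`, and `ready_after_countdown` are all hypothetical names, and the poll callback stands in for whatever condition a real wait loop checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns true once the awaited
 * condition (e.g. "transfer finished") holds. */
typedef bool (*poll_fn)(void *ctx);

typedef enum { WAIT_OK, WAIT_TIMED_OUT } wait_result;

/* Polls `done` at most `max_polls` times. Returns WAIT_OK as soon as
 * the condition holds, or WAIT_TIMED_OUT if it never does, so the
 * caller gets control back and can attempt recovery. */
wait_result wait_bounded(poll_fn done, void *ctx, uint32_t max_polls)
{
    assert(done != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (done(ctx)) {
            return WAIT_OK;
        }
    }
    return WAIT_TIMED_OUT;
}

/* Stand-in condition for demonstration: becomes true after the
 * counter pointed to by `ctx` has been decremented to zero. */
bool ready_after_countdown(void *ctx)
{
    uint32_t *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

With a result like `WAIT_TIMED_OUT` surfaced to the program, the device side could at least try to reset the connection rather than hang until the cable is pulled.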

I noticed that in your linked issue it only happens with high traffic. In my case, it eventually freezes even with low traffic, which is why I believe it comes down to precise timing. High traffic just makes it more likely to hit the “bad” moment.

@acagliano

Thanks for the response. I tracked down the source of my issue, and the two are in fact not related; yours is actually in usbdrvce. Mine was caused by my driver code not doing one step properly, though for a while it presented as the same issue.
