Reproducible Medley hang #1888

Open
hjellinek opened this issue Nov 26, 2024 · 10 comments
Labels
bug Something isn't working (as per documentation)

Comments

@hjellinek
Contributor

Describe the bug
When I run a simple Lisp function that writes output to the Exec, Medley hangs with a spinning pizza cursor.

To Reproduce
Steps to reproduce the behavior:

  1. Files loaded: some of my code under development.
  2. Form to eval: (write-bitmap bm #'debug-output-fn #'pixel-to-monochrome-rgba)
  3. What happened? A few lines of expected output appear in the Exec, then output stops and Medley hangs. ^D does nothing, and I have to force-quit the ldesdl process.

Expected behavior
I expect to see several more lines of output in the Exec. I expect Medley not to hang.

Screenshots
I attached one screenshot showing the output write-bitmap produced leading up to the hang, though it's unlikely it will show you much. More interestingly, I got macOS to produce a problem report, also attached.

Context (please complete the following information):

  • OS: macOS Ventura
  • OS Version: (13.7.1 (22H221))
  • Host arch: x86_64
  • Maiko version: cea720feb218e25f9927b50e3a123581746a0e52 running SDL
  • IL:MAKESYSDATE: 6-Nov-2024 10:52:14 (my own build)

Additional context
I can provide the lisp.virtualmem file.

crash.log.zip
screenshot_695

@hjellinek hjellinek added the bug Something isn't working (as per documentation) label Nov 26, 2024
@hjellinek hjellinek added this to maiko Nov 26, 2024
@hjellinek
Contributor Author

More: I marked a couple of macros as "changed," then tried the test case again. (I didn't actually modify any code.) This time the result was slightly different: a break window opened, then the system hung.
screenshot_696

@hjellinek
Contributor Author

After fixing that error in my program it runs successfully to completion. The only unusual aspect of my code is that it calls multiple-value-bind and values in a loop, once each for every pixel in a 32*16 bitmap. Each values invocation returns 4 fixnums.

@rmkaplan
Contributor

rmkaplan commented Nov 26, 2024 via email

@hjellinek
Contributor Author

That's too bad. This code would be much less readable, and maybe marginally less efficient, if I need to cons up an object and return it.
Looking at it another way, this is a pretty simple test case. The key seems to be that an error occurred within the loop, so a test case might be:

  • loops no more than 512 times
  • calls something that invokes (values 0 0 0 255), with a caller that binds 4 values
  • triggers an error at some point within the loop

I just created a test function that does that and makes it easy to reproduce the problem.
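
For anyone following along, here is a minimal sketch of a function with that shape. It is not the actual test function mentioned above: the names pixel-to-rgba-sketch and values-loop-repro and the :iterations and :fail-at parameters are hypothetical, and whether this exact form triggers the hang is an untested assumption.

```lisp
;; Hypothetical names throughout; this is an illustration of the recipe,
;; not the code from the report.

(defun pixel-to-rgba-sketch (pixel)
  "Return four fixnums, matching the (values 0 0 0 255) shape described above."
  (declare (ignore pixel))
  (values 0 0 0 255))

(defun values-loop-repro (&key (iterations 512) (fail-at 100))
  "Bind four values on every iteration and signal an error at iteration FAIL-AT.
Pass :FAIL-AT NIL to run the loop to completion without an error."
  (dotimes (i iterations)
    (multiple-value-bind (r g b a) (pixel-to-rgba-sketch i)
      ;; Deliberate error inside the loop, the condition suspected to matter.
      (when (eql i fail-at)
        (error "Deliberate error at iteration ~D" i))
      ;; Output to the current Exec, as in the original report.
      (format t "~D: ~D ~D ~D ~D~%" i r g b a)))
  iterations)
```

Evaluating (values-loop-repro) in an Exec should print 100 lines and then signal the error; (values-loop-repro :fail-at nil) runs all 512 iterations and serves as the no-error baseline.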

I'll close this issue and continue the discussion under issue #19.

@hjellinek hjellinek closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 26, 2024
@github-project-automation github-project-automation bot moved this to Done in maiko Nov 26, 2024
@hjellinek
Contributor Author

GitHub should have closed it as "duplicate."

@hjellinek
Contributor Author

I'm reopening this because it seems not to be related to what's reported in issue #19.

From @nbriggs's comment on that issue:

Investigated the macOS crash report - it's a GC problem, possibly only indirectly related to the stack issue.
It was trying to DELREF a cons page (at the end of N_OP_Cons), and took a trap in the reference count handling code -- which is a large macro so doesn't show up nicely in the backtrace, just the source line number of where the macro is invoked, not the line within the macro.
It might help to rewrite the macro as a function (which we could inline when not debugging) - but to catch this, I think we're going to need to have it either fail when we've got a debugger attached, or if it has failed into uraid with a fault can attach a debugger (lldb on macOS) after the fact.

@hjellinek hjellinek reopened this Nov 27, 2024
@nbriggs
Contributor

nbriggs commented Nov 27, 2024

The cases where I've seen a crash like this before, though as far as I know not emanating from GC called from N_OP_Cons, have been because other code caused a smash. These were when @rmkaplan was working on the internal data structures of TEdit which involved various pointer bashing exercises. Is there any chance that something that you were working on might have done damage that then sent GC off the deep end?

@hjellinek
Contributor Author

The code I've been working on doesn't do anything with, for example, raw pointers, nor does it try to outsmart the GC. It's tempting to say it only uses the documented Medley APIs, but the imagestream APIs I'm using aren't as documented as they might be. But my code is using only type-checked operations. And I wasn't using @rmkaplan's experimental TEdit branch.

@nbriggs
Contributor

nbriggs commented Nov 27, 2024

@hjellinek do you have the ldesdl executable from the crash? If so, could you zip it up and attach it here? Thanks.

@hjellinek
Contributor Author

I just rebuilt it, but Time Machine says this is the previous (Nov 6) version.

ldesdl.zip

Projects
Status: Done
3 participants