
Release vere-v2.10 to live #474

Merged
merged 75 commits into master on Jun 27, 2023
Conversation

pkova
Collaborator

@pkova pkova commented Jun 27, 2023

No description provided.

joemfb and others added 30 commits February 13, 2023 08:38
See vere#405.

This implements a "roll-your-own swapfile", where we designate a file
(.urb/ephemeral.bin) to file-back any parts of the loom which are not
backed by the snapshot.  These are pages in the contiguous free space,
and pages that have been dirtied since the last snapshot.

With these changes, you should be able to safely reduce the memory
allotted to a VM running a ship, without relying on separate swap space.

The main disadvantages to this approach are:
- `ephemeral.bin` is large (the size of the loom). We could delete it on
  graceful shutdown, or we could make it a sparse file.
- `_ce_flaw_protect` has to copy the contents of dirtied pages twice. I'm
  skeptical that this introduces significant slowness, but it could.

It may be reasonable to put this behavior behind a flag simply because
of the increased disk requirements.
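
For readers unfamiliar with the mechanism, the sketch below shows the general idea of file-backing a region of the loom with `ephemeral.bin` using plain POSIX `mmap`. The helper name, sizes, and error handling are illustrative assumptions, not the actual vere code:

```c
/* Minimal sketch (not vere's actual implementation): back a region of
** the loom with .urb/ephemeral.bin instead of anonymous (swap-backed)
** memory.  All names and sizes here are illustrative assumptions.
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* map [len_i] bytes of the loom at [adr_v] onto [fid_i] at [off_i],
** so the OS can page those bytes out to the file rather than to swap.
*/
static void*
map_ephemeral_region(void* adr_v, size_t len_i, int fid_i, off_t off_i)
{
  /* make sure the backing file is long enough; extending it with
  ** ftruncate leaves the unwritten ranges sparse on most filesystems.
  */
  if ( ftruncate(fid_i, off_i + len_i) < 0 ) {
    perror("ephemeral.bin: ftruncate");
    exit(1);
  }

  /* MAP_SHARED|MAP_FIXED replaces the anonymous mapping at adr_v with
  ** a file-backed one, so dirty pages get flushed to ephemeral.bin
  ** under memory pressure instead of requiring swap space.
  */
  void* map_v = mmap(adr_v, len_i, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, fid_i, off_i);
  if ( MAP_FAILED == map_v ) {
    perror("ephemeral.bin: mmap");
    exit(1);
  }
  return map_v;
}
```

Note that remapping with MAP_FIXED discards whatever was in the existing mapping at that address, so pages that are already dirty would have to be copied out and restored around the remap, which may be related to the double copy in `_ce_flaw_protect` mentioned above.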
per review comments
belisarius222 and others added 28 commits June 15, 2023 11:33
This PR is the simplest available fix for #451.
Resolves #456.

Written live for a tutorial with @belisarius222
This PR ports urbit/urbit#6159, fixing a performance problem that
plagued previous porting attempts. Fixes #157, supersedes #210 and #413.

The poor performance observed in #210 and elsewhere was not due to any
issue matching or dispatching jets. It coincided with the switch from
hoon %140 to %139, but only incidentally. It was caused by a change to
the `+solid` pill generator, which inadvertently broke the structural
sharing in the lifecycle sequence (see
https://github.com/urbit/urbit/pull/5989/files#diff-2f8df9d079ccb58c0a9a9c46f2f7dbd943dabaa21ba658c839de757bbac999f1L108-L116).
The problem went unnoticed because, in normal (i.e., king/serf) boot and
replay, events are sent over IPC in batches, which had the side effect
of recovering the necessary structural sharing. This new replay
implementation does not involve IPC, but instead reads and computes
events synchronously, in a single process.

The issue did not arise until ships booted from pills created with the
updated generator were replayed using this new implementation, and that
happened to coincide with the release of hoon %139. The absence of
structural sharing led to jets being registered with one copy of the
kernel, but dispatched from a separate copy, resulting in absurdly
expensive equality comparisons. Since both copies were already allocated
on the home-road, unification could not be performed. And since the
problem manifested during the initial phase (lifecycle sequence) of the
boot process, `|meld` could not be used.

This PR includes a trivial hack to work around such event logs: the
lifecycle sequence is read in an inner road, jammed, and then cue'd,
thus recovering structural sharing before any nock computation, jet
registration, &c. The `+solid` pill generator should also be fixed, but
workarounds will still be needed to account for existing piers.
Longer-term, home-road unification should clearly be explored to avoid
such fragility.
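
As a rough sketch of that workaround, the round-trip looks something like the following, using the u3 serialization helpers; treat the exact calls, header name, and reference-counting details as assumptions rather than the actual patch:

```c
/* Rough sketch of the jam/cue round-trip used to recover structural
** sharing (function names per the u3 noun library; details are
** assumptions, not the exact patch).
*/
#include "noun.h"   /* assumed header exposing u3ke_jam / u3ke_cue */

static u3_noun
_recover_sharing(u3_noun lif)
{
  /* serialize the lifecycle sequence to a single atom ... */
  u3_atom jam = u3ke_jam(lif);   /* transfer semantics assumed */

  /* ... and deserialize it again: cue interns repeated subtrees, so
  ** the kernel noun referenced during jet registration and the one
  ** used for dispatch become the same noun, making equality checks
  ** cheap instead of absurdly expensive deep comparisons.
  */
  return u3ke_cue(jam);          /* transfer semantics assumed */
}
```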
These are the vere changes that accompany urbit/urbit#6669, see that for
a description.
Highly relevant to #410 (I think this behavior should be always-on in
the presence of "swap"), but also likely useful as a standalone option
for low-memory deployments. This is a draft PR as the behavior is
hardcoded, not controlled by command-line arguments.
See #405.

This implements a "roll-your-own swapfile", where we designate a file
(.urb/ephemeral.bin) to file-back any parts of the loom which are not
backed by the snapshot. These are pages in the contiguous free space,
and pages that have been dirtied since the last snapshot.

With these changes, you should be able to safely reduce the memory
allotted to a VM running a ship, without relying on separate swap space.

The main disadvantages to this approach are:
- `ephemeral.bin` is large (the size of the loom). We could delete it on
  graceful shutdown, or we could make it a sparse file (see the sketch
  below).
- `_ce_flaw_protect` has to copy the contents of dirtied pages twice. I'm
  skeptical that this introduces significant slowness, but it could.

It may be reasonable to put this behavior behind a flag simply because
of the increased disk requirements.
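
As a side note on the disk-usage concern, the sparse-file option mentioned in the list above is cheap on most filesystems. Here is a minimal, hypothetical sketch (path, names, and error handling are illustrative, not the PR's code):

```c
/* Minimal sketch of creating .urb/ephemeral.bin as a sparse file:
** ftruncate() reserves the logical length of the loom without
** allocating disk blocks until pages are actually written back.
** Path and size are illustrative assumptions.
*/
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int
open_ephemeral(const char* pax_c, off_t loom_i)
{
  int fid_i = open(pax_c, O_RDWR | O_CREAT, 0600);
  if ( fid_i < 0 ) {
    perror(pax_c);
    return -1;
  }

  /* extend to the full loom size; unwritten ranges occupy no blocks */
  if ( ftruncate(fid_i, loom_i) < 0 ) {
    perror(pax_c);
    close(fid_i);
    return -1;
  }
  return fid_i;
}
```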

I've successfully tested various scenarios in a limited memory
environment, but this is not well enough tested to be ready for merging.
Main things I've tested (all on a machine with 1GB RAM and no swap):
- Boot
- Allocate eight 128MB atoms in the dojo in a row, verifying that |mass
  reports 1.2GB afterward
- Free those atoms, run |pack, recreate them, free them again, and |pack
  again
- Restart the ship, recreate those atoms, free them, and |pack

All of this worked fine on that machine, even though the ephemeral
memory usage must have been greater than the RAM on the machine, and
there were no swap files. During most of this, the htop-reported
"resident memory" for the serf process was around 400-600MB, while the
amount of memory "used" overall per `free -h` was in the 180-250MB range every
time I checked. This is consistent with the OS deciding how much of the
backing files to keep resident at any given time, while the amount of
memory strictly required stays fairly small.

Testing on a machine with plenty of memory, I didn't notice any
slowdown. On the 1GB machine, it was somewhat slow to allocate the large
atoms, but it felt about right considering they must have been written
to the backing file. Graceful shutdown was somewhat slow sometimes,
presumably when it needed to copy from the ephemeral file to the
snapshot.
In testing, I commonly use `=a (bex (bex 29))` as a way to use 128MB of
memory. It's been annoying that you couldn't do 256MB or more this way,
because the `+bex` jet didn't support it. This PR raises that limit to
strictly less than 2GB. This also reduces the ephemeral memory usage of
the `+bex` jet on large numbers by half.

This was limited to strictly less than 256MB because it used a gmp
function to compute the power of two. I briefly looked at whether gmp
would handle double-length words for the exponent, but decided it was
simpler and more efficient anyway to write this jet directly, i.e.
allocate `a+1` bits and set the `a`th bit. This saves one copy of the
entire result, and allows it to function up to 2GB minus one.
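
As a rough illustration of that "allocate `a+1` bits and set the `a`th bit" approach, here is a plain-C sketch standing in for the actual jet and its u3 allocator; the function name and word layout are hypothetical:

```c
/* Sketch of computing 2^a directly: allocate a+1 zeroed bits and set
** bit a.  A malloc'd array of 32-bit words stands in for the actual
** u3 atom allocator; the real jet also bounds a so that the result
** stays strictly under 2GB.
*/
#include <stdint.h>
#include <stdlib.h>

/* returns a little-endian array of 32-bit words holding 2^a, and
** writes the word count to *len_w.
*/
static uint32_t*
bex_direct(uint64_t a, uint64_t* len_w)
{
  *len_w = (a / 32) + 1;                     /* words for a+1 bits */
  uint32_t* buf_w = calloc(*len_w, sizeof(uint32_t));
  if ( NULL == buf_w ) return NULL;

  buf_w[a / 32] = (uint32_t)1 << (a % 32);   /* set the a-th bit */
  return buf_w;
}
```

Because the buffer is written exactly once, there is no intermediate gmp result to copy, which is where the halved ephemeral memory usage mentioned above comes from.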

This function would work correctly above this limit, but there are other
parts of the system that seem to implicitly assume that atoms cannot be
2GB or larger. The first one I ran into was `mug`, but there may be
others; after all, for a long time it was safe to assume nothing could
be larger than 2GB because that was the loom size.
These checks were introduced in v2.7, aborting the process if the
snapshot metadata indicated that truncation had occurred. But the check
as written was unnecessarily strict: it also aborted the process if the
snapshot was merely larger than necessary. This PR prints a warning in
that case and otherwise continues.
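
A hypothetical sketch of the relaxed check described above (names, messages, and the metadata layout are assumptions, not the actual patch):

```c
/* Sketch of the relaxed snapshot-length check: abort only if the file
** is shorter than the metadata claims (truncation), and merely warn if
** it is longer than necessary.  Names and layout are assumptions.
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

static void
check_snapshot_length(const char* pax_c, off_t need_i)
{
  struct stat buf_u;
  if ( stat(pax_c, &buf_u) < 0 ) {
    perror(pax_c);
    exit(1);
  }

  if ( buf_u.st_size < need_i ) {
    /* genuinely dangerous: pages recorded in the metadata are missing */
    fprintf(stderr, "loom: %s truncated (%lld < %lld)\r\n",
            pax_c, (long long)buf_u.st_size, (long long)need_i);
    exit(1);
  }
  else if ( buf_u.st_size > need_i ) {
    /* harmless: the file is merely larger than necessary */
    fprintf(stderr, "loom: %s larger than expected; continuing\r\n",
            pax_c);
  }
}
```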
@pkova pkova requested a review from a team as a code owner June 27, 2023 15:15
@pkova pkova merged commit 463dec7 into master Jun 27, 2023
4 checks passed