Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deterministic RNG #55

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Deterministic RNG #55

wants to merge 15 commits into from

Conversation

LegNeato
Copy link

@LegNeato LegNeato commented May 17, 2022

RENDERED

Feedback is welcome from users at all levels! Do you think this design easy-to-understand, useful and correct? How could the proposed API be improved?

TL; DR

  • A new default plugin, bevy_entropy.
  • The plugin adds a new Entropy resource to the world.
  • The plugin can be completely disabled if no source of entropy is required, the default entropy from the OS can be used if randomness is needed but deterministic execution is not, or a world seed can be specified for deterministic random number generation.
  • Currently uses rand in implementation but not in public API.
  • Currently uses rand_core and rand_chacha in implementation but not in public API.

Prototype

@alice-i-cecile
Copy link
Member

Awesome, I'm looking forward to naming this. A couple of organizational things to make things easier for reviewers:

  1. Could your rename this PR to Deterministic RNG?
  2. Could you add a PR description following the example set in e.g. Stageless: a complete scheduling overhaul #45?

@LegNeato LegNeato changed the title RFC for Entropy Entropy resource May 17, 2022
rfcs/55-entropy.md Outdated Show resolved Hide resolved
@LegNeato LegNeato changed the title Entropy resource Deterministic RNG May 17, 2022
Co-authored-by: Alice Cecile <[email protected]>
@LegNeato
Copy link
Author

Sure, I thought we wanted to use the RFC doc as the source of truth but I can pull some stuff up here! Thanks for the quick response 🚀

@alice-i-cecile
Copy link
Member

Yep, the doc should be the source of truth, but the PR description makes it more reviewer friendly :) The real solution would be to have a dedicated interface for this, but eh, Github is very convenient.

@alice-i-cecile
Copy link
Member

First pass. Substantive thoughts:

  1. I really like the initial implementation model of "just provide a source of entropy". I think this is an important and relatively uncontroversial first step.
  2. Bevy definitely needs a first-class solution for RNG ala rand in the long-term: it will be immediately useful in examples, will likely be eventually useful internally, and like you said, it's very helpful to avoid an explosion of RNG dependencies in real projects.
  3. rand is too heavy, in terms of compile times / memory. If we include it, many users will simply disable bevy_entropy. If we can strip down to getrandom instead I'd be very happy with this proposal.
  4. I want to see some an example of a game that uses entropy across systems, with each one acquiring a ResMut<Entropy> in turn to ensure deterministic evaluation.
  5. I want to see an example demonstrating how you can use the seed with several different RNG crates.
  6. Per-system entropy is trivial and can be quickly addressed in this RFC: just store Local<Entropy>. This should be explicitly discouraged in ecosystem plugins that have gameplay effects though due to the downstream effects on determinism. Seems great for particle effects though!

Stylistic comments:

  1. Ooh, those footnotes and text-hiding sections are cool! Nice work.
  2. Try to split your lines in markdown at the sentence boundaries; it really does make the commits easier to read.

Co-authored-by: Alice Cecile <[email protected]>
@Vrixyz
Copy link
Member

Vrixyz commented May 17, 2022

Per-system entropy is trivial and can be quickly addressed in this RFC: just store Local.

Can we consider using a Component for Entropy rather than a Resource ?

I've seen several kinds of entropy mentioned within the RFC, but I want to point out several kinds of "entropy"/randomness seed can exist in a same application:

  • non-deterministic seed for playing animations for player inputs
  • deterministic seed from user input to load a game map
  • security protocols for networking might also want their own Entropy system

Using Local<Entropy> sounds like a rather impractical way to circumvent that use case ?

Or maybe consider the same approach as Timer to be able to embed it in another Resource or a Component depending on user needs ?

@LegNeato
Copy link
Author

LegNeato commented May 20, 2022

The thinking here is other entropy pools would be seeded from this main pool, giving a single place to control all downstream (and plugins, I guess that is upstream?) determinism.

I'm not opposed to different entropy pools being built in for specific purposes, but I'm not sure of the benefit of having X top-level nodes/pools vs 1 pool that has X subpools created from it. I'm also not an experienced game developer though--I'm coming at this from the simulation side.

@LegNeato
Copy link
Author

I got busy, hope to address feedback later this week.

@infmagic2047
Copy link

infmagic2047 commented May 26, 2022

I think eventually we want to be able to seed from arbitrary-length &[u8], which is useful in games that use a user-provided string as the seed (very common among games with seeded world generation). This can be pretty complex, so probably not in the initial version though.

@infmagic2047
Copy link

3. rand is too heavy, in terms of compile times / memory. If we include it, many users will simply disable bevy_entropy. If we can strip down to getrandom instead I'd be very happy with this proposal.

Going getrandom only would be difficult. Here we have the problem of producing an arbitrary-length random value (to seed the downstream RNGs) from a user-provided seed, which needs some form of RNG or hashing I think, and is not something we should implement from stratch.

A bad seed for the downstream RNG can have real consequences: some RNGs have bad behavior when using non-random seeds. We should try to protect users against these problems.

IMO the best choice here is to find a small crate to get the work done.

@bjorn3
Copy link

bjorn3 commented May 26, 2022

What about only using those rand subcrates we need? Rand consists of rand_core which defines some traits and error types, rand_chacha and rand_pcg which contain PRNG's, rand_dist for generating non-uniform distributions and then the rand crate for glueing everything together and defining a couple more things. I expect rand_core + rand_pcg or rand_chacha to be much lighter than the whole combination. I don't think we need the rest anyway.

@LegNeato
Copy link
Author

That makes sense to me.

@ValorZard
Copy link

I generally like this, but I find the name “entropy” for this crate to be kinda confusing. I think just calling it bevy random would be better

@james7132
Copy link
Member

james7132 commented Jun 5, 2022

I'm generally behind the goals here, but this only covers how we're going to seed new PRNGs, not how said PRNGs are stored, managed, and advanced. Things like query iteration order, thread assignment for systems, multithreaded access, command execution order, platform-specific ordering for types like HashMap, etc. all affect the way game simulation drives the advancement of these PRNGs. Even if we have a deterministic source, a strategy for making the use of those PRNGs is ultimately going to make or break determinism. Particularly if the PRNG or it's results are coming from a first party official crate.

- Do we want this?
- What about `rand`?
- Is `Entropy` naming too obscure?
- How do we deal with security vs speed tradeoff here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question here is really important, and given how "Too Much Security" can have an impact on performance/throughput, we probably need to ask which is the most common case in needing a source of randomness. There will be cases needing more cryptographically secure sources. But for randomising damage rolls, etc, there won't really be a need for something super secure, just random enough and fast. If more use-cases favour needing fast sources than secure sources, the default should be biased towards fast, and secure sources could be an optional feature to be enabled when needed. The random/entropy crate should be well documented and put this on the fore-front about such differences/needs.

Copy link
Author

@LegNeato LegNeato Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is merely the initial entropy pool. So for apps that don't need secure randomness, there is one initial slower call and that can be used to seed a faster rng (likely at startup). Then again, it could be an enormous perf footgun if someone does the obvious thing and just uses one pool. I'm not sure if I want a more common perf footgun or a less common security footgun 🤔.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically created bevy_turborand to explore this, by exposing various Rng structs while making them threadsafe. There's a GlobalRng as a resource to be used for the explicit purpose of seeding RngComponents. As long as GlobalRng is given a specific seed, then everything becomes deterministic from the source.

For non-deterministic Rng, I use instead thread_local Rngs initialised with a generate entropy function, and seed new Rng instances from that. Basically same mechanism as with fastrand does.

The end result of this should be Rngs that are seeded very quickly and with deterministically derived random states. At the moment, it's just WyRand so nothing that is cryptographically secure. But I hope to add ChaCha8 eventually and maybe play around with buffering so to increase throughput as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! That is similar to my vision of where this should go in future rfcs. The fact that most engines in the industry provide rng's means bevy should too, and this building block is a lower level layer that folks can drop down to if they need to customize anything. But it keeps the ecosystem potentially deterministic if both the baked in rngs and any plugin rngs decend from the same seed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I am not sure a thread local RNG is the right grouping fwiw...I like the idea of tying to systems more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rand and fastrand both have mechanisms to create an rng instance from a thread-local seed. This mechanism I would describe as non-deterministic because the intent is to provide as random state (and since these rngs are not threadsafe, they are provided in a thread local state). Which is fine for a lot of cases.

Now, one might want that globally, if given no seed, you derive a random seed (which can be generated in the above manner), and this would be how the GlobalRng gets initialised. From there, every other RngComponent gets its own seeded state deterministically from GlobalRng. So only one point needed to insert randomness while everything else is then deterministic.

@superdump
Copy link

I'm generally behind the goals here, but this only covers how we're going to seed new PRNGs, not how said PRNGs are stored, managed, and advanced. Things like query iteration order, thread assignment for systems, multithreaded access, command execution order, platform-specific ordering for types like HashMap, etc. all affect the way game simulation drives the advancement of these PRNGs. Even if we have a deterministic source, a strategy for making the use of those PRNGs is ultimately going to make or break determinism. Particularly if the PRNG or it's results are coming from a first party official crate.

^ this! You need to have deterministic order of requests for random numbers from the seeded random number generator. The random bits from the generator would be in the same order, but if you request generation of random scale factor and a random index, versus a random index then a random scale factor then the bits are being used for different purposes in a different order and your determinism of the random values just broke.

I'm not initially sure it makes sense to try to build determinism into the engine itself, if it is even possible. It seems more like something one has to consider in the design of one's application.

From hacking on some procedural generation stuff, I would also say that rand will likely not be fast enough for some purposes where a large number of random values are needed. Then other crates will be needed. I suppose it does make sense that the engine could offer a random number generation API, but it is also very easy to just add whatever crate you want and use that. The downside of not offering an API is that every plugin that needs random numbers could/would make their own choice of what crate to use, which introduces bloat.

@alice-i-cecile
Copy link
Member

Worse than bloat, the lack of a coherent standard makes ecosystem level cooperation on determinism much more challenging. If you have five plugins that use RNG, each with their own dependency, can the app developer get to determinism or do they need to vendor all of them?

If rand isn't fast enough, I'm very alright with choosing a different base. Speed is the goal here, not security.

@james7132
Copy link
Member

I'm not initially sure it makes sense to try to build determinism into the engine itself, if it is even possible. It seems more like something one has to consider in the design of one's application.

It depends. We should certainly not impede a developer's efforts to do so. Too many other engines make this an impossibility. It is indeed scoped strictly to anything that affects game simulation. Systems like rendering, audio, etc. where the focus is on presentation clearly do not need to be concerned with it. However, anything that could possibly affect the core game state that drives that presentation (i.e. Transforms) should be deterministic and we should make an effort to keep it that way. With this said, a very conservative line in the sand doesn't just include core systems like transforms, but also animation, AI, etc. as they can impact the game state in a significant way. Hell, particle systems might be in the simulation hotpath depending on how the data is used.

When it comes to PRNGs, the design of Bevy really do not allow for global PRNGs to be used deterministically. The sequence of generated values forces a global PRNG to be used sequentially, which prohibits any form of parallelization, which is unacceptable from a performance perspective. This means we need scoped per-Entity or per-Component PRNGs that need to be consistently seeded, regardless of the rest of the ECS state. It's not particularly possible to force this via Rust's type system, but at least in first party crates, we can enforce certain usage patterns of these PRNGs.

james7132 and others added 4 commits July 3, 2022 18:57
* remove redundant paragraph
* Fix bare urls
* Break on sentence boundaries.
* Update with `rand` changes
* Format
@@ -0,0 +1,453 @@
# Feature Name: `consistent-hierarchy`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file shouldn't be here :) Ditto global-task-pools.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All I did was rebase on main.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's weird :(

@LegNeato
Copy link
Author

LegNeato commented Jul 5, 2022

Some non-scientific numbers, on my machine the current implementation PR with rand_core and rand_chacha adds .5-1 second to debug builds vs main (dfe969005264fff54060f9fb148639f80f9cfb29):

MAIN
Finished dev [unoptimized + debuginfo] target(s) in 27.72s
Finished dev [unoptimized + debuginfo] target(s) in 28.65s
Finished dev [unoptimized + debuginfo] target(s) in 28.23s
Finished dev [unoptimized + debuginfo] target(s) in 27.80s
Finished dev [unoptimized + debuginfo] target(s) in 27.71s
Finished dev [unoptimized + debuginfo] target(s) in 27.23s
Finished dev [unoptimized + debuginfo] target(s) in 28.19s
Finished dev [unoptimized + debuginfo] target(s) in 27.61s
ENTROPY
Finished dev [unoptimized + debuginfo] target(s) in 30.18s
Finished dev [unoptimized + debuginfo] target(s) in 30.16s
Finished dev [unoptimized + debuginfo] target(s) in 29.84s
Finished dev [unoptimized + debuginfo] target(s) in 29.30s
Finished dev [unoptimized + debuginfo] target(s) in 29.07s
Finished dev [unoptimized + debuginfo] target(s) in 29.31s

The `bevy_entropy` plugin ships with Bevy and is enabled by default.
If you do not need randomness, you may [disable the plugin](https://docs.rs/bevy/latest/bevy/app/struct.PluginGroupBuilder.html#method.disable) or use Bevy's [minimal set of plugins](https://docs.rs/bevy/latest/bevy/struct.MinimalPlugins.html).

When enabled, `bevy_entropy` provides one world resource: `Entropy`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree Bevy needs seed-able randomness built-in. :)

But I don't think providing a premade resource like this is good idea.
Most standard sized games have more than one RNG.
Take Minecraft for example, you have world randomness, you have mob randomness, etc.
Therefore providing an out of the box singleton could lead to bad practices.

Here's how I'd suggest going about it:

  • The default bevy_entropy should provide traits for RNG types (wrappers or re-exports for randomness crates),
  • The plugin should not be included in the DefaultPlugins and only added explicitly by the user,
  • The plugin would add the Entropy as you call it,
  • Both, the plugin and Entropy would be renamed to something that implies it's not the only source of randomness, something like DebugEntropy is good, because it implies you may want to remove it in a "full game" for something you have more control over.

Then you end up with:

  • Randomness provided through Bevy,
  • Easily installable default seeded RNG for prototyping,
  • Incentive to crate your own sources of RNG when finalizing a feature.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With bevy_rand, the plugin is entirely based on instantiating the resource and component types, registering them for reflection purposes as well. It also is generic as it makes no assumption on the PRNG algorithm the user wants to make use of (given there are different PRNG kinds with different properties and trade-offs). The PRNG types are in bevy_prng in order to account for the reflection traits, etc.

What we get is a Global source of entropy (seedable as well), and component types that can be attached to entities to serve as multiple sources. We get forking between Resource -> Component and also Component -> Component, so with some consideration, we can seed many sources from a single seed in a deterministic but secure manner (no duplicate seeds/states). Sources can be seeded from a thread local, OS/hardware seed source for non-determinism, or from a specified seed.

Needs for a source can vary from the very simple (super simple games in Bevy can be done entirely with Resources anyway), to complex (such as needing strict deterministic seeding and generation across many sources). I would suggest that how people want to use RNG in games is not something we should be biasing one way or another, because how much people want to go down is entirely dependent on their needs. Should we care if they opt for a single global source if that's all they need? Maybe RNG is not used often, mostly during instantiation and it being non-deterministic is fine?

Ultimately, I'm not convinced about forcing "best practices" for games. For crates/plugins, I'm more convinced that there should be particular patterns for RNG sources, so that one plugin does not interfere with another. For games, using global sources is fine, because it will just go into a single binary. Crates however need to have their own sources and also expose the means to set the seeds for the sources they use.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about plugins, definitely more direct than the "good practices" argument, although I still think it's important.

Totally agree with the game dev's decision to use a single RNG, I just don't think it should be implied by the code.
When working with different libraries I often based my thought on the API provided, if the API suggested a solution I'd usually keep using it until it was absolutely necessary to find out better ways. That's why signaling something like this in the name itself could be helpful. (second thought, there is the description)
A counterargument can be made that only advanced devs will need to use multiple RNGs and they're knowledgeable enough to use them. But advanced devs got there somehow, and leaving tips that cost nothing around the code seems like the least we can do.

About the Resource -> Component, Component -> Component. I think what I'm missing is Resource -> Resource or just Seed -> Resource. I think that's probably covered too, but let me know if that's incorrect.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resource -> Resource forking is... I think not viable, or a very rare case, as Resources should be treated as a global source, and global sources should not depend on each other. I'd also encourage fewer global sources than lots of global sources, so then the need for Resource -> Resource forking is lessened. Better for Seed -> Resource, as that can be easily done. Global sources should be more generic and broad, not "granular" in ways that might suggest it has a purpose like "mob RNG" or "world RNG". If you want more granular/scoped RNG sources, that should be as a Component/Entity.

Most of this is patterns though, and the best way to really get devs to use these useful patterns is to provide documentation/examples. It's hard to enforce patterns via code/APIs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, Seed -> Resource should be enough, since the seed can come from another RNG, another resource, or some configuration/save file.

Good documentation and examples should be enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants