Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize startup time #97

Open
wants to merge 2 commits into
base: 1.21.1
Choose a base branch
from
Open

Conversation

mezz
Copy link

@mezz mezz commented Oct 9, 2024

Hello!

Proposed Changes

This PR optimizes some very hot areas when starting AlmostUnified in a large pack.
I tested this against ATM10 and was able to speed up this mod's loading by about 40%, from 9.5 seconds to 5.7 seconds on my machine. This is a really imprecise measurement but I am confident that it will be at least a little faster for everyone.

Additional Context

Future Work

Almost all the remaining time is spent in com.google.gson.JsonObject#equals, but trying to avoid that would require a big structural change so I stopped here. I think using java maps of values instead of JsonObject for RecipeLink#getActual could give a massive speedup, but there are probably things here I haven't considered so I did not try it.

Performance Comparison (flame graphs)

Before:

(9.5 seconds)
Screenshot 2024-10-09 at 9 03 55 AM

After:

(5.7 seconds)
Screenshot 2024-10-09 at 9 03 40 AM

@@ -60,20 +60,35 @@ public boolean shouldIgnoreRecipe(RecipeLink recipe) {
* @return True if the recipe type is ignored, false otherwise
*/
private boolean isRecipeTypeIgnored(RecipeLink recipe) {
return ignoredRecipeTypesCache.computeIfAbsent(recipe.getType(), type -> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this gets called so much that the capturing lambda actually caused a minor performance hit, so I turned this computeIfAbsent into a regular get/put sequence

Set<String> compareFields = curRecipe.getActual().keySet();
if (!settings.ignoredFields.isEmpty()) {
compareFields = new HashSet<>(compareFields);
compareFields.removeAll(settings.ignoredFields);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the ignored fields happens so frequently that it slows down loading.
It is much faster to do it once here and keeping a list of the fields we need to compare.

@rlnt
Copy link
Member

rlnt commented Oct 16, 2024

Thanks a lot for this pull request and the explanation. Since I don't have a lot of time right now, it might take me a bit to look at this more closely. A lot of this is new to me. I would have assumed the JIT optimizes things like the lambda. And I also need to learn how to read flame graphs. 😄

About the equality checks of JsonObjects. I would need some more context on your suggestion of using Java maps. As far as I understand it, JsonObjects are also just map-like objects and this shouldn't make a huge difference.
We were also not happy with the approach of comparing the JSON structures just to tell if something had changed. But when working on the 1.0 update for Minecraft 1.21 and above, we implemented a new approach with something we called a JsonCursor. That approach used partial comparisons only and we even tried another approach of methods telling us whether something in the JSON changed to avoid comparison at all. However, this made no noticeable performance difference and we reverted to comparison.

The comparison also has the huge advantage of simplicity. With a proper API introduced in 1.0, mod authors can register their own custom unifiers that tell Almost Unified how to treat their recipe JSONs. Since we compare the JSON afterward, we can tell whether something changed without it being a burden for mod authors to track whether their unifier changed something.

@mezz
Copy link
Author

mezz commented Oct 18, 2024

No rush!

A flame graph is like a stack trace. The bottom is the first parent method, and then on top of it are the children methods called from it.
The children methods are sorted left to right so that the ones that take more time are on the left.
The pictures I shared aren't going to give you a perfect sense of what changed, but you can see roughly where time was spent before vs after.

The capturing lambda is pretty fast in Java but it has to create something like an object in order to store (capture) the current state outside of itself when it is created, in order to access the variables it wants to use when it is called. Normally you can ignore the performance hit, it really doesn't matter, but in this specific case it's being called an incredible number of times so it actually has some measurable impact.

JsonObject is great here because it does exactly what you want to do and does it simply, I think you had the right idea to use it. Replacing it with anything better may be more work.
Unfortunately Gson does something a bit lazy with the values by design, they're stored in a way that's really generic so equals and hashcode comparisons have a lot of extra overhead. Each value is put into a JsonPrimitive which can hold a String, Integer, Long, Double, etc.
Take a look at the equals method and you'll see why it can be a bit slow when called a bajillion times in a row:
https://github.com/google/gson/blob/2ce6a6106c22bf0cce2e39f68b0da2f117d6e66b/gson/src/main/java/com/google/gson/JsonPrimitive.java#L267-L295
It's sort of parsing the values every time you want to do something with the JsonPrimitive.

We can do better than the Gson implementation because we already know the actual type of the values after parsing the Json, so we can create and do comparisons on a simple Map<String, Object> (where the Object is parsed value like an Integer or String), which is more efficient than doing comparisons on JsonObject's internal Map<String, JsonElement>.
I think it should be possible to create that map before doing the comparisons, so the API and everything else stay the same. I just didn't want to get into it in the same PR because it's a lot more complicated to try it out and justify heh.

@mezz
Copy link
Author

mezz commented Nov 4, 2024

updated to resolve merge conflicts

@@ -270,6 +268,17 @@ public String getName() {
}
}

public record CompareContext(CompareSettings settings, List<String> compareFields) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason you are using a list instead of a set here? Could even be a collection or iterable only. I am just asking because inside the create method, you retrieve the recipe key set and when the ignored fields are not empty, you copy the collection twice. Once inside the new HashSet and then with List.copyOf inside the CTor call. Is is because you want to make it immutable again?

Copy link
Author

@mezz mezz Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is focused on efficiency of the hot path, which in this case (according to profiling) is the path that iterates over the compareFields. We can sloppily make lots of copies elsewhere and it'll take less than 1ms in total, but if we iterate over a HashSet in a hot loop it will take longer than a List for example, so the main requirement is to enforce compareFields is a type that is fast to iterate over.

The Set is used earlier because it's a quick way to remove any duplicates and remove the ignored fields. You can do the same thing in many ways but there are tradeoffs between copying, contains performance, and clarity. Since the code there isn't called as much, I think I just tried to make it easy to read.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants