Optimize startup time #97

mezz · 2024-10-09T00:07:10Z

Hello!

Proposed Changes

This PR optimizes some very hot areas when starting AlmostUnified in a large pack.
I tested this against ATM10 and was able to speed up this mod's loading by about 40%, from 9.5 seconds to 5.7 seconds on my machine. This is a really imprecise measurement but I am confident that it will be at least a little faster for everyone.

Additional Context

Future Work

Almost all the remaining time is spent in com.google.gson.JsonObject#equals, but trying to avoid that would require a big structural change so I stopped here. I think using java maps of values instead of JsonObject for RecipeLink#getActual could give a massive speedup, but there are probably things here I haven't considered so I did not try it.

Performance Comparison (flame graphs)

Before:

(9.5 seconds)

After:

(5.7 seconds)

mezz · 2024-10-09T00:22:25Z

Common/src/main/java/com/almostreliable/unified/config/DuplicateConfig.java

@@ -60,20 +60,35 @@ public boolean shouldIgnoreRecipe(RecipeLink recipe) {
     * @return True if the recipe type is ignored, false otherwise
     */
    private boolean isRecipeTypeIgnored(RecipeLink recipe) {
-        return ignoredRecipeTypesCache.computeIfAbsent(recipe.getType(), type -> {


this gets called so much that the capturing lambda actually caused a minor performance hit, so I turned this computeIfAbsent into a regular get/put sequence

mezz · 2024-10-09T00:25:39Z

Common/src/main/java/com/almostreliable/unified/utils/JsonCompare.java

+            Set<String> compareFields = curRecipe.getActual().keySet();
+            if (!settings.ignoredFields.isEmpty()) {
+                compareFields = new HashSet<>(compareFields);
+                compareFields.removeAll(settings.ignoredFields);


Checking the ignored fields happens so frequently that it slows down loading.
It is much faster to do it once here and keeping a list of the fields we need to compare.

rlnt · 2024-10-16T09:36:22Z

Thanks a lot for this pull request and the explanation. Since I don't have a lot of time right now, it might take me a bit to look at this more closely. A lot of this is new to me. I would have assumed the JIT optimizes things like the lambda. And I also need to learn how to read flame graphs. 😄

About the equality checks of JsonObjects. I would need some more context on your suggestion of using Java maps. As far as I understand it, JsonObjects are also just map-like objects and this shouldn't make a huge difference.
We were also not happy with the approach of comparing the JSON structures just to tell if something had changed. But when working on the 1.0 update for Minecraft 1.21 and above, we implemented a new approach with something we called a JsonCursor. That approach used partial comparisons only and we even tried another approach of methods telling us whether something in the JSON changed to avoid comparison at all. However, this made no noticeable performance difference and we reverted to comparison.

The comparison also has the huge advantage of simplicity. With a proper API introduced in 1.0, mod authors can register their own custom unifiers that tell Almost Unified how to treat their recipe JSONs. Since we compare the JSON afterward, we can tell whether something changed without it being a burden for mod authors to track whether their unifier changed something.

mezz · 2024-10-18T10:35:25Z

No rush!

A flame graph is like a stack trace. The bottom is the first parent method, and then on top of it are the children methods called from it.
The children methods are sorted left to right so that the ones that take more time are on the left.
The pictures I shared aren't going to give you a perfect sense of what changed, but you can see roughly where time was spent before vs after.

The capturing lambda is pretty fast in Java but it has to create something like an object in order to store (capture) the current state outside of itself when it is created, in order to access the variables it wants to use when it is called. Normally you can ignore the performance hit, it really doesn't matter, but in this specific case it's being called an incredible number of times so it actually has some measurable impact.

JsonObject is great here because it does exactly what you want to do and does it simply, I think you had the right idea to use it. Replacing it with anything better may be more work.
Unfortunately Gson does something a bit lazy with the values by design, they're stored in a way that's really generic so equals and hashcode comparisons have a lot of extra overhead. Each value is put into a JsonPrimitive which can hold a String, Integer, Long, Double, etc.
Take a look at the equals method and you'll see why it can be a bit slow when called a bajillion times in a row:
https://github.com/google/gson/blob/2ce6a6106c22bf0cce2e39f68b0da2f117d6e66b/gson/src/main/java/com/google/gson/JsonPrimitive.java#L267-L295
It's sort of parsing the values every time you want to do something with the JsonPrimitive.

We can do better than the Gson implementation because we already know the actual type of the values after parsing the Json, so we can create and do comparisons on a simple Map<String, Object> (where the Object is parsed value like an Integer or String), which is more efficient than doing comparisons on JsonObject's internal Map<String, JsonElement>.
I think it should be possible to create that map before doing the comparisons, so the API and everything else stay the same. I just didn't want to get into it in the same PR because it's a lot more complicated to try it out and justify heh.

mezz · 2024-11-04T02:18:08Z

updated to resolve merge conflicts

rlnt · 2024-12-01T19:31:12Z

Common/src/main/java/com/almostreliable/unified/utils/JsonCompare.java

@@ -270,6 +268,17 @@ public String getName() {
        }
    }

+    public record CompareContext(CompareSettings settings, List<String> compareFields) {


Is there a specific reason you are using a list instead of a set here? Could even be a collection or iterable only. I am just asking because inside the create method, you retrieve the recipe key set and when the ignored fields are not empty, you copy the collection twice. Once inside the new HashSet and then with List.copyOf inside the CTor call. Is is because you want to make it immutable again?

The PR is focused on efficiency of the hot path, which in this case (according to profiling) is the path that iterates over the compareFields. We can sloppily make lots of copies elsewhere and it'll take less than 1ms in total, but if we iterate over a HashSet in a hot loop it will take longer than a List for example, so the main requirement is to enforce compareFields is a type that is fast to iterate over.

The Set is used earlier because it's a quick way to remove any duplicates and remove the ignored fields. You can do the same thing in many ways but there are tradeoffs between copying, contains performance, and clarity. Since the code there isn't called as much, I think I just tried to make it easy to read.

Optimize startup time

8dd052c

mezz commented Oct 9, 2024

View reviewed changes

Merge branch '1.21.1' into optimize-load

3dbc624

rlnt reviewed Dec 1, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize startup time #97

Optimize startup time #97

mezz commented Oct 9, 2024

mezz Oct 9, 2024

mezz Oct 9, 2024

rlnt commented Oct 16, 2024

mezz commented Oct 18, 2024 •

edited

Loading

mezz commented Nov 4, 2024

rlnt Dec 1, 2024

mezz Dec 2, 2024 •

edited

Loading

Optimize startup time #97

Are you sure you want to change the base?

Optimize startup time #97

Conversation

mezz commented Oct 9, 2024

Proposed Changes

Additional Context

Future Work

Performance Comparison (flame graphs)

mezz Oct 9, 2024

Choose a reason for hiding this comment

mezz Oct 9, 2024

Choose a reason for hiding this comment

rlnt commented Oct 16, 2024

mezz commented Oct 18, 2024 • edited Loading

mezz commented Nov 4, 2024

rlnt Dec 1, 2024

Choose a reason for hiding this comment

mezz Dec 2, 2024 • edited Loading

Choose a reason for hiding this comment

mezz commented Oct 18, 2024 •

edited

Loading

mezz Dec 2, 2024 •

edited

Loading