Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Add Instruction::MakeArray to SSA #6071

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

Conversation

jfecher
Copy link
Contributor

@jfecher jfecher commented Sep 17, 2024

Description

Problem*

Resolves #1733

Summary*

Adds a MakeArray instruction to explicitly create a new array. Now no other ValueIds can contain other ValueIds which simplifies a couple passes.

Additional Context

Still working on getting this working in brillig.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@TomAFrench
Copy link
Member

Seems like --show-ssa fixes the compilation failures on the few programs I tried.

@TomAFrench
Copy link
Member

Disabling the normalization of valueids shows that we're deleting instructions in the DIE pass which shouldn't be removed. For example the load instruction when returning arr in quicksort is removed from array_sort.

This is happening because somehow the return value of this function (the loaded value) becomes a Value::Function rather than a Value::Instruction. We only mark return values as used if they're constants, instructions or block params so we end up removing the loaded instruction.

@TomAFrench
Copy link
Member

I've merged in master so that this branch is up to date so will be handing this PR back now.

Copy link
Contributor

github-actions bot commented Oct 9, 2024

Changes to Brillig bytecode sizes

Generated at commit: f98500ed125bc4be7c289f184a9e6379fbea64ba, compared to commit: cfc22cb0ca133fce49a25c3f055f5a6b8bd9b58e

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
databus_composite_calldata -151 ✅ -27.61%
databus -28 ✅ -29.47%
databus_two_calldata_simple -36 ✅ -30.77%

Full diff report 👇
Program Brillig opcodes (+/-) %
slices 2,525 (+262) +11.58%
sha2_byte 3,701 (+379) +11.41%
brillig_slices 586 (+54) +10.15%
merkle_insert 835 (+57) +7.33%
bigint 2,242 (+152) +7.27%
schnorr 1,560 (+104) +7.14%
brillig_unitialised_arrays 48 (+1) +2.13%
poseidonsponge_x5_254 4,565 (+69) +1.53%
regression_5252 4,945 (+72) +1.48%
eddsa 11,026 (+141) +1.30%
poseidon_bn254_hash 5,755 (+69) +1.21%
poseidon_bn254_hash_width_3 5,755 (+69) +1.21%
nested_array_dynamic 2,170 (+22) +1.02%
uhashmap 16,319 (+111) +0.68%
hashmap 26,356 (+111) +0.42%
databus_two_calldata 220 (-47) -17.60%
databus_composite_calldata 396 (-151) -27.61%
databus 67 (-28) -29.47%
databus_two_calldata_simple 81 (-36) -30.77%

Copy link
Contributor

github-actions bot commented Oct 9, 2024

Changes to circuit sizes

Generated at commit: f98500ed125bc4be7c289f184a9e6379fbea64ba, compared to commit: cfc22cb0ca133fce49a25c3f055f5a6b8bd9b58e

🧾 Summary (10% most significant diffs)

Program ACIR opcodes (+/-) % Circuit size (+/-) %
hashmap -25,221 ✅ -39.31% -54,701 ✅ -39.65%

Full diff report 👇
Program ACIR opcodes (+/-) % Circuit size (+/-) %
hashmap 38,934 (-25,221) -39.31% 83,258 (-54,701) -39.65%

@jfecher jfecher marked this pull request as ready for review October 9, 2024 11:43
@jfecher jfecher requested a review from a team October 9, 2024 11:43
@jfecher
Copy link
Contributor Author

jfecher commented Oct 9, 2024

This is passing all tests now and showing a surprisingly large improvement in the hashmap test but also large regressions in brillig bytecode sizes.

@TomAFrench
Copy link
Member

This is likely due to the fact that constant arrays are no long considered as "constants" by constant_allocation.rs. We're then not hoisting their usage up to avoid multiple initialisations.

@TomAFrench TomAFrench linked an issue Oct 9, 2024 that may be closed by this pull request
@jfecher
Copy link
Contributor Author

jfecher commented Oct 9, 2024

@TomAFrench hmm I tried updating is_constant_value there but doing so just leads to many panics on ValueIds not being available.

Copy link
Contributor

@michaeljklein michaeljklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jfecher
Copy link
Contributor Author

jfecher commented Oct 9, 2024

Going to hold off on merging this for a bit due to the brillig size regressions

@vezenovm
Copy link
Contributor

vezenovm commented Oct 9, 2024

Going to hold off on merging this for a bit due to the brillig size regressions

Yeah would be good to avoid this regression.

@sirasistant
Copy link
Contributor

sirasistant commented Oct 21, 2024

This is likely due to the fact that constant arrays are no long considered as "constants" by constant_allocation.rs. We're then not hoisting their usage up to avoid multiple initialisations.

yes, I think that's the issue. For example, in slices, we have patterns like this:

    inc_rc v207
    v208 = make_slice [Field 0, Field 0]
    inc_rc v208
    v209 = eq v0, v1
    v210 = not v209
    jmpif v210 then: b18, else: b17
  b18():
    v212 = eq v0, Field 20
    v213 = not v212
    jmpif v213 then: b20, else: b19
  b20():
    v214 = make_slice [Field 0, Field 0, v1]
    v215 = make_slice [Field 0, Field 0, v1]
    v216 = make_slice [Field 0, Field 0, v1]
    jmp b21(u32 3, v216)
  b21(v10: u32, v11: [Field]):
    jmp b22(v10, v11)
  b22(v12: u32, v13: [Field]):
    inc_rc v13
    v217 = eq v12, u32 3
    constrain v12 == u32 3
    v218 = lt u32 2, v12
    constrain v218 == u1 1 '"Index out of bounds"'
    v219 = array_get v13, index u32 2
    v220 = eq v219, Field 10
    constrain v219 == Field 10
    v221 = make_slice [Field 0, Field 0]

Previously, these immediate slices would have been picked up by constant_allocation and hoisted to just 1 single initialization. Now, since array constant initialization has been promoted to a instruction, we actually need to do at the ssa level what constant allocation was doing outside SSA. So in order to avoid any regression we need to identify payloads to make_array that are equal to each other and hoist the make_array to the common dominator of all the arrays that are equal.

Also, if all the values are immediates constant_allocation was hoisting them out of loops to avoid extra runtime cost. But that won't affect bytecode size

Copy link
Contributor

github-actions bot commented Nov 12, 2024

Changes to number of Brillig opcodes executed

Generated at commit: f98500ed125bc4be7c289f184a9e6379fbea64ba, compared to commit: cfc22cb0ca133fce49a25c3f055f5a6b8bd9b58e

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
wildcard_type -195 ✅ -28.43%
databus_two_calldata_simple -49 ✅ -29.70%
databus -41 ✅ -33.33%
fold_2_to_17 -130,344,860 ✅ -99.15%
bench_2_to_17 -144,084,489 ✅ -99.58%

Full diff report 👇
Program Brillig opcodes (+/-) %
side_effects_constrain_array 84 (+14) +20.00%
merkle_insert 3,587 (+231) +6.88%
poseidon_bn254_hash 176,503 (+8,004) +4.75%
poseidon_bn254_hash_width_3 176,503 (+8,004) +4.75%
regression_5252 988,641 (+42,835) +4.53%
poseidonsponge_x5_254 198,048 (+8,211) +4.33%
array_dynamic_blackbox_input 20,108 (+747) +3.86%
slices 3,966 (+136) +3.55%
schnorr 10,489 (+309) +3.04%
brillig_slices 729 (+18) +2.53%
brillig_unitialised_arrays 42 (+1) +2.44%
eddsa 747,745 (+12,086) +1.64%
uhashmap 152,250 (+1,428) +0.95%
regression_struct_array_conditional 1,305 (+11) +0.85%
nested_array_dynamic 3,084 (+22) +0.72%
hashmap 56,600 (+258) +0.46%
brillig_cow_regression 583,578 (-9) -0.00%
regression_4449 219,123 (-9) -0.00%
sha2_byte 47,175 (-7) -0.01%
simple_shield 2,558 (-21) -0.81%
pedersen_hash 566 (-9) -1.57%
brillig_pedersen 1,117 (-18) -1.59%
pedersen_check 1,117 (-18) -1.59%
regression 2,625 (-60) -2.23%
array_to_slice 2,288 (-54) -2.31%
pedersen_commitment 269 (-9) -3.24%
slice_loop 1,216 (-48) -3.80%
databus_two_calldata 444 (-47) -9.57%
databus_composite_calldata 615 (-147) -19.29%
wildcard_type 491 (-195) -28.43%
databus_two_calldata_simple 116 (-49) -29.70%
databus 82 (-41) -33.33%
fold_2_to_17 1,115,643 (-130,344,860) -99.15%
bench_2_to_17 601,271 (-144,084,489) -99.58%

@jfecher
Copy link
Contributor Author

jfecher commented Nov 12, 2024

fold_2_to_17 -130,344,860 ✅ -99.15%
bench_2_to_17 -144,084,489 ✅ -99.58%

How about this for brillig improvements 😅

@jfecher
Copy link
Contributor Author

jfecher commented Nov 12, 2024

Unfortunately the actual brillig bytecode size is still a bit bigger and most brillig programs still execute more opcodes 🤔

Edit: some of them, like brillig_slices may be because of the lack of the array hoisting optimization. The array deduplication check in constant folding doesn't simplify these since they're all in else branches so there's no single case of these that is in a black which dominates all the others. Fixing this seems somewhat difficult since it'd require checking every element of the array, checking which blocks they were defined in, and trying to hoist to the earliest one. Although doing so could potentially lead to more work if those else blocks were never executed and we hoist out of the if.

@sirasistant
Copy link
Contributor

sirasistant commented Nov 14, 2024

Although doing so could potentially lead to more work if those else blocks were never executed and we hoist out of the if

Maybe we could hoist out of loops? that one should always lead to better perf I think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

keccakf1600 instructions aren't being deduplicated as expected Add MakeArray instruction to the ssa refactor
6 participants