Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CIP-0021 Transformation Breaks Script Integrity #871

Open
fallen-icarus opened this issue Aug 1, 2024 · 18 comments
Open

CIP-0021 Transformation Breaks Script Integrity #871

fallen-icarus opened this issue Aug 1, 2024 · 18 comments

Comments

@fallen-icarus
Copy link

fallen-icarus commented Aug 1, 2024

CIP-21 describes how hardware wallets require transactions to be formatted in order to be signed by them. cardano-cli does not produce transactions that satisfy this CIP so the tx.body file must be transformed after being created (docs). However, the cardano-hw-interop-lib linked in this CIP actually breaks transactions using smart contracts with more complicated redeemers. I opened the corresponding github issue here. The flow is effectively:

  1. Build the transaction with cardano-cli.
  2. Transform the transaction with cardano-hw-cli transaction transform which internally uses the cardano-hw-interop-lib.
  3. Sign the transformed transaction.
  4. Submit the signed transformed transaction.

If there are any smart contracts using complicated redeemers in the transaction, step 4 will always fail with a ScriptIntegrity error. It seems to be the transformation step that is breaking the transaction. You can even do the transformation on transactions being signed by CLI keys and you will still get the ScriptIntegrity submission error. This bug makes it impossible to use cardano-hw-cli to sign these transactions since it requires all transactions to be transformed to satisfy this CIP. I'm not aware of other libraries that enable users to transform and then sign transactions with hardware wallets.

I can't find any mention of redeemers in this CIP so I am now wondering if there is a piece missing from this CIP's Specification. Redeemers use CBOR as well, but the CBOR in the redeemer specifies which action is being taken with that script. Redeemers should NOT be forced to comply with this CIP. Altering the redeemers to "make the integers as small as possible" will actually change what that script will do! I think the ScriptIntegrity error is highlighting that the transformation is changing the redeemers since the integrity hash is a hash of the script + datum + redeemer used for a specific script execution (cddl spec). I think this CIP needs to be updated to explain what should happen with redeemers.

(If the CIP editors do not think this is an appropriate topic for CIP Issues, feel free to close this issue.)

@gitmachtl
Copy link
Contributor

gitmachtl commented Aug 1, 2024

The problem is and will always be that cli does not produce transactions with data in canonical order. HW Wallets have a memory limitation and are therefore not capable of loading whole sets into the memory for checks. The data must be processed in a stream. So f.e. to find duplicated entries, canonical order for HW Wallets is mandatory. Cli refused to switch to canonical order years ago.

So i wonder if this can be somehow taken into account for the redeemer issue you pointed out.

@fallen-icarus
Copy link
Author

So i wonder if this can be somehow taken into account for the redeemer issue you pointed out.

No, it can't. The order of the data for datums and redeemers has internal meaning to smart contracts. There is no such thing as "canonical order" for datums and redeemers. If you change the order of these elements, you will change the behavior of the smart contracts. AFAIK this is what the ScriptIntegrity error means; the transformation is fundamentally changing the behavior of the smart contract being executed. Standardizing datums and redeemers would likely kill a significant number of smart contract use cases because each use case is likely unique enough to require unique datums and redeemers.

I think the simple fix is to just not transform the datums and redeemers. The rest of the transaction can still be standardized. I don't see how this would negatively impact hardware wallet verification.

@rphair
Copy link
Collaborator

rphair commented Aug 1, 2024

@fallen-icarus @gitmachtl I think it will help to get some wallet dev visibility on this & have added (as we do with issues sometimes) to the next CIP meeting agenda (https://hackmd.io/@cip-editors/94).

It seems pretty well documented here so far & I'm glad to see it progressing but I think in addition to raising the red flag at the meeting we also want to alert the affiliated Wallets Working Group about this (cc @Ryun1 @Crypto2099).

@fallen-icarus
Copy link
Author

This seems like a related issue. Perhaps the script data hash just needs to be recalculated after the transformation. I don't know enough typescript to really know what is happening in cardano-hw-interop-lib. As long as the smart contracts behave the same way after the transformation as they did before, I'd be fine with it. IMO this CIP needs to be updated to explain whatever solution is chosen.

@gitmachtl
Copy link
Contributor

gitmachtl commented Aug 1, 2024

So i wonder if this can be somehow taken into account for the redeemer issue you pointed out.

No, it can't. The order of the data for datums and redeemers has internal meaning to smart contracts. There is no such thing as "canonical order" for datums and redeemers. If you change the order of these elements, you will change the behavior of the smart contracts. AFAIK this is what the ScriptIntegrity error means; the transformation is fundamentally changing the behavior of the smart contract being executed. Standardizing datums and redeemers would likely kill a significant number of smart contract use cases because each use case is likely unique enough to require unique datums and redeemers.

I think the simple fix is to just not transform the datums and redeemers. The rest of the transaction can still be standardized. I don't see how this would negatively impact hardware wallet verification.

ok, but only the order of datums and redeemers, not the order of inputs and outputs?

@refi93 @davidmisiak @janmazak can you check if hw-cli is reordering the redeemer/datum while running the transform command?

@fallen-icarus
Copy link
Author

fallen-icarus commented Aug 1, 2024

ok, but only the order of datums and redeemers, not the order of inputs and outputs?

Correct. The internal ordering of the datums and redeemers should be left alone as well. Again, though, I'm not sure what the transformation is actually doing. If it is actually a benign transformation (ie, it has no impact on what the smart contract actually does), it may just be the hash needs to be recalculated.

EDIT: After looking back at the cddl, I think only the internal structure of datums and redeemers is important. The redeemer list can still be sorted, but the redeemers themselves cannot be touched.

@davidmisiak
Copy link

As was mentioned, the canonical encoding of some tx elements is essential for HW wallets security. The reason why CIP-21 requires canonical encoding of the entire tx is mostly practical I believe - usually, CBOR libraries expose only a boolean flag whether the entire structure should be encoded canonically or not. This is also the case of the cbor dependency of cardano-hw-interop-lib. Since there isn't a more granular API, re-encoding only a part of the transaction canonically is very difficult/impossible without implementing much of the CBOR logic.

The specific attributes that determine canonicality are listed here. I'm not experienced with smart contracts - can someone please reliably confirm or deny that a script can access this info? E.g.:

  • Is there a guarantee that when a datum contains a map, the script will be able to obtian the exact order of keys used in the serialized CBOR bytestring representation of the map? (Note that this is not the order of array elements - those are not affected by canonicalization.)

  • From the initial comment:

    Altering the redeemers to "make the integers as small as possible" will actually change what that script will do!

    Note that this doesn't mean changing the value of the integer, it only specifies one of the several ways how to encode the same integer according to CBOR specs. Can the script determine which encoding was used, or is the integer value the only information it sees?

The possible outcomes are:

  1. A smart contract cannot access the details of the datums/redeemers CBOR encoding. In that case, I believe that cardano-hw-interop-lib could reserialize the entire tx canonically and subsequently recalculate and replace the script data hash, like we do with auxiliary data hash.
  2. The canonicality can be detected by a smart contract. I don't know what to do in this case.

@fallen-icarus
Copy link
Author

I think it is better to have smart contract compiler devs answer these questions. They'll know a lot more than me. I am personally using aiken so I will tag them. @rvcas @MicroProofs @KtorZ

@MicroProofs
Copy link
Contributor

As was mentioned, the canonical encoding of some tx elements is essential for HW wallets security. The reason why CIP-21 requires canonical encoding of the entire tx is mostly practical I believe - usually, CBOR libraries expose only a boolean flag whether the entire structure should be encoded canonically or not. This is also the case of the cbor dependency of cardano-hw-interop-lib. Since there isn't a more granular API, re-encoding only a part of the transaction canonically is very difficult/impossible without implementing much of the CBOR logic.

The specific attributes that determine canonicality are listed here. I'm not experienced with smart contracts - can someone please reliably confirm or deny that a script can access this info? E.g.:

  • Is there a guarantee that when a datum contains a map, the script will be able to obtian the exact order of keys used in the serialized CBOR bytestring representation of the map? (Note that this is not the order of array elements - those are not affected by canonicalization.)

So cbor maps from datums and redeemers come in to smart contracts as a list of pairs based on user order. They are not reordered. So if you reordered a datum's map canonically that would make a tx that would fail in phase 1 since the hash of the cbor of the datum would not match the onchain cbor datum for a given input if that input was using a datum hash. There are likely other issues that would be present like invalid budget calculations after reordering maps.

@MicroProofs
Copy link
Contributor

Also currently the onchain would allow the datum to have maps that have multiples of the same item. So best to leave the serialization of datums and redeemers alone.

@fallen-icarus
Copy link
Author

fallen-icarus commented Sep 10, 2024

The reason why CIP-21 requires canonical encoding of the entire tx is mostly practical I believe - usually, CBOR libraries expose only a boolean flag whether the entire structure should be encoded canonically or not. This is also the case of the cbor dependency of cardano-hw-interop-lib. Since there isn't a more granular API, re-encoding only a part of the transaction canonically is very difficult/impossible without implementing much of the CBOR logic.

@davidmisiak What about a non-typescript executable/dependecy? Haskell has the cborg library which I believe does have a more granular API:

cborg targets cases where precise control over the CBOR object structure is needed such as when working with externally-specified CBOR formats.
-- quote from cborg documentation

The serialise library is built upon the cborg library. I have not used either library, but perhaps they can be used to build a more granular transformation tool.

@fallen-icarus
Copy link
Author

fallen-icarus commented Sep 30, 2024

@mpizenberg and I were investigating this issue some more and found the redeemer is changed from an indefinite-list to a definite-list as part of the transformation (in compliance with this CIP). This is the redeemer before:

         82                                                 #       array(2)
            d8 79                                           #         tag(121)
               9f                                           #           array(*)
                  d8 7a                                     #             tag(122)
                     9f                                     #               array(*)
                        58 1c                               #                 bytes(28)
                           c0f8644a01a6bf5db02f4afe30d60497 #                   "\xc0\xf8dJ\x01\xa6\xbf]\xb0/J\xfe0\xd6\x04\x97"
                           5e63dd274f1098a1738e561d         #                   "^c\xdd\'O\x10\x98\xa1s\x8eV\x1d"
                        ff                                  #                 break
                  ff                                        #             break

and this is it after:

         82                                                 #       array(2)
            d8 79                                           #         tag(121)
               81                                           #           array(1)
                  d8 7a                                     #             tag(122)
                     81                                     #               array(1)
                        58 1c                               #                 bytes(28)
                           c0f8644a01a6bf5db02f4afe30d60497 #                   "\xc0\xf8dJ\x01\xa6\xbf]\xb0/J\xfe0\xd6\x04\x97"
                           5e63dd274f1098a1738e561d         #                   "^c\xdd\'O\x10\x98\xa1s\x8eV\x1d"

Plutus Data uses indefinite lists which is in violation of this CIP. I found this note discussing why indefinite-lists are sometimes used for plutus Data. I don't know if this is a situation where the script hash just needs to be re-calculated (and possibly budgets updated).

@mpizenberg
Copy link

The main issue seems to be that CIP21 specifies that lists should have definite lengths. But as far as I understand from my discussions with @KtorZ, the inside of plutus Data should be offlimit for CIP21 because there are specific strict rules about how plutus Data should be encoded. In particular lists in Data are encoded as definite length if 0 length, but indefinite length otherwise.
So in general, I think CIP21 should not touch the inside of plutus Data, to make sure it stays encoded as expected.

@MicroProofs
Copy link
Contributor

@mpizenberg and I were investigating this issue some more and found the redeemer is changed from an indefinite-list to a definite-list as part of the transformation (in compliance with this CIP). This is the redeemer before:

         82                                                 #       array(2)
            d8 79                                           #         tag(121)
               9f                                           #           array(*)
                  d8 7a                                     #             tag(122)
                     9f                                     #               array(*)
                        58 1c                               #                 bytes(28)
                           c0f8644a01a6bf5db02f4afe30d60497 #                   "\xc0\xf8dJ\x01\xa6\xbf]\xb0/J\xfe0\xd6\x04\x97"
                           5e63dd274f1098a1738e561d         #                   "^c\xdd\'O\x10\x98\xa1s\x8eV\x1d"
                        ff                                  #                 break
                  ff                                        #             break

and this is it after:

         82                                                 #       array(2)
            d8 79                                           #         tag(121)
               81                                           #           array(1)
                  d8 7a                                     #             tag(122)
                     81                                     #               array(1)
                        58 1c                               #                 bytes(28)
                           c0f8644a01a6bf5db02f4afe30d60497 #                   "\xc0\xf8dJ\x01\xa6\xbf]\xb0/J\xfe0\xd6\x04\x97"
                           5e63dd274f1098a1738e561d         #                   "^c\xdd\'O\x10\x98\xa1s\x8eV\x1d"

Plutus Data uses indefinite lists which is in violation of this CIP. I found this note discussing why indefinite-lists are sometimes used for plutus Data. I don't know if this is a situation where the script hash just needs to be re-calculated (and possibly budgets updated).

The Script Data hash would need to be recalculated.
What checks does the hardware wallet perform on a transaction and why? I feel like the ledger would catch any mistakes made by invalid transaction signing.

@mpizenberg
Copy link

mpizenberg commented Sep 30, 2024

Hum, it’s weird, I thought Plutus Data had very strict encoding rules, but I just tried to do the same lock-unlock 2ada example, first with indefinite length encoding of the redeemer d8799f4d48656c6c6f2c20576f726c6421ff then with definite length encoding d879814d48656c6c6f2c20576f726c6421 and both worked on preview with my ledger nano x, signing via eternl. Here is the first one: https://preview.cardanoscan.io/transaction/0f5ca9dce62f97c7c1ad1064b246649f002d83874c9d86f70149de4191b38159?tab=contracts

Here is the second one: https://preview.cardanoscan.io/transaction/90706202b45cf9cfb45d402804c5c4015b1660745162e9c7e00f5651ef7748fc?tab=contracts

So on that case, it seems recomputing the script data hash was sufficient, and it worked for both definite/indefinite lengths encoding of the redeemer Data for my hardware wallet ledger nano x. Is the LNX capable of signing non-CIP21 transactions?

@mpizenberg
Copy link

So for the sake of completeness, I just did another test, where I encoded the list of inputs with indefinite length. And this time I got an error from eternl saying Ledger produced a different txHash. Please contact support to help improve Ledger support. So it does seem like the LNX hardware wallet needs definite length encoding for the Tx in general, but not for the inside of Plutus Data.

@mpizenberg
Copy link

mpizenberg commented Sep 30, 2024

I’m no Haskell expert but according to the plutus Data decoders here https://github.com/IntersectMBO/plutus/blob/b086464584f2684658633b375283994dbe04bef8/plutus-core/plutus-core/src/PlutusCore/Data.hs#L190
It seems I was wrong about the specific encoding requirements.
I mean everything EXCEPT bytes longer than 64bytes will be fine with definite length encoding. So if your redeemers are changed from indefinite to definite by a cip21 compliant postprocess, it’s not a problem, but you will need to recompute the script data hash.

So at least the specific issue you were having @fallen-icarus are solvable by recomputing the script data hash.

Now regarding other transformations like deduplication or re-ordering of elements accessible from a plutus contract. As Micro said, it will cause some issues. So if leaving the redeemers and datums unchanged is a possibility for hardware wallets, then that would be ideal.

And in fact, the introduction of the CIP21 spec only mentions restrictions about the transaction body, because that’s the part which is signed by hardware wallets.

Consequently, a HW wallet only provides witness signatures, and the transaction body which was signed has to be reconstructed by the client.

So my guess is the hardware wallets do not inspect the WitnessSet part of the transaction. And consequently, there is no good reason to transform that part of the transaction with CIP21 post-processing tools like cardano-hw-cli.

@KtorZ
Copy link
Member

KtorZ commented Oct 1, 2024

Surely, the cardano-hw-cli and other transformation tools are meant to be used before attempting to sign a transaction? Then, the only remaining "problem" left is the computing of the script integrity hash, which ought to be the responsibility of the cardano-hw-cli IMO; so this issue shall rather be moved there or simply, the tool fixed via a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants