Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gas griefing/attack via creating the proposals #24

Open
howlbot-integration bot opened this issue Oct 27, 2024 · 16 comments
Open

Gas griefing/attack via creating the proposals #24

howlbot-integration bot opened this issue Oct 27, 2024 · 16 comments
Labels
2 (Med Risk) Assets not at direct risk, but function/availability of the protocol could be impacted or leak value bug Something isn't working downgraded by judge Judge downgraded the risk level of this issue edited-by-warden M-01 primary issue Highest quality submission among a set of duplicates 🤖_primary AI based primary recommendation 🤖_19_group AI based duplicate group recommendation satisfactory satisfies C4 submission criteria; eligible for awards selected for report This submission will be included/highlighted in the audit report sufficient quality report This report is of sufficient quality

Comments

@howlbot-integration
Copy link

Lines of code

https://github.com/code-423n4/2024-10-kleidi/blob/c474b9480850d08514c100b415efcbc962608c62/src/Timelock.sol#L512-L539
https://github.com/code-423n4/2024-10-kleidi/blob/c474b9480850d08514c100b415efcbc962608c62/src/Timelock.sol#L652-L665

Vulnerability details

Explanation

The timelock acts in a way that once the proposals are submitted, they need to be cancelled or executed. This behaviour opens up a griefing attack vector towards the owners of the vault in case at least threshold amount of owners' private keys are exposed.

When the keys are exposed, the attackers can send as many transactions as they need to the network from the safe with different salts. Even if one of the transactions go through, funds can be stolen. The protocol defence mechanisms in these situations is (1) Pause guardian can cancel all the proposals (2) Cold signers can cancel proposals.
Both these defence mechanisms require gas usage from the victim's accounts, and it is important to note that they can not use the funds inside the Kleidi wallet. This can lead to a gas war between attackers and the victims and can cause them to at least cause a griefing attack.

Impact

Assumption in this section is that the victims do not get external help and they have invested most of their liquidity inside Kleidi, and only kept minimal amounts out for gas payments.

  • Imagine if victims have access to F amounts of funds, and 95% of those funds is locked into Kleidi.
  • The proof of concept below shows that the gas consumption of cancel is close to 5% of schedule.
  • In case the keys are compromised, attackers can send many transactions spending G amount of gas. The requires the victims to need to spend 0.05 * G in gas to cancel those proposals.
  • The reward for attackers, only if one of their transactions go through, is 0.95 * F.
  • Given that victims only have access to 0.05 * F to pay for 0.05 * G, if attackers pay more than the funds inside the protocol, meaning (G > F), they can claim the funds in the protocol and drain it as victims do not have enough funds to cancel all proposals.

At the end, attackers can re-claim most of what they spent. Overall spending G - 0.95 * F = G - 0.95 * G = 0.05 * G, and steal 0.95 * G from the user.

Note: In case the victims have invested more than ~95% into the Kleidi, attackers will be able to make profit.

Proof of Concept

Gas consumptions is thoroughly investigated in the test below:

    function testGasConsumption() public {

        bytes32 scheduleSalt = bytes32("saltxyz");
        uint256 numOfProposals = 100000;
        bytes32[] memory saltArray = new bytes32[](numOfProposals);

        for(uint i; i < numOfProposals; i++) {
            saltArray[i] = keccak256(abi.encodePacked("salt", bytes32(i + 1)));
        }

        bytes memory scheduleData = abi.encode(timelock.updateDelay, MINIMUM_DELAY);
        address timelockAddress = address(timelock);


        // initial call costs more gas
        vm.prank(address(safe));
        timelock.schedule(
            timelockAddress,
            0,
            scheduleData,
            scheduleSalt,
            MINIMUM_DELAY
        );

        vm.startPrank(address(safe));
        uint256 gasBeforeSchedule = gasleft();
        for(uint256 i; i < numOfProposals; i++){
            timelock.schedule(
                timelockAddress,
                0,
                scheduleData,
                saltArray[i],
                MINIMUM_DELAY
            );   
        }
        uint256 gasAfterSchedule = gasleft();
        vm.stopPrank();

        bytes32[] memory ids = new bytes32[](numOfProposals);

        for(uint256 i; i < numOfProposals; i++){
            ids[i] = timelock.hashOperation(
                address(timelock),
                0,
                scheduleData,
                saltArray[i]
            );
        }

        vm.startPrank(timelock.pauseGuardian());
        uint256 gasBeforeCancel = gasleft();
        timelock.pause(); // 10000 -> 32,260,154 4.6%
        uint256 gasAfterCancel = gasleft();
        vm.stopPrank();

        // vm.startPrank(address(safe));
        // uint256 gasBeforeCancel = gasleft();
        // for(uint256 i; i < numOfProposals; i++){
        //     timelock.cancel(ids[i]); // 10000 -> 44,890,040  448,900,040 6%
        // }
        // uint256 gasAfterCancel = gasleft();
        // vm.stopPrank();

        // For 100,000 proposals
        // shecdule 7,398,200,040
        // pause guardian pause 340,048,201 ~ 4.6%
        // safe cancel 448,900,040 ~ 6%



        console.log("Gas consumption of schedule: ", gasBeforeSchedule - gasAfterSchedule); // 10000 -> 739,820,040 7,398,200,040
        console.log("Gas consumption of cancel: ", gasBeforeCancel - gasAfterCancel);
    }

Recommended Mitigation Steps

Add epochs to the timelock, each time the contract is paused, move the epoch to the next variable. Also, include epochs in the transaction hashes, and only execute transactions from this epoch. This way, the pause guardian does not need to clear all the transactions one by one, and once the epoch is moved to the next stage, all the previous transactions will be automatically invalidated.

Assessed type

Other

@howlbot-integration howlbot-integration bot added 3 (High Risk) Assets can be stolen/lost/compromised directly 🤖_19_group AI based duplicate group recommendation 🤖_primary AI based primary recommendation bug Something isn't working duplicate-5 edited-by-warden sufficient quality report This report is of sufficient quality labels Oct 27, 2024
howlbot-integration bot added a commit that referenced this issue Oct 27, 2024
@c4-judge
Copy link

GalloDaSballo marked the issue as duplicate of #4

@c4-judge c4-judge reopened this Oct 30, 2024
@c4-judge c4-judge added primary issue Highest quality submission among a set of duplicates selected for report This submission will be included/highlighted in the audit report labels Oct 30, 2024
@c4-judge
Copy link

GalloDaSballo marked the issue as selected for report

@GalloDaSballo
Copy link

Making it primary because of the POC

@c4-judge c4-judge added 2 (Med Risk) Assets not at direct risk, but function/availability of the protocol could be impacted or leak value downgraded by judge Judge downgraded the risk level of this issue and removed 3 (High Risk) Assets can be stolen/lost/compromised directly labels Oct 30, 2024
@c4-judge
Copy link

GalloDaSballo changed the severity to 2 (Med Risk)

@GalloDaSballo
Copy link

I adapted the test, that can be dropped in Timelock.t.sol to verify my statements:

function testGasConsumption() public {

        bytes32 scheduleSalt = bytes32("saltxyz");
        uint256 numOfProposals = 1000;
        bytes32[] memory saltArray = new bytes32[](numOfProposals);

        for(uint i; i < numOfProposals; i++) {
            saltArray[i] = keccak256(abi.encodePacked("salt", bytes32(i + 1)));
        }

        bytes memory scheduleData = abi.encode(timelock.updateDelay, MINIMUM_DELAY);
        address timelockAddress = address(timelock);


        // initial call costs more gas
        vm.prank(address(safe));
        timelock.schedule(
            timelockAddress,
            0,
            scheduleData,
            scheduleSalt,
            MINIMUM_DELAY
        );

        // Schedule until we consume 30 MLN Gas
        vm.startPrank(address(safe));
        uint256 gasBeforeSchedule = gasleft();
        uint256 count;
        while(true) {
            timelock.schedule(
                timelockAddress,
                0,
                scheduleData,
                saltArray[count],
                MINIMUM_DELAY
            );  
            count++; 

            // Stop at 30 MLN gas used
            if(gasBeforeSchedule - gasleft() > 30e6) {
                break;
            } 
        }

        console.log("count", count);

        uint256 gasAfterSchedule = gasleft();
        vm.stopPrank();

        vm.startPrank(timelock.pauseGuardian());
        uint256 gasBeforeCancel = gasleft();
        timelock.pause(); // 10000 -> 32,260,154 4.6%
        uint256 gasAfterCancel = gasleft();
        vm.stopPrank();

        // vm.startPrank(address(safe));
        // uint256 gasBeforeCancel = gasleft();
        // for(uint256 i; i < numOfProposals; i++){
        //     timelock.cancel(ids[i]); // 10000 -> 44,890,040  448,900,040 6%
        // }
        // uint256 gasAfterCancel = gasleft();
        // vm.stopPrank();

        // For 100,000 proposals
        // shecdule 7,398,200,040
        // pause guardian pause 340,048,201 ~ 4.6%
        // safe cancel 448,900,040 ~ 6%



        console.log("Gas consumption of schedule: ", gasBeforeSchedule - gasAfterSchedule); // 10000 -> 739,820,040 7,398,200,040
        console.log("Gas consumption of cancel: ", gasBeforeCancel - gasAfterCancel);
    }

It's worth noting that the POC doesn't work in isolation, leading me to believe that the math given is incorrect

I have ran my POC in both modes, and both versions seems to indicate that the cost to attack is a lot higher than the cost to defend, specifically the attack is 7 times more expensive than defending

I'm not fully confident that Foundry treats the calls as isolated in this way, so I'm happy to be corrected

Result from forge test --match-test testGasConsumption -vv --isolate

Ran 1 test for test/unit/Timelock.t.sol:TimelockUnitTest
[PASS] testGasConsumption() (gas: 33562952)
Logs:
  count 282
  Gas consumption of schedule:  30021964
  Gas consumption of cancel:  4053325

7 times more expensive

Result from forge test --match-test testGasConsumption -vv

Ran 1 test for test/unit/Timelock.t.sol:TimelockUnitTest
[PASS] testGasConsumption() (gas: 25463501)
Logs:
  count 403
  Gas consumption of schedule:  30049168
  Gas consumption of cancel:  1307414

22 times more expensive


Barring a mistake from me, I think the finding is valid and Medium is the most appropriate as the guardian can with some likelihood prevent it as the cost of the attack and the setup is higher than the cost to defend

Also the attack must be done over multiple blocks

@GalloDaSballo
Copy link

Mitigation would require changing the way initiatives are tracked

By simply shifting a "valid ts" all initiatives created and queued before it can be made invalid, this makes the change a O(1) meaning it should not longer be dossable

@ElliotFriedman
Copy link

I think that cost on the attack side is likely more expensive than your PoC shows because it just pranks as the safe, and doesn't generate the signatures, have them validated in the gnosis safe + increment the nonce in the gnosis safe + 21k base transaction cost. When you add all of that together, it would have to be at least 30x more expensive to attack than to defend.

@ElliotFriedman
Copy link

Mitigation is in here: solidity-labs-io/kleidi#53

@GalloDaSballo
Copy link

I generally agree, when also considering memory expansion costs that should happen when dealing with so many signatures

I think Medium Severity is the most appropriate because the attack is IMO not possible in one block, but to say this could have been prevented would be incorrect

Fundamentally, if the guardian doesn't urgently pause, they may not be able to within a few blocks (stricly more than 1)

Medium seems appropriate given this

@gesha17
Copy link

gesha17 commented Nov 5, 2024

Hello @GalloDaSballo,

I think that cost on the attack side is likely more expensive than your PoC shows because it just pranks as the safe, and doesn't generate the signatures, have them validated in the gnosis safe + increment the nonce in the gnosis safe + 21k base transaction cost. When you add all of that together, it would have to be at least 30x more expensive to attack than to defend.

You will need to generate and validate signatures both for attacking and defending. What you listed out applies to both the attack and defender and as a result the costs for either will be a lot closer indicated here or in any of the PoC shown here. Also, the attacker can use a smart contract to batch transactions and lower his gas costs, which would further bring the gap closer.


Addressing the other comment:

Barring a mistake from me, I think the finding is valid and Medium is the most appropriate as the guardian can with some likelihood prevent it as the cost of the attack and the setup is higher than the cost to defend.

As the sponsor commented on #5:

The desired end user is an individual and not a DAO or protocol team.

You can't expect a normal end user to react that fast to save his wallet. The core promise of Kleidi is that if the signer keys are compromised you can safely recover your wallet. The timelock duration can be set up to 30 days. It is expected that if the user reacts in days or even weeks after the initial attack he can still save his wallet. Meanwhile, as I showed with my calculations on the submission #5, the attacker can bring the gas costs in the millions of dollars in a matter of hours.

This attack will result in a complete loss of funds for the users and should be a high severity issue.

@DemoreXTess
Copy link

I don't think it can be a high severity issue. The cost of the attack vector is extremely high. No one ( Attacker ) can take a risk and assume the victim's 95% of the funds is in the wallet. If attacker assumes that and take action based on that, it will be a gambling for the attacker. It can be medium severity issue at most.

We can only validate the issue as DoS because pause functionality is impacted here for guardian role.

Med: Assets not at direct risk, but the function of the protocol or its availability could be impacted, or leak value with a hypothetical attack path with stated assumptions, but external requirements.

@gesha17
Copy link

gesha17 commented Nov 5, 2024

I don't think it can be a high severity issue. The cost of the attack vector is extremely high. No one ( Attacker ) can take a risk and assume the victim's 95% of the funds is in the wallet. If attacker assumes that and take action based on that, it will be a gambling for the attacker. It can be medium severity issue at most.

We can only validate the issue as DoS because pause functionality is impacted here for guardian role.

Med: Assets not at direct risk, but the function of the protocol or its availability could be impacted, or leak value with a hypothetical attack path with stated assumptions, but external requirements.

The attack cost will be high only on mainnet. On L2s the attacker can submit more messages than the victim has time to cancel. Note that the current implementation only allows canceling individual entries, so the user will have to cancel transactions individually. And since he is just a normal end user, we can't assume a sophisticated approach in a limited timeframe.

Suppose a scenario where the victim has 24 hours to cancel the malicios transcations before the timelock delay passed and the attacker takes over his wallet, but there are 100 thousand transactions to cancel. There is no way the user can cancel all of them. This would be valid both on mainnet and L2s.

Also, the DoS is really a side effect and doesnt make much of a difference in this scenario. Considering the fix that the developers implemented, the role of pauser guardian becomes somewhat redundant as long as the users have proper configuration of delays. See my comment on #5 for elaboration on that.

@DemoreXTess
Copy link

Gas price is not important here, there is no difference between L1 and L2 in this attack vector. Defence cost will decrease as attack cost decrease.

Furthermore, in defence part we don't need to race against attacker. We can directly use recovery spell and then we can clear proposals until pause() gas cost reach a healthy level and then pause() can clear all the malicious proposals.

@gesha17
Copy link

gesha17 commented Nov 6, 2024

Gas price is not important here, there is no difference between L1 and L2 in this attack vector. Defence cost will decrease as attack cost decrease.

Furthermore, in defence part we don't need to race against attacker. We can directly use recovery spell and then we can clear proposals until pause() gas cost reach a healthy level and then pause() can clear all the malicious proposals.

I think you are confusing this attack with the simpler DoS vector. The whole point is you cant "clear proposals until pause() gas cost reach a healthy level". That is possible only in the simpler DoS vector if the user configuration is correct, but not here.

@DemoreXTess
Copy link

I understood your point. I want to express my arguments why it shouldn't be classified different than DoS:

  1. Root cause is OOG and DoS in pause()
  2. Attack needs multiple blocks to cause DoS in pause()
  3. We can't assume guardian is inactive here, because this will be unreasonable to assume it, it makes guardian pointless here.
  4. Let say we assumed guardian was inactive while attack, after recovery spell, user can clear the proposals. You can say it's not possible, I understood your point but again you can't assume user will do that one by one, he can clear by external tools like Multicall3 or he can choose another way to do in batch.

Another note: I believe we can't assume guardian is inactive as I stated in argument 3. So, I added my 4th argument for just clarification.

@C4-Staff C4-Staff added the M-01 label Nov 13, 2024
@thebrittfactor thebrittfactor added the satisfactory satisfies C4 submission criteria; eligible for awards label Nov 13, 2024
@thebrittfactor
Copy link

C4 staff have added the satisfactory label for awarding needs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 (Med Risk) Assets not at direct risk, but function/availability of the protocol could be impacted or leak value bug Something isn't working downgraded by judge Judge downgraded the risk level of this issue edited-by-warden M-01 primary issue Highest quality submission among a set of duplicates 🤖_primary AI based primary recommendation 🤖_19_group AI based duplicate group recommendation satisfactory satisfies C4 submission criteria; eligible for awards selected for report This submission will be included/highlighted in the audit report sufficient quality report This report is of sufficient quality
Projects
None yet
Development

No branches or pull requests

7 participants