Gas griefing/attack via creating the proposals #24

howlbot-integration · 2024-10-27T19:54:58Z

Lines of code

https://github.com/code-423n4/2024-10-kleidi/blob/c474b9480850d08514c100b415efcbc962608c62/src/Timelock.sol#L512-L539
https://github.com/code-423n4/2024-10-kleidi/blob/c474b9480850d08514c100b415efcbc962608c62/src/Timelock.sol#L652-L665

Vulnerability details

Explanation

The timelock acts in a way that once the proposals are submitted, they need to be cancelled or executed. This behaviour opens up a griefing attack vector towards the owners of the vault in case at least threshold amount of owners' private keys are exposed.

When the keys are exposed, the attackers can send as many transactions as they need to the network from the safe with different salts. Even if one of the transactions go through, funds can be stolen. The protocol defence mechanisms in these situations is (1) Pause guardian can cancel all the proposals (2) Cold signers can cancel proposals.
Both these defence mechanisms require gas usage from the victim's accounts, and it is important to note that they can not use the funds inside the Kleidi wallet. This can lead to a gas war between attackers and the victims and can cause them to at least cause a griefing attack.

Impact

Assumption in this section is that the victims do not get external help and they have invested most of their liquidity inside Kleidi, and only kept minimal amounts out for gas payments.

Imagine if victims have access to F amounts of funds, and 95% of those funds is locked into Kleidi.
The proof of concept below shows that the gas consumption of cancel is close to 5% of schedule.
In case the keys are compromised, attackers can send many transactions spending G amount of gas. The requires the victims to need to spend 0.05 * G in gas to cancel those proposals.
The reward for attackers, only if one of their transactions go through, is 0.95 * F.
Given that victims only have access to 0.05 * F to pay for 0.05 * G, if attackers pay more than the funds inside the protocol, meaning (G > F), they can claim the funds in the protocol and drain it as victims do not have enough funds to cancel all proposals.

At the end, attackers can re-claim most of what they spent. Overall spending G - 0.95 * F = G - 0.95 * G = 0.05 * G, and steal 0.95 * G from the user.

Note: In case the victims have invested more than ~95% into the Kleidi, attackers will be able to make profit.

Proof of Concept

Gas consumptions is thoroughly investigated in the test below:

    function testGasConsumption() public {

        bytes32 scheduleSalt = bytes32("saltxyz");
        uint256 numOfProposals = 100000;
        bytes32[] memory saltArray = new bytes32[](numOfProposals);

        for(uint i; i < numOfProposals; i++) {
            saltArray[i] = keccak256(abi.encodePacked("salt", bytes32(i + 1)));
        }

        bytes memory scheduleData = abi.encode(timelock.updateDelay, MINIMUM_DELAY);
        address timelockAddress = address(timelock);


        // initial call costs more gas
        vm.prank(address(safe));
        timelock.schedule(
            timelockAddress,
            0,
            scheduleData,
            scheduleSalt,
            MINIMUM_DELAY
        );

        vm.startPrank(address(safe));
        uint256 gasBeforeSchedule = gasleft();
        for(uint256 i; i < numOfProposals; i++){
            timelock.schedule(
                timelockAddress,
                0,
                scheduleData,
                saltArray[i],
                MINIMUM_DELAY
            );   
        }
        uint256 gasAfterSchedule = gasleft();
        vm.stopPrank();

        bytes32[] memory ids = new bytes32[](numOfProposals);

        for(uint256 i; i < numOfProposals; i++){
            ids[i] = timelock.hashOperation(
                address(timelock),
                0,
                scheduleData,
                saltArray[i]
            );
        }

        vm.startPrank(timelock.pauseGuardian());
        uint256 gasBeforeCancel = gasleft();
        timelock.pause(); // 10000 -> 32,260,154 4.6%
        uint256 gasAfterCancel = gasleft();
        vm.stopPrank();

        // vm.startPrank(address(safe));
        // uint256 gasBeforeCancel = gasleft();
        // for(uint256 i; i < numOfProposals; i++){
        //     timelock.cancel(ids[i]); // 10000 -> 44,890,040  448,900,040 6%
        // }
        // uint256 gasAfterCancel = gasleft();
        // vm.stopPrank();

        // For 100,000 proposals
        // shecdule 7,398,200,040
        // pause guardian pause 340,048,201 ~ 4.6%
        // safe cancel 448,900,040 ~ 6%



        console.log("Gas consumption of schedule: ", gasBeforeSchedule - gasAfterSchedule); // 10000 -> 739,820,040 7,398,200,040
        console.log("Gas consumption of cancel: ", gasBeforeCancel - gasAfterCancel);
    }

Recommended Mitigation Steps

Add epochs to the timelock, each time the contract is paused, move the epoch to the next variable. Also, include epochs in the transaction hashes, and only execute transactions from this epoch. This way, the pause guardian does not need to clear all the transactions one by one, and once the epoch is moved to the next stage, all the previous transactions will be automatically invalidated.

Assessed type

Other

The text was updated successfully, but these errors were encountered:

c4-judge · 2024-10-30T13:34:01Z

GalloDaSballo marked the issue as duplicate of #4

c4-judge · 2024-10-30T13:54:21Z

GalloDaSballo marked the issue as selected for report

GalloDaSballo · 2024-10-30T13:56:07Z

Making it primary because of the POC

c4-judge · 2024-10-30T15:40:17Z

GalloDaSballo changed the severity to 2 (Med Risk)

GalloDaSballo · 2024-10-30T15:46:59Z

I adapted the test, that can be dropped in Timelock.t.sol to verify my statements:

function testGasConsumption() public {

        bytes32 scheduleSalt = bytes32("saltxyz");
        uint256 numOfProposals = 1000;
        bytes32[] memory saltArray = new bytes32[](numOfProposals);

        for(uint i; i < numOfProposals; i++) {
            saltArray[i] = keccak256(abi.encodePacked("salt", bytes32(i + 1)));
        }

        bytes memory scheduleData = abi.encode(timelock.updateDelay, MINIMUM_DELAY);
        address timelockAddress = address(timelock);


        // initial call costs more gas
        vm.prank(address(safe));
        timelock.schedule(
            timelockAddress,
            0,
            scheduleData,
            scheduleSalt,
            MINIMUM_DELAY
        );

        // Schedule until we consume 30 MLN Gas
        vm.startPrank(address(safe));
        uint256 gasBeforeSchedule = gasleft();
        uint256 count;
        while(true) {
            timelock.schedule(
                timelockAddress,
                0,
                scheduleData,
                saltArray[count],
                MINIMUM_DELAY
            );  
            count++; 

            // Stop at 30 MLN gas used
            if(gasBeforeSchedule - gasleft() > 30e6) {
                break;
            } 
        }

        console.log("count", count);

        uint256 gasAfterSchedule = gasleft();
        vm.stopPrank();

        vm.startPrank(timelock.pauseGuardian());
        uint256 gasBeforeCancel = gasleft();
        timelock.pause(); // 10000 -> 32,260,154 4.6%
        uint256 gasAfterCancel = gasleft();
        vm.stopPrank();

        // vm.startPrank(address(safe));
        // uint256 gasBeforeCancel = gasleft();
        // for(uint256 i; i < numOfProposals; i++){
        //     timelock.cancel(ids[i]); // 10000 -> 44,890,040  448,900,040 6%
        // }
        // uint256 gasAfterCancel = gasleft();
        // vm.stopPrank();

        // For 100,000 proposals
        // shecdule 7,398,200,040
        // pause guardian pause 340,048,201 ~ 4.6%
        // safe cancel 448,900,040 ~ 6%



        console.log("Gas consumption of schedule: ", gasBeforeSchedule - gasAfterSchedule); // 10000 -> 739,820,040 7,398,200,040
        console.log("Gas consumption of cancel: ", gasBeforeCancel - gasAfterCancel);
    }

It's worth noting that the POC doesn't work in isolation, leading me to believe that the math given is incorrect

I have ran my POC in both modes, and both versions seems to indicate that the cost to attack is a lot higher than the cost to defend, specifically the attack is 7 times more expensive than defending

I'm not fully confident that Foundry treats the calls as isolated in this way, so I'm happy to be corrected

Result from forge test --match-test testGasConsumption -vv --isolate

Ran 1 test for test/unit/Timelock.t.sol:TimelockUnitTest
[PASS] testGasConsumption() (gas: 33562952)
Logs:
  count 282
  Gas consumption of schedule:  30021964
  Gas consumption of cancel:  4053325

7 times more expensive

Result from forge test --match-test testGasConsumption -vv

Ran 1 test for test/unit/Timelock.t.sol:TimelockUnitTest
[PASS] testGasConsumption() (gas: 25463501)
Logs:
  count 403
  Gas consumption of schedule:  30049168
  Gas consumption of cancel:  1307414

22 times more expensive

Barring a mistake from me, I think the finding is valid and Medium is the most appropriate as the guardian can with some likelihood prevent it as the cost of the attack and the setup is higher than the cost to defend

Also the attack must be done over multiple blocks

GalloDaSballo · 2024-10-30T15:47:49Z

Mitigation would require changing the way initiatives are tracked

By simply shifting a "valid ts" all initiatives created and queued before it can be made invalid, this makes the change a O(1) meaning it should not longer be dossable

ElliotFriedman · 2024-10-30T17:11:22Z

I think that cost on the attack side is likely more expensive than your PoC shows because it just pranks as the safe, and doesn't generate the signatures, have them validated in the gnosis safe + increment the nonce in the gnosis safe + 21k base transaction cost. When you add all of that together, it would have to be at least 30x more expensive to attack than to defend.

ElliotFriedman · 2024-10-30T17:13:12Z

Mitigation is in here: solidity-labs-io/kleidi#53

GalloDaSballo · 2024-10-31T07:56:36Z

I generally agree, when also considering memory expansion costs that should happen when dealing with so many signatures

I think Medium Severity is the most appropriate because the attack is IMO not possible in one block, but to say this could have been prevented would be incorrect

Fundamentally, if the guardian doesn't urgently pause, they may not be able to within a few blocks (stricly more than 1)

Medium seems appropriate given this

gesha17 · 2024-11-05T08:19:16Z

Hello @GalloDaSballo,

I think that cost on the attack side is likely more expensive than your PoC shows because it just pranks as the safe, and doesn't generate the signatures, have them validated in the gnosis safe + increment the nonce in the gnosis safe + 21k base transaction cost. When you add all of that together, it would have to be at least 30x more expensive to attack than to defend.

You will need to generate and validate signatures both for attacking and defending. What you listed out applies to both the attack and defender and as a result the costs for either will be a lot closer indicated here or in any of the PoC shown here. Also, the attacker can use a smart contract to batch transactions and lower his gas costs, which would further bring the gap closer.

Addressing the other comment:

Barring a mistake from me, I think the finding is valid and Medium is the most appropriate as the guardian can with some likelihood prevent it as the cost of the attack and the setup is higher than the cost to defend.

As the sponsor commented on #5:

The desired end user is an individual and not a DAO or protocol team.

You can't expect a normal end user to react that fast to save his wallet. The core promise of Kleidi is that if the signer keys are compromised you can safely recover your wallet. The timelock duration can be set up to 30 days. It is expected that if the user reacts in days or even weeks after the initial attack he can still save his wallet. Meanwhile, as I showed with my calculations on the submission #5, the attacker can bring the gas costs in the millions of dollars in a matter of hours.

This attack will result in a complete loss of funds for the users and should be a high severity issue.

DemoreXTess · 2024-11-05T11:30:40Z

I don't think it can be a high severity issue. The cost of the attack vector is extremely high. No one ( Attacker ) can take a risk and assume the victim's 95% of the funds is in the wallet. If attacker assumes that and take action based on that, it will be a gambling for the attacker. It can be medium severity issue at most.

We can only validate the issue as DoS because pause functionality is impacted here for guardian role.

Med: Assets not at direct risk, but the function of the protocol or its availability could be impacted, or leak value with a hypothetical attack path with stated assumptions, but external requirements.

gesha17 · 2024-11-05T16:00:55Z

I don't think it can be a high severity issue. The cost of the attack vector is extremely high. No one ( Attacker ) can take a risk and assume the victim's 95% of the funds is in the wallet. If attacker assumes that and take action based on that, it will be a gambling for the attacker. It can be medium severity issue at most.

We can only validate the issue as DoS because pause functionality is impacted here for guardian role.

Med: Assets not at direct risk, but the function of the protocol or its availability could be impacted, or leak value with a hypothetical attack path with stated assumptions, but external requirements.

The attack cost will be high only on mainnet. On L2s the attacker can submit more messages than the victim has time to cancel. Note that the current implementation only allows canceling individual entries, so the user will have to cancel transactions individually. And since he is just a normal end user, we can't assume a sophisticated approach in a limited timeframe.

Suppose a scenario where the victim has 24 hours to cancel the malicios transcations before the timelock delay passed and the attacker takes over his wallet, but there are 100 thousand transactions to cancel. There is no way the user can cancel all of them. This would be valid both on mainnet and L2s.

Also, the DoS is really a side effect and doesnt make much of a difference in this scenario. Considering the fix that the developers implemented, the role of pauser guardian becomes somewhat redundant as long as the users have proper configuration of delays. See my comment on #5 for elaboration on that.

DemoreXTess · 2024-11-06T12:59:55Z

Gas price is not important here, there is no difference between L1 and L2 in this attack vector. Defence cost will decrease as attack cost decrease.

Furthermore, in defence part we don't need to race against attacker. We can directly use recovery spell and then we can clear proposals until pause() gas cost reach a healthy level and then pause() can clear all the malicious proposals.

gesha17 · 2024-11-06T14:06:26Z

Gas price is not important here, there is no difference between L1 and L2 in this attack vector. Defence cost will decrease as attack cost decrease.

Furthermore, in defence part we don't need to race against attacker. We can directly use recovery spell and then we can clear proposals until pause() gas cost reach a healthy level and then pause() can clear all the malicious proposals.

I think you are confusing this attack with the simpler DoS vector. The whole point is you cant "clear proposals until pause() gas cost reach a healthy level". That is possible only in the simpler DoS vector if the user configuration is correct, but not here.

DemoreXTess · 2024-11-06T17:13:21Z

I understood your point. I want to express my arguments why it shouldn't be classified different than DoS:

Root cause is OOG and DoS in pause()
Attack needs multiple blocks to cause DoS in pause()
We can't assume guardian is inactive here, because this will be unreasonable to assume it, it makes guardian pointless here.
Let say we assumed guardian was inactive while attack, after recovery spell, user can clear the proposals. You can say it's not possible, I understood your point but again you can't assume user will do that one by one, he can clear by external tools like Multicall3 or he can choose another way to do in batch.

Another note: I believe we can't assume guardian is inactive as I stated in argument 3. So, I added my 4th argument for just clarification.

thebrittfactor · 2024-11-13T15:20:58Z

C4 staff have added the satisfactory label for awarding needs.

howlbot-integration bot added a commit that referenced this issue Oct 27, 2024

Allarious issue #24

c0eb85e

howlbot-integration bot closed this as completed Oct 27, 2024

c4-judge added duplicate-4 and removed duplicate-5 labels Oct 30, 2024

c4-judge removed the duplicate-4 label Oct 30, 2024

c4-judge mentioned this issue Oct 30, 2024

Guardian can be DoSed by submitting many transactions at once #4

Closed

c4-judge reopened this Oct 30, 2024

c4-judge added primary issue Highest quality submission among a set of duplicates selected for report This submission will be included/highlighted in the audit report labels Oct 30, 2024

c4-judge added 2 (Med Risk) Assets not at direct risk, but function/availability of the protocol could be impacted or leak value downgraded by judge Judge downgraded the risk level of this issue and removed 3 (High Risk) Assets can be stolen/lost/compromised directly labels Oct 30, 2024

gesha17 mentioned this issue Nov 5, 2024

Attackers that have compromised signer keys can financially overwhelm the protocol to take it over #5

Closed

C4-Staff added the M-01 label Nov 13, 2024

thebrittfactor added the satisfactory satisfies C4 submission criteria; eligible for awards label Nov 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Gas griefing/attack via creating the proposals #24

Gas griefing/attack via creating the proposals #24

howlbot-integration bot commented Oct 27, 2024

c4-judge commented Oct 30, 2024

c4-judge commented Oct 30, 2024

GalloDaSballo commented Oct 30, 2024

c4-judge commented Oct 30, 2024

GalloDaSballo commented Oct 30, 2024

GalloDaSballo commented Oct 30, 2024

ElliotFriedman commented Oct 30, 2024

ElliotFriedman commented Oct 30, 2024

GalloDaSballo commented Oct 31, 2024

gesha17 commented Nov 5, 2024 •

edited

Loading

DemoreXTess commented Nov 5, 2024

gesha17 commented Nov 5, 2024

DemoreXTess commented Nov 6, 2024

gesha17 commented Nov 6, 2024 •

edited

Loading

DemoreXTess commented Nov 6, 2024

thebrittfactor commented Nov 13, 2024

Gas griefing/attack via creating the proposals #24

Gas griefing/attack via creating the proposals #24

Comments

howlbot-integration bot commented Oct 27, 2024

Lines of code

Vulnerability details

Explanation

Impact

Proof of Concept

Recommended Mitigation Steps

Assessed type

c4-judge commented Oct 30, 2024

c4-judge commented Oct 30, 2024

GalloDaSballo commented Oct 30, 2024

c4-judge commented Oct 30, 2024

GalloDaSballo commented Oct 30, 2024

GalloDaSballo commented Oct 30, 2024

ElliotFriedman commented Oct 30, 2024

ElliotFriedman commented Oct 30, 2024

GalloDaSballo commented Oct 31, 2024

gesha17 commented Nov 5, 2024 • edited Loading

DemoreXTess commented Nov 5, 2024

gesha17 commented Nov 5, 2024

DemoreXTess commented Nov 6, 2024

gesha17 commented Nov 6, 2024 • edited Loading

DemoreXTess commented Nov 6, 2024

thebrittfactor commented Nov 13, 2024

gesha17 commented Nov 5, 2024 •

edited

Loading

gesha17 commented Nov 6, 2024 •

edited

Loading