Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(transport/recovery): optimize SentPackets::take_ranges #2245

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

mxinden
Copy link
Collaborator

@mxinden mxinden commented Nov 24, 2024

SentPackets keep track of all inflight packets. On receiving an ACK, the acked
packet ranges are removed via SentPackets:take_ranges.

In normal network scenarios one can assume that the amount of packets before a
range is small (e.g. due to reordering) or zero, while the amount of packets
after a range is large, i.e. all remaining in-flight packets.

The previous implementation assumed the opposite, leading to an overhead linear
to the amount of packets after the range. This commit changes the algorithm such
that it is linear to the amount of packets before the range, which again, is
assumed to be smaller than the amount of packets after the acked range.

Fixes #2242.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this benchmark is approved, I will split out the benchmark into a separate pull request. Merging that second pull request first will give us a baseline (i.e. main branch) before merging this pull request.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request merging benchmark only: #2253

`SentPackets` keep track of all inflight packets. On receiving an ACK, the acked
packet ranges are removed via `SentPackets:take_ranges`.

In normal network scenarios one can assume that the amount of packets before a
range is small (e.g. due to reordering) or zero, while the amount of packets
after a range is large, i.e. all remaining in-flight packets.

The previous implementation assumed the opposite, leading to an overhead linear
to the amount of packets after the range. This commit changes the algorithm such
that it is linear to the amount of packets before the range, which again, is
assumed to be smaller than the amount of packets after the acked range.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running the benchmark above, the patch below shows a 38x improvement over current main.

$ critcmp main patch                                 
group                       main                                   patch
-----                       ----                                   -----
SentPackets::take_ranges    38.46  156.8±13.11µs        ? ?/sec    1.00      4.1±1.20µs        ? ?/sec
$ cat /proc/cpuinfo                                                                                          
model name      : 12th Gen Intel(R) Core(TM) i9-12900HK

Copy link

codecov bot commented Nov 24, 2024

Codecov Report

Attention: Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.

Project coverage is 95.37%. Comparing base (f3d0191) to head (d7454d0).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
neqo-transport/src/recovery/sent.rs 94.11% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2245      +/-   ##
==========================================
- Coverage   95.37%   95.37%   -0.01%     
==========================================
  Files         112      112              
  Lines       36569    36593      +24     
==========================================
+ Hits        34879    34900      +21     
- Misses       1690     1693       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

Copy link

github-actions bot commented Nov 24, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

neqo-transport/benches/sent_packets.rs Show resolved Hide resolved
neqo-transport/src/recovery/sent.rs Outdated Show resolved Hide resolved
Comment on lines 215 to 223
// According to RFC 9000 19.3.1 ACK ranges are in descending order:
//
// > Each ACK Range consists of alternating Gap and ACK Range Length
// > values in **descending packet number order**.
//
// <https://www.rfc-editor.org/rfc/rfc9000.html#section-19.3.1>
//
// Thus none of the following ACK ranges will acknowledge packets in
// `after_acked_range`. Let's put those back early.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you are relying on this, you should therefore make some assertions to guarantee that it is true. debug_assert!() is fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Done in d7454d0.

}

// Put remaining non-acked packets back.
self.packets.extend(packets);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be changed to the following as well?

            if self.packets.is_empty() {
                self.packets = after_acked_range;
            } else {
                self.packets.extend(after_acked_range);
            }

If the first range is contiguous with the end, then this might be faster. It's not clear to me whether the additional condition is worth it though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you think of a scenario where:

  1. the range is contiguous with the end
  2. AND the number of packets before the range (here packets) is high?

I can only think of the following contrived scenario. Say that we have 1_000 packets in flight. Say that we receive an ACK range for packet 990 - 1_000, thus packets would be very large and self.packets.extend(packets) costly. But I would argue that in such case, self.packets.extend(packets) is the least important problem we have to worry about.

The above aside, I assume packets will always be small, thus self.packets = packets would not be much faster than self.packets.extend(packets), and thus I don't think the above conditional on self.packets.is_empty() is worth it.

Agreed @martinthomson?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's worth noting. A brief note to that effect might be good.

// <https://www.rfc-editor.org/rfc/rfc9000.html#section-19.3.1>
if let Some(prev_start) = previous_range_start {
debug_assert!(
*range.end() < prev_start,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that ACK ranges are encoded on the wire relative to the previous range they can not overlap.

Given that range is inclusive, its length can not be zero.

Thus we need a < and not <=.

Please correct me if I am missing something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would debug_assert!(prev_start.is_none_or(|s| s > *range.end())) work better?

I don't think that you need a big print statement: it will never happen, except in a test.

Copy link
Member

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Land the other first, but this is good to go.

// <https://www.rfc-editor.org/rfc/rfc9000.html#section-19.3.1>
if let Some(prev_start) = previous_range_start {
debug_assert!(
*range.end() < prev_start,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would debug_assert!(prev_start.is_none_or(|s| s > *range.end())) work better?

I don't think that you need a big print statement: it will never happen, except in a test.

}

// Put remaining non-acked packets back.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Put remaining non-acked packets back.
// Put remaining non-acked packets back.
// This is inefficient if there the acknowledged packets include the last
// sent packet AND there is a large unacknowledged span of packets.
// That's rare enough that we won't do anything special for that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize SentPackets::take_ranges
2 participants