Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements for TLS with I/O threads #1271

Open
wants to merge 1 commit into
base: unstable
Choose a base branch
from

Conversation

uriyage
Copy link
Contributor

@uriyage uriyage commented Nov 7, 2024

Main thread profiling revealed significant overhead in TLS operations, even with read/write offloaded to I/O threads:

Perf results:

10.82% 8.82% valkey-server libssl.so.3 [.] SSL_pending # Called by main thread after I/O completion

10.16% 5.06% valkey-server libcrypto.so.3 [.] ERR_clear_error # Called for every event regardless of thread handling

This commit further optimizes TLS operations by moving more work from the main thread to I/O threads:

Improve TLS offloading to I/O threads with two main changes:

  1. Move ERR_clear_error() calls closer to SSL operations

    • Currently, error queue is cleared for every TLS event
    • Now only clear before actual SSL function calls
    • This prevents unnecessary clearing in main thread when operations
      are handled by I/O threads
  2. Optimize SSL_pending() checks

    • Add TLS_CONN_FLAG_HAS_PENDING flag to track pending data
    • Move pending check to follow read operations immediately
    • I/O thread sets flag when pending data exists
    • Main thread uses flag to update pending list

Performance improvements:
Testing setup based on https://valkey.io/blog/unlock-one-million-rps-part2/

Before:

  • SET: 896,047 ops/sec
  • GET: 875,794 ops/sec

After:

  • SET: 985,784 ops/sec (+10% improvement)
  • GET: 1,066,171 ops/sec (+22% improvement)

Copy link

codecov bot commented Nov 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.68%. Comparing base (1c18c80) to head (62adf80).
Report is 13 commits behind head on unstable.

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #1271   +/-   ##
=========================================
  Coverage     70.68%   70.68%           
=========================================
  Files           114      114           
  Lines         63150    63150           
=========================================
+ Hits          44637    44639    +2     
+ Misses        18513    18511    -2     
Files with missing lines Coverage Δ
src/tls.c 100.00% <ø> (ø)

... and 13 files with indirect coverage changes

Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I'd like @madolson to take a look too.

Impressing performance result, and this is with more lines deleted than added!

Does the change have any performance impact without I/O threads enabled?

@madolson
Copy link
Member

madolson commented Nov 8, 2024

Will check tomorrow, but @uriyage please fix the DCO when you get a chance.

@ranshid
Copy link
Member

ranshid commented Nov 10, 2024

Looks good to me, but I'd like @madolson to take a look too.

Impressing performance result, and this is with more lines deleted than added!

Does the change have any performance impact without I/O threads enabled?

I would be surprised if there was any performance improvement when io-threads are not used. This will actually cause SSL_pending and ERR_clear_error to be called multiple times in some scenarios where we are reading multiple times on the read handler. BUT I think those cases are rare and AFAIK are mostly limited to cases like the connSyncReadLine.

ERR_clear_error();
int ret = SSL_read(conn->ssl, ptr, size);
ret = updateStateAfterSSLIO(conn, ret, 0);
int ret = connTLSRead(conn_, ptr, size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will cause us to potentially keep updating the event in updateStateAfterSSLIO. Same for the connTLSSyncReadLine. Maybe you want to create an internal function which will also get the update_event indication?

Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than Ran's comment, this looks good to me. Nice perf results!

@madolson madolson added performance release-notes This issue should get a line item in the release notes pending-missing-dco For PRs that are blocked because they are missing a dco labels Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending-missing-dco For PRs that are blocked because they are missing a dco performance release-notes This issue should get a line item in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants