i#5843 scheduler: Only consider long-latency syscalls blocking #6458

Merged
derekbruening merged 4 commits into master from i5843-sched-sys-latency
Nov 16, 2023

Conversation

derekbruening
Contributor

@derekbruening derekbruening commented Nov 16, 2023

Rather than context switching on every syscall labeled maybe-blocking, the scheduler uses the now-available syscall latency to decide whether the syscall should block and result in a context switch.

Adds two new command line options to control the latency thresholds, along with corresponding scheduler_t inputs: -sched_syscall_switch_us (default 500us), the minimum latency for any syscall to be considered blocking, and -sched_blocking_switch_us (default 100us), the minimum latency for a syscall labeled maybe-blocking. To avoid relying too heavily on the maybe-blocking labels, a very-high-latency syscall that is not marked maybe-blocking is still considered to block.
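The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scheduler's actual code: the struct and function names are hypothetical, the mapping of the 100us threshold to maybe-blocking syscalls and the 500us threshold to all other syscalls is inferred from the option names, and the use of >= for the comparison is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two thresholds. The default values mirror
// the option defaults stated in the PR description.
struct switch_thresholds_t {
    uint64_t syscall_switch_us = 500;  // Assumed to apply to any syscall.
    uint64_t blocking_switch_us = 100; // Assumed to apply to maybe-blocking syscalls.
};

// Hypothetical helper: returns whether a syscall with the observed latency
// should be treated as blocking and thus trigger a context switch.
bool
syscall_incurs_switch(const switch_thresholds_t &thresholds, bool maybe_blocking,
                      uint64_t latency_us)
{
    // A maybe-blocking label lowers the bar, but an unlabeled syscall can
    // still be treated as blocking if its latency is high enough.
    const uint64_t threshold = maybe_blocking ? thresholds.blocking_switch_us
                                              : thresholds.syscall_switch_us;
    // Assumption: a latency exactly at the threshold counts as blocking.
    return latency_us >= threshold;
}
```

Under these assumptions, a 300us syscall triggers a switch only if it is labeled maybe-blocking, while a 500us syscall triggers one regardless of its label.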

Adds a new schedule_stats unit test.

Tested on a large proprietary app, where this reduces the context switch rate from ~100x too high to ~10x too high. The next step, adding I/O wait times, should further improve representativeness.

Issue: #5843

@derekbruening
Contributor Author

The ub22 failure is off.simple #6416.

@derekbruening
Contributor Author

The win64 failures are the known flakes traceopts #6423 and replaceall #5412.

Review threads (all resolved):
clients/drcachesim/CMakeLists.txt
clients/drcachesim/common/options.cpp (outdated)
clients/drcachesim/scheduler/scheduler.cpp (3 threads, outdated)
clients/drcachesim/tools/schedule_stats.cpp (3 threads, outdated)
clients/drcachesim/scheduler/scheduler.h (outdated)
@derekbruening
Contributor Author

The 2 failures are the AMD #6416 and #6417

@derekbruening derekbruening merged commit 5990405 into master Nov 16, 2023
13 of 15 checks passed
@derekbruening derekbruening deleted the i5843-sched-sys-latency branch November 16, 2023 21:17
derekbruening added a commit that referenced this pull request Nov 17, 2023
Fixes a < assert from PR #6458 to be <=, to allow the pre-syscall
timestamp to equal the post-syscall timestamp.

Adds a test that fails without the fix.

Issue: #5843
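
The relaxed check in the follow-up commit can be illustrated with a minimal sketch. The helper name and its parameters are hypothetical and are not taken from the scheduler source; the sketch only shows why the comparison has to be <= rather than <.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes syscall latency from the timestamps that
// bracket the syscall in the trace.
uint64_t
syscall_latency_us(uint64_t pre_syscall_us, uint64_t post_syscall_us)
{
    // Using <= rather than <: the two timestamps may be equal, which is a
    // legal trace and simply yields a latency of zero.
    assert(pre_syscall_us <= post_syscall_us);
    return post_syscall_us - pre_syscall_us;
}
```

With a strict < check, a trace whose pre-syscall and post-syscall timestamps are identical would trip the assert even though it describes a valid zero-latency syscall.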