multithreading feature performance issues #1352

Open
Vortetty opened this issue Sep 19, 2024 · 9 comments
@Vortetty

Describe the bug
The issue is present on CachyOS, Ubuntu, and Gentoo systems. It is untested on Windows and macOS.
In testing, for short-running programs that call sysinfo, the rayon overhead appears to double or triple the program's runtime. This is undocumented, as is the multithreading option itself. All testing was done in release mode on version 0.31.4 with nothing except a terminal running, and benchmarked using hyperfine.

multithreading Disabled:

Benchmark 1: target/release/yatfpbnws
  Time (mean ± σ):      52.3 ms ±   2.6 ms    [User: 12.5 ms, System: 39.2 ms]
  Range (min … max):    49.3 ms …  60.9 ms    100 runs

multithreading Enabled:

Benchmark 1: target/release/yatfpbnws
  Time (mean ± σ):     226.2 ms ±  20.8 ms    [User: 35.6 ms, System: 147.3 ms]
  Range (min … max):   179.4 ms … 286.4 ms    100 runs

To Reproduce
https://github.com/Vortetty/YATFPBNWS/tree/master
Changing sysinfo to use default features results in a 4x runtime increase in the above example on my main system (9900X, 32 GB RAM).
The example only runs on Linux systems, and may not work on all of them since it is still being worked on, but it works fine on my 3 test systems.

Vortetty added the bug label Sep 19, 2024
@GuillaumeGomez
Owner

That is pretty bad indeed. Now I need some extra info: which API are you using and with which arguments? The idea would be to only trigger multi-threading above a given threshold.
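
A minimal sketch of the threshold idea, using only the Rust standard library. It does not reflect how sysinfo or rayon are actually implemented: `PARALLEL_THRESHOLD`, `process_item`, and `process_all` are hypothetical names, and the cutoff value is an arbitrary placeholder that would need to be chosen by measurement.

```rust
use std::thread;

/// Hypothetical cutoff: below this many items, stay on the calling thread.
const PARALLEL_THRESHOLD: usize = 1_000;

/// Stand-in for per-item work (for example, reading one process entry).
fn process_item(x: u64) -> u64 {
    x.wrapping_mul(2).wrapping_add(1)
}

/// Maps `process_item` over `items`, using threads only for large inputs.
fn process_all(items: &[u64], threshold: usize) -> Vec<u64> {
    if items.len() < threshold {
        // Small input: thread start-up would likely cost more than it saves.
        return items.iter().map(|&x| process_item(x)).collect();
    }
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    // Ceiling division, with a floor of 1 so `chunks` never receives 0.
    let chunk_len = ((items.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(|&x| process_item(x)).collect::<Vec<u64>>())
            })
            .collect();
        // Joining in spawn order keeps the output in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let small: Vec<u64> = (0..10).collect();
    let large: Vec<u64> = (0..10_000).collect();
    let a = process_all(&small, PARALLEL_THRESHOLD);
    let b = process_all(&large, PARALLEL_THRESHOLD);
    println!("processed {} items sequentially, {} in parallel", a.len(), b.len());
}
```

Both paths return the same result in the same order, so a threshold of this kind would only change when the thread cost is paid, not what the caller sees.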

@Vortetty
Author

Vortetty commented Sep 20, 2024

Total counts of all calls made are:

  • 3 calls each to:
    • sys.process()
    • Pid::from_u32()
  • 2 calls to:
    • System::host_name()
  • 1 call each to:
    • System::distribution_id()
    • System::cpu_arch()
    • System::kernel_version()
    • System::uptime()
    • let users = Users::new_with_refreshed_list();
    • let sys = System::new_with_specifics(RefreshKind::new().with_processes(ProcessRefreshKind::everything()));
    • users.get_user_by_id()
    • Process.user_id()
    • users.first()

Edit: added a few calls I had missed.

@Vortetty
Author

One thing I noticed while testing last night: on my weaker laptop, the time difference is a lot less drastic.
The beefier the system, the more drastic the time difference. Perhaps the thread creation overhead takes longer than all the syscalls combined, unless you are doing full refreshes on slow systems.
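
One rough way to probe the thread-creation hypothesis is to time how long it takes to spawn and join one idle thread per logical core. This is a standard-library sketch only: it measures raw `std::thread` cost, not rayon's pool initialization, and `spawn_join_cost` is a hypothetical helper. Since rayon by default sizes its pool to the number of logical CPUs, a machine with more cores would start more threads, which is at least consistent with the observation above.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Spawns and joins `n` threads that do no work; returns the elapsed wall time.
fn spawn_join_cost(n: usize) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(|| ())).collect();
    for h in handles {
        h.join().expect("thread panicked");
    }
    start.elapsed()
}

fn main() {
    // One thread per logical core, falling back to 1 if the count is unknown.
    let n = thread::available_parallelism().map(|p| p.get()).unwrap_or(1);
    let cost = spawn_join_cost(n);
    println!("spawning and joining {n} idle threads took {cost:?}");
}
```

Comparing this figure across machines against the extra system time hyperfine reports would show whether thread start-up alone can plausibly account for the gap.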

@GuillaumeGomez
Owner

The only API here using multi-threading is System::new_with_specifics(RefreshKind::new().with_processes(ProcessRefreshKind::everything())). One question about your code: are you creating System more than once?

@Vortetty
Author

Vortetty commented Sep 20, 2024

I am not. It's only instantiated once in the whole program, and only refreshed on creation.

@GuillaumeGomez
Owner

I'm starting to be out of ideas. 😅

Is the total runtime of your program faster or slower with multithreading enabled? It's normal that it uses more resources and system time but overall, it should still run faster (hopefully).

@Vortetty
Author

Vortetty commented Sep 20, 2024

It's slower with multithreading (as seen in the benchmarks above).
Overall, disabling it saves about 150 ms of runtime on my main rig. I can test on a second, more average PC in a bit to make sure that holds up. The benchmarks above also show that, with multithreading on, roughly 20 ms extra is spent in user mode (likely rayon overhead) and roughly 110 ms extra in system time (likely threads, and waiting for their creation and destruction).
In longer-running programs it would definitely be less noticeable, especially if the calls were made very often, since rayon uses work stealing and avoids re-initializing things every time. But that would need a very specific call order, and calls that work stealing can handle effectively, which may not be the case for simple system calls.
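
The breakdown can be checked directly against the hyperfine means from the issue description. The sketch below only does the subtraction; `Timing` and `delta` are hypothetical helper names. The means differ by about 174 ms of wall time, somewhat more than the rounded 150 ms figure, with about 23 ms extra user time and about 108 ms extra system time.

```rust
/// One hyperfine summary line: mean wall, user, and system time in milliseconds.
#[derive(Clone, Copy)]
struct Timing {
    wall_ms: f64,
    user_ms: f64,
    system_ms: f64,
}

/// Per-field difference `b - a`, in milliseconds.
fn delta(a: Timing, b: Timing) -> Timing {
    Timing {
        wall_ms: b.wall_ms - a.wall_ms,
        user_ms: b.user_ms - a.user_ms,
        system_ms: b.system_ms - a.system_ms,
    }
}

fn main() {
    // Means copied from the two hyperfine runs in the issue description.
    let disabled = Timing { wall_ms: 52.3, user_ms: 12.5, system_ms: 39.2 };
    let enabled = Timing { wall_ms: 226.2, user_ms: 35.6, system_ms: 147.3 };
    let d = delta(disabled, enabled);
    println!("extra wall time:   {:.1} ms", d.wall_ms);
    println!("extra user time:   {:.1} ms", d.user_ms);
    println!("extra system time: {:.1} ms", d.system_ms);
}
```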

@Vortetty
Author

Vortetty commented Sep 22, 2024

Running the same test on an HP laptop with a Ryzen 5300U and Crystal Linux, it takes 173.7 ms with threading and 69.6 ms without (both averaged over 100 runs). So the threading slowdown can be reproduced even on low-power systems.

@Vortetty
Author

Vortetty commented Sep 23, 2024

Windows with multithread: 154.9 ms
Windows without multithread: 162.8 ms

This seems to be a Linux-specific issue. All that is needed for the test is:

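Cargo.toml:

[package]
name = "multithreadtest"
version = "0.1.0"
edition = "2021"

[dependencies]
# Default features are disabled here, which leaves multithreading off; remove default-features=false to turn it back on.
sysinfo = {version="0.31.4", default-features=false, features=["component", "disk", "network", "system", "user"]}

src/main.rs:

use sysinfo::{ProcessRefreshKind, RefreshKind, System};

fn main() {
    // Refresh every process field once, at construction time.
    let sys = System::new_with_specifics(RefreshKind::new().with_processes(ProcessRefreshKind::everything()));
    println!("{:?}", sys);
}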

On Android through Termux it does not seem to matter which is used, though that may change if it is used in an app.
Could anyone with a Mac test this as well?
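
To put the reported numbers side by side, the sketch below computes the ratio of mean runtime with multithreading to mean runtime without it for each system mentioned in this thread. The figures are copied from the comments above, and `slowdown` is a hypothetical helper. A ratio above 1.0 means multithreading made the program slower: both Linux machines land well above 1.0, while the Windows run is slightly below it.

```rust
/// Ratio of mean runtime with multithreading to mean runtime without it.
/// Values above 1.0 mean multithreading made the program slower.
fn slowdown(with_ms: f64, without_ms: f64) -> f64 {
    with_ms / without_ms
}

fn main() {
    // (system, mean with multithreading, mean without), as reported in this thread.
    let reports = [
        ("Linux desktop (9900X)", 226.2, 52.3),
        ("Linux laptop (Ryzen 5300U)", 173.7, 69.6),
        ("Windows", 154.9, 162.8),
    ];
    for (name, with_ms, without_ms) in reports {
        println!("{name:<28} {:.2}x", slowdown(with_ms, without_ms));
    }
}
```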
