Show tasks duration and resource usage metrics in test_cluster_performance output #5390

Open

Selutario opened this issue May 15, 2024 · 1 comment

@Selutario
Contributor

Description

We need to modify test_cluster_performance.

It fails when any of the cluster stats (task duration or resource usage) exceeds a predefined threshold. However, it would be helpful to review what those stats actually are even when the test passes, so that slight increases in any of the metrics can be detected.

To make this easier, the test should print (and include in the report) the detailed metrics that it uses internally. For example:

>>> from wazuh_testing.tools.performance.csv_parser import ClusterCSVTasksParser
>>> ClusterCSVTasksParser('/home/selu/Descargas/cluster_performance/517/artifacts_480_rc1').get_stats()
{
    "setup_phase": {
        "integrity_check": {
            "time_spent(s)": {
                "workers": {
                    "mean": ("worker_17", 0.3481111111111111),
                    "max": ("worker_14", 3.176),
                },
                "master": {
                    "mean": ("master", 0.05240245824141191),
                    "max": ("master", 0.709),
                },
            }
        },
        "integrity_sync": {
            "time_spent(s)": {
                "workers": {
                    "mean": ("worker_8", 0.04211764705882353),
                    "max": ("worker_23", 0.163),
                },
                "master": {
                    "mean": ("master", 0.5421203007518796),
                    "max": ("master", 3.217),
                },
            }
        },
        "agent-info_sync": {
            "time_spent(s)": {
                "workers": {
                    "mean": ("worker_18", 0.9509827586206897),
                    "max": ("worker_9", 10.639),
                },
                "master": {
                    "mean": ("master", 0.687005693950178),
                    "max": ("master", 10.257),
                },
            }
        },
    },
    "stable_phase": {
        "integrity_check": {
            "time_spent(s)": {
                "workers": {
                    "mean": ("worker_3", 0.01140740740740741),
                    "max": ("worker_3", 0.04),
                },
                "master": {
                    "mean": ("master", 0.00456888888888889),
                    "max": ("master", 0.017),
                },
            }
        },
        "agent-info_sync": {
            "time_spent(s)": {
                "workers": {"mean": ("worker_18", 0.00964), "max": ("worker_18", 0.025)}
            }
        },
    },
}
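
As a hedged sketch of the idea (the helper name and logger usage here are illustrative, not the actual test code), the test could dump the parsed stats into the log so they always appear in the report:

    import json
    import logging

    from wazuh_testing.tools.performance.csv_parser import ClusterCSVTasksParser

    logger = logging.getLogger(__name__)


    def log_cluster_task_stats(artifacts_path):
        # Parse the cluster CSV artifacts and log every metric, even when
        # no threshold is exceeded, so slight increases remain visible.
        stats = ClusterCSVTasksParser(artifacts_path).get_stats()
        # json serializes the (node, value) tuples above as two-element arrays.
        logger.info("Cluster task stats:\n%s", json.dumps(stats, indent=4))
        return stats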
@Selutario
Contributor Author

We should also decrease these task thresholds:

setup_phase:
  agent-info_sync:
    time_spent(s):
      master:
        max: 31
        mean: 3.1
      workers:
        max: 50
        mean: 8
  integrity_check:
    time_spent(s):
      master:
        max: 50
        mean: 8.3
      workers:
        max: 55
        mean: 13.5
  integrity_sync:
    time_spent(s):
      master:
        max: 54
        mean: 11
      workers:
        max: 22
        mean: 3.2
stable_phase:
  agent-info_sync:
    time_spent(s):
      master:
        max: 5
        mean: 1
      workers:
        max: 8.5
        mean: 3.3
  integrity_check:
    time_spent(s):
      master:
        max: 6.5
        mean: 3
      workers:
        max: 10
        mean: 6

Artifacts like the ones attached here should make the test fail. A sketch of that comparison follows.
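
For illustration only, this is a minimal sketch of how the threshold check could work, assuming the thresholds above are loaded into a nested dict that mirrors the structure returned by get_stats() (the function name is hypothetical):

    def find_threshold_violations(stats, thresholds):
        # Both dicts share the same nested layout:
        # phase -> task -> column -> node_type -> {mean, max}.
        violations = []
        for phase, tasks in stats.items():
            for task, columns in tasks.items():
                for column, node_types in columns.items():
                    for node_type, metrics in node_types.items():
                        for metric, (node, value) in metrics.items():
                            limit = thresholds[phase][task][column][node_type][metric]
                            if value > limit:
                                violations.append(
                                    f"{phase}/{task}/{column}/{node_type}/"
                                    f"{metric}: {value} ({node}) > {limit}"
                                )
        return violations

Returning every violation at once, instead of failing on the first one, would also make the report show the full picture in a single run.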
