
Torch timer calibration fails with large min_duration values. #526

Closed
EchoStone1101 opened this issue Dec 1, 2024 · 1 comment

@EchoStone1101
Setting a large min_duration value (say, 100us) together with log_torch=True triggers a "Torch timer calibration failed" error:

Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from viztracer import VizTracer
>>> t = VizTracer(min_duration=100, log_torch=True)
STAGE:2024-12-01 10:41:10 307114:307114 ActivityProfilerController.cpp:314] Completed Stage: Warm Up
STAGE:2024-12-01 10:41:10 307114:307114 ActivityProfilerController.cpp:320] Completed Stage: Collection
STAGE:2024-12-01 10:41:10 307114:307114 ActivityProfilerController.cpp:324] Completed Stage: Post Processing
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/python_venv/vllm/lib/python3.10/site-packages/viztracer/viztracer.py", line 96, in __init__
    self.calibrate_torch_timer()
  File "/home/python_venv/vllm/lib/python3.10/site-packages/viztracer/viztracer.py", line 227, in calibrate_torch_timer
    raise RuntimeError("Torch timer calibration failed")  # pragma: no cover
RuntimeError: Torch timer calibration failed

I'm quite sure this happens because the calibration mechanism uses lightweight torch operations, which get filtered out when a non-default min_duration is set:

def calibrate_torch_timer(self, force=False):
    if self.enable:
        raise RuntimeError("You can't calibrate torch timer while tracer is running")
    if self.torch_offset and not force:
        return
    import json
    import tempfile
    import torch  # type: ignore
    from torch.profiler import profile, supported_activities  # type: ignore
    verbose = self.verbose
    # Silent the tracer during calibration
    self.verbose = 0
    with profile(activities=supported_activities()) as prof:
        _VizTracer.start(self)
        for _ in range(20):
            torch.empty(100) # <================ This is filtered
        _VizTracer.stop(self)

A simple fix would be to temporarily set min_duration to zero during calibration.
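To illustrate the suggested fix, here is a minimal sketch of how min_duration could be temporarily zeroed during calibration and restored afterwards. This is not the actual viztracer implementation; FakeTracer and zero_min_duration are hypothetical stand-ins for the VizTracer instance and the proposed workaround:

```python
from contextlib import contextmanager


@contextmanager
def zero_min_duration(tracer):
    """Temporarily set min_duration to 0 so the lightweight calibration
    ops are not filtered out, then restore the user's original value."""
    saved = tracer.min_duration
    tracer.min_duration = 0
    try:
        yield tracer
    finally:
        tracer.min_duration = saved


class FakeTracer:
    """Hypothetical stand-in for a VizTracer instance."""
    def __init__(self, min_duration):
        self.min_duration = min_duration


tracer = FakeTracer(min_duration=100)  # 100us, large enough to filter calibration ops
with zero_min_duration(tracer):
    # calibrate_torch_timer() would run here with filtering disabled
    assert tracer.min_duration == 0
assert tracer.min_duration == 100  # original setting restored even on error
```

The try/finally ensures the user's min_duration survives even if calibration raises.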

@gaogaotiantian
Owner

Upgrade to v1.0.0 :)

@gaogaotiantian closed this as not planned on Dec 19, 2024