TensorFlow profiler running into OOM issue on GPU #655

rahul-fnu · 2023-08-10T04:08:10Z

Running the TensorFlow profiler for longer than a 10-second period results in an OOM error that crashes the inference process, and the profiler returns DEADLINE_EXCEEDED. Is there any way to limit the sampling rate, or to reduce the amount of information being collected, to avoid crashing the process?

Here is the code that I run:
tensorflow_profiler.experimental.client("grpc://localhost:3222", "profiles", 30000)

ndeepesh · 2023-08-11T18:30:16Z

Hi TensorFlow team,

Can you help us with the above? Is there a way to sample TensorFlow profiling on GPUs? This is blocking us from collecting any traces longer than 10 s.
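
One possible workaround, offered only as a sketch: since traces up to about 10 s reportedly succeed, a long window could be approximated by several short captures taken back to back. The helper names `split_duration` and `capture_in_chunks`, the 5000 ms chunk size, and the one-second pause are all assumptions made for illustration. The only TensorFlow call used is `tf.profiler.experimental.client.trace`, which to my knowledge blocks until the capture finishes. Whether this actually avoids the OOM in this setup is untested.

```python
import time


def split_duration(total_ms, chunk_ms):
    """Split a total capture window into chunk durations (all in milliseconds)."""
    if total_ms <= 0 or chunk_ms <= 0:
        raise ValueError("total_ms and chunk_ms must be positive")
    full_chunks, remainder = divmod(total_ms, chunk_ms)
    # Full-size chunks first, then one shorter chunk for any leftover time.
    return [chunk_ms] * full_chunks + ([remainder] if remainder else [])


def capture_in_chunks(service_addr, logdir, total_ms, chunk_ms=5000, pause_s=1.0):
    """Take several short traces instead of one long one (hypothetical helper)."""
    # Imported lazily so split_duration can be used without TensorFlow installed.
    import tensorflow as tf

    for index, duration_ms in enumerate(split_duration(total_ms, chunk_ms)):
        # Each chunk goes to its own subdirectory so runs do not overwrite each other.
        chunk_logdir = f"{logdir}/chunk_{index:03d}"
        tf.profiler.experimental.client.trace(service_addr, chunk_logdir, duration_ms)
        # Brief pause between captures; the value is a guess, not a documented requirement.
        time.sleep(pause_s)
```

With the address from the original report, `capture_in_chunks("grpc://localhost:3222", "profiles", 30000)` would request six 5-second traces. The chunks are separate profiles, so activity during the pauses between them is not recorded.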

Rahulraj0308 · 2024-02-07T18:30:18Z

@rahul-fnu To limit the sampling rate or reduce the amount of information collected by the TensorFlow profiler, you can adjust the sampling_rate parameter in the tensorflow_profiler.experimental.client function.
Use: tensorflow_profiler.experimental.client("grpc://localhost:3222", "profiles", 30000, sampling_rate=0.5, events=["compute"])
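
As far as I can tell, the documented signature of `tf.profiler.experimental.client.trace` does not list `sampling_rate` or `events` keywords, so the call above may not work as written. The documented way to reduce what is collected appears to be `tf.profiler.experimental.ProfilerOptions`, passed through the `options` argument. Below is a minimal sketch under that assumption. The helper names `reduced_trace_options` and `capture` are hypothetical, and the chosen levels are illustrative rather than a verified fix for the OOM.

```python
def reduced_trace_options():
    """Keyword arguments for ProfilerOptions that aim to lower trace volume."""
    return {
        "host_tracer_level": 1,    # 1 = critical host events only (documented default is 2)
        "python_tracer_level": 0,  # 0 = Python function-call tracing disabled
        "device_tracer_level": 1,  # 1 = GPU/TPU tracing enabled; 0 would disable it
    }


def capture(service_addr, logdir, duration_ms):
    """Request one trace from a running profiler server with reduced options."""
    # Imported lazily so reduced_trace_options has no TensorFlow dependency.
    import tensorflow as tf

    options = tf.profiler.experimental.ProfilerOptions(**reduced_trace_options())
    tf.profiler.experimental.client.trace(
        service_addr, logdir, duration_ms, options=options
    )
```

Calling `capture("grpc://localhost:3222", "profiles", 30000)` mirrors the original request. These options mainly trim host-side events. Setting `device_tracer_level` to 0 would likely shrink the trace much further, but it would also drop the GPU data this issue is about.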
