Jagged tensor micro-benchmarks (pytorch#3156)
Summary:
Pull Request resolved: pytorch#3156

X-link: facebookresearch/FBGEMM#250

- Add jagged tensor micro-benchmarks

```
(foo) bash-5.1$ python -W ignore jagged_tensor_benchmark.py device --embedding-dim 512
INFO:root:######## Jagged (2D) to Dense ########
INFO:root:FBGEMM JaggedTensor: 5.746198445558548e-05 sec 438.11657809101143 GB/s
INFO:root:PyTorch NestedTensor: 6.370197981595993e-05 sec 395.1842010676863 GB/s
INFO:root:
INFO:root:######## Dense to Jagged (2D) ########
INFO:root:FBGEMM JaggedTensor: 3.12004815787077e-05 sec 806.880109734599 GB/s
INFO:root:PyTorch NestedTensor: 0.0014418727159500122 sec 17.459249850229323 GB/s
INFO:root:
INFO:root:######## Jagged (x) Dense -> Jagged ########
INFO:root:(+) FBGEMM JaggedTensor: 4.031049832701683e-05 sec 624.9347699856205 GB/s
INFO:root:(+) PyTorch NestedTensor: 0.001540895700454712 sec 16.348564015439923 GB/s
INFO:root:(*) FBGEMM JaggedTensor: 4.03628796339035e-05 sec 624.1237550068162 GB/s
INFO:root:(*) PyTorch NestedTensor: 0.0015746270418167114 sec 15.998348390445281 GB/s
INFO:root:
INFO:root:######## Jagged + Dense + Dense -> Jagged ########
INFO:root:FBGEMM JaggedTensor: 5.2013471722602845e-05 sec 645.7602403302756 GB/s
INFO:root:PyTorch NestedTensor: 0.0028932960033416747 sec 11.608985724656774 GB/s
INFO:root:
INFO:root:######## Jagged (1D) to Dense ########
INFO:root:FBGEMM JaggedTensor: 1.526080071926117e-05 sec 6.511322821651443 GB/s
INFO:root:PyTorch NestedTensor: 3.976528346538544e-05 sec 2.4729108264901147 GB/s
INFO:root:
INFO:root:######## Dense to Jagged (1D) ########
INFO:root:FBGEMM JaggedTensor: 1.5250975266098977e-05 sec 6.51551774665078 GB/s
INFO:root:PyTorch NestedTensor: 0.0014563246965408326 sec 0.06752340342340878 GB/s
INFO:root:
(foo) bash-5.1$
```
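The timings in the log above are easier to compare as ratios. The following is a minimal sketch, assuming the log format shown; `parse` and `speedups` are hypothetical helper names and are not part of `jagged_tensor_benchmark.py`. Only two sections of the log are copied in to keep the example short.

```python
import re

# Two sections copied verbatim from the benchmark log above.
LOG = """\
INFO:root:######## Jagged (2D) to Dense ########
INFO:root:FBGEMM JaggedTensor: 5.746198445558548e-05 sec 438.11657809101143 GB/s
INFO:root:PyTorch NestedTensor: 6.370197981595993e-05 sec 395.1842010676863 GB/s
INFO:root:
INFO:root:######## Dense to Jagged (2D) ########
INFO:root:FBGEMM JaggedTensor: 3.12004815787077e-05 sec 806.880109734599 GB/s
INFO:root:PyTorch NestedTensor: 0.0014418727159500122 sec 17.459249850229323 GB/s
"""

HEADER = re.compile(r"#{8} (.+?) #{8}")
RESULT = re.compile(r"INFO:root:(.*?): (\S+) sec (\S+) GB/s")


def parse(log):
    """Map section title -> {implementation name: (seconds, GB/s)}."""
    results = {}
    section = None
    for line in log.splitlines():
        header = HEADER.search(line)
        if header:
            section = header.group(1)
            results[section] = {}
            continue
        result = RESULT.match(line)
        if result and section is not None:
            impl, sec, gbps = result.groups()
            results[section][impl] = (float(sec), float(gbps))
    return results


def speedups(results):
    """Ratio of NestedTensor time to JaggedTensor time, per section."""
    return {
        section: impls["PyTorch NestedTensor"][0] / impls["FBGEMM JaggedTensor"][0]
        for section, impls in results.items()
    }


ratios = speedups(parse(LOG))
for section, ratio in ratios.items():
    print(f"{section}: {ratio:.1f}x")
```

On these two sections the ratio is close to 1 for Jagged (2D) to Dense and in the mid-40s for Dense to Jagged (2D), which matches the gap visible in the GB/s column.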

Reviewed By: spcyppt

Differential Revision: D59973955

fbshipit-source-id: 758abe0c5bb7fb87e772764e22d3a06e47b0107b
q10 authored and facebook-github-bot committed Sep 20, 2024
1 parent f173613 commit f305522
Showing 2 changed files with 305 additions and 60 deletions.
6 changes: 4 additions & 2 deletions fbgemm_gpu/bench/bench_utils.py
@@ -35,6 +35,8 @@ def benchmark_torch_function(  # noqa: C901
     f,
     # pyre-fixme[2]: Parameter must be annotated.
     args,
+    # pyre-fixme[2]: Parameter must be annotated.
+    kwargs={},
     flush_gpu_cache_size_mb: int = 40,
     iters: int = 10,
     num_warmups: int = 2,
@@ -43,11 +45,11 @@ def benchmark_torch_function(  # noqa: C901
     num_threads: int = 1,
     copy_f_for_multi_thread_test: bool = False,
 ) -> Tuple[float, torch.Tensor]:
-    logging.info(f"Start to benchmark {name}...")
+    logging.debug(f"Start to benchmark {name}...")
     if device != "cpu" and device != "" and device != "cuda":
         torch.cuda.set_device(device)
     for _ in range(num_warmups):
-        output = f(*args)
+        output = f(*args, **kwargs)

     assert num_threads > 0
     if device != "cpu" and torch.cuda.is_available() and (num_threads == 1):
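The change above lets callers pass keyword arguments through to the benchmarked function. The following is a simplified, CPU-only sketch of that forwarding pattern, not the FBGEMM implementation: `time_function` and `scaled_sum` are hypothetical names, and the GPU cache flushing, CUDA events, and multi-threading of `benchmark_torch_function` are omitted.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


def time_function(
    f: Callable[..., Any],
    args: Tuple[Any, ...],
    kwargs: Optional[Dict[str, Any]] = None,
    iters: int = 10,
    num_warmups: int = 2,
) -> Tuple[float, Any]:
    """Hypothetical helper: returns (mean seconds per call, last output)."""
    # Assumption: None is used as the default here, unlike the `kwargs={}`
    # default in the diff, so that no dict is shared between calls.
    kwargs = {} if kwargs is None else kwargs

    output = None
    # Warm-up calls are excluded from the timing.
    for _ in range(num_warmups):
        output = f(*args, **kwargs)

    start = time.perf_counter()
    for _ in range(iters):
        output = f(*args, **kwargs)
    elapsed = (time.perf_counter() - start) / iters
    return elapsed, output


def scaled_sum(values, scale=1.0):
    """Toy workload that takes one positional and one keyword argument."""
    return scale * sum(values)


elapsed, result = time_function(scaled_sum, ([1, 2, 3],), {"scale": 2.0})
print(f"{elapsed:.3e} sec, result={result}")
```

A `{}` default, as in the diff, behaves the same way as long as the function never mutates the dict. The `None` default in this sketch is one common way to avoid relying on that.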