Useful PAPI Events for Profiling Mini Apps with Traces

Jeffrey Young edited this page Apr 4, 2019 · 6 revisions

Last updated: 4/3/2019

To use PAPI with Spatter

Check out the "papi" branch and build with the configure script "configure_omp_intel_papi". Then copy papi_config.txt from util into your build directory and edit it: the first line is the number of events (up to 4), followed by one line per event giving the event name and its hex code. After you run Spatter, papi_output.txt contains the counter values measured by PAPI.
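The config file layout described above can also be generated rather than edited by hand. The following is a minimal sketch, not part of Spatter: the helper name `format_papi_config` is hypothetical, and the layout (a count line followed by `NAME 0xCODE` lines) is assumed from the example papi_config.txt on this page.

```python
from typing import List, Tuple

MAX_EVENTS = 4  # limit stated on this page ("up to 4")


def format_papi_config(events: List[Tuple[str, int]]) -> str:
    """Return papi_config.txt contents: a count line, then one 'NAME 0xCODE' line per event.

    Hypothetical helper; the layout is assumed from the example file on this page.
    """
    if not 1 <= len(events) <= MAX_EVENTS:
        raise ValueError(f"expected 1 to {MAX_EVENTS} events, got {len(events)}")
    lines = [str(len(events))]
    for name, code in events:
        # Codes are written as 8-digit lowercase hex, matching the example file.
        lines.append(f"{name} 0x{code:08x}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same three events as the example session on this page.
    events = [
        ("PAPI_L1_DCM", 0x80000000),
        ("PAPI_L2_DCM", 0x80000002),
        ("PAPI_L3_TCM", 0x80000008),
    ]
    print(format_papi_config(events), end="")
```

Redirecting the printed text to papi_config.txt in the build directory should give a file equivalent to the hand-edited one, assuming Spatter reads exactly this layout.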

#Copy over and edit the config file from util
[build_omp_intel_papi]$ more papi_config.txt
3
PAPI_L1_DCM 0x80000000
PAPI_L2_DCM 0x80000002
PAPI_L3_TCM 0x80000008
#Run spatter as usual
[build_omp_intel_papi]$ ./spatter -s 1 -l 10000 --runs 3
Warning: No backend specified, guessing OpenMP
Warning: Kernel unspecified, guess GATHER
Warning: Kernel file unspecified, guessing kernels/kernels_vector.cl
3 events
PAPI_L1_DCM: 80000000
PAPI_L2_DCM: 80000002
PAPI_L3_TCM: 80000008
backend kernel op time source_size target_size idx_size bytes_moved usable_bandwidth actual_bandwidth omp_threads vector_len block_dim
OPENMP GATHER COPY 0.000058 80000 80000 80000 160000 2751.788663 4127.682994 24 1 1 0
OPENMP GATHER COPY 0.000070 80000 80000 80000 160000 2288.591372 3432.887058 24 1 1 0
OPENMP GATHER COPY 0.000058 80000 80000 80000 160000 2760.715025 4141.072538 24 1 1 0
#Check the output file for results
[build_omp_intel_papi]$ more papi_output.txt
767 174 118
822 199 284
532 65 50
505 114 91
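Because papi_output.txt holds bare numbers, it can help to attach the event names from papi_config.txt when post-processing. The sketch below assumes that each output line lists one value per configured event, in the same order as the config file; that reading is inferred from the session above and is not confirmed by the Spatter source. The function names are hypothetical.

```python
from typing import Dict, List


def read_event_names(config_text: str) -> List[str]:
    """Extract event names from papi_config.txt text (count line, then 'NAME CODE' lines)."""
    lines = [ln.split() for ln in config_text.splitlines() if ln.strip()]
    count = int(lines[0][0])
    return [fields[0] for fields in lines[1:1 + count]]


def label_counts(config_text: str, output_text: str) -> List[Dict[str, int]]:
    """Pair each row of papi_output.txt with the event names, in config order (assumed)."""
    names = read_event_names(config_text)
    rows = []
    for line in output_text.splitlines():
        values = [int(v) for v in line.split()]
        if not values:
            continue  # skip blank lines
        if len(values) != len(names):
            raise ValueError(f"expected {len(names)} values, got {len(values)}: {line!r}")
        rows.append(dict(zip(names, values)))
    return rows


if __name__ == "__main__":
    # File contents copied from the session above; in practice, read the two files instead.
    config_text = "3\nPAPI_L1_DCM 0x80000000\nPAPI_L2_DCM 0x80000002\nPAPI_L3_TCM 0x80000008\n"
    output_text = "767 174 118\n822 199 284\n532 65 50\n505 114 91\n"
    for row in label_counts(config_text, output_text):
        print(row)
```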

PAPI Events

The preset events can be found here and formatted here.
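The hex codes used in papi_config.txt follow a simple pattern: PAPI marks preset events by setting the high bit (0x80000000) and stores the preset's index in the remaining bits. The sketch below only decomposes the three codes that appear on this page; the helper names are hypothetical, and the authoritative name-to-code mapping for a given installation should come from PAPI itself, for example the `papi_avail` utility.

```python
PRESET_MASK = 0x80000000  # high bit that marks a PAPI preset event
INDEX_MASK = 0x7FFFFFFF   # remaining bits hold the preset index


def preset_index(code: int) -> int:
    """Return the preset index encoded in a PAPI preset event code."""
    if not code & PRESET_MASK:
        raise ValueError(f"0x{code:08x} is not a preset event code")
    return code & INDEX_MASK


def preset_code(index: int) -> int:
    """Return the preset event code for a given preset index."""
    return PRESET_MASK | index


if __name__ == "__main__":
    # Codes copied from the example papi_config.txt on this page.
    for name, code in [("PAPI_L1_DCM", 0x80000000),
                       ("PAPI_L2_DCM", 0x80000002),
                       ("PAPI_L3_TCM", 0x80000008)]:
        print(f"{name}: code 0x{code:08x}, preset index {preset_index(code)}")
```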

Metrics Used to Evaluate Spatter

Cache and TLB Misses

  • PAPI_L1_DCM - L1 data cache misses
  • PAPI_L2_DCM - L2 data cache misses
  • PAPI_L3_DCM - L3 data cache misses
  • PAPI_TLB_DM - data TLB misses
  • PAPI_TLB_TL - total TLB misses

Cycles

  • PAPI_FUL_CCY - cycles with maximum instructions completed
  • PAPI_TOT_CYC - total cycles

Total instruction count and instructions retired

  • PAPI_TOT_IIS - total instructions issued
  • PAPI_TOT_INS - total instructions completed
  • PAPI_VEC_INS - vector instructions completed
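The cycle and instruction counters above are commonly combined into derived metrics. As a minimal sketch, the functions below compute instructions per cycle (PAPI_TOT_INS / PAPI_TOT_CYC) and the share of completed instructions that are vector instructions (PAPI_VEC_INS / PAPI_TOT_INS). The function names are hypothetical and the counts in the demo are made-up placeholders, not measurements from Spatter.

```python
def instructions_per_cycle(tot_ins: int, tot_cyc: int) -> float:
    """IPC: completed instructions (PAPI_TOT_INS) per cycle (PAPI_TOT_CYC)."""
    if tot_cyc <= 0:
        raise ValueError("cycle count must be positive")
    return tot_ins / tot_cyc


def vector_fraction(vec_ins: int, tot_ins: int) -> float:
    """Fraction of completed instructions that are vector instructions (PAPI_VEC_INS)."""
    if tot_ins <= 0:
        raise ValueError("instruction count must be positive")
    return vec_ins / tot_ins


if __name__ == "__main__":
    # Placeholder counts chosen only to illustrate the arithmetic.
    tot_ins, tot_cyc, vec_ins = 1_200_000, 800_000, 300_000
    print(f"IPC: {instructions_per_cycle(tot_ins, tot_cyc):.2f}")
    print(f"vector fraction: {vector_fraction(vec_ins, tot_ins):.2f}")
```

Note that all three counters would need to be collected in the same run (they fit within the four-event limit) for the ratios to be meaningful.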

Scalar load count

  • PAPI_LD_INS - total load instructions

Scatter/gather (S/G) instruction count

  • PAPI_VEC_SP - single-precision vector instructions
  • PAPI_VEC_DP - double-precision vector instructions

LIKWID architecture-specific counters

  • Broadwell
  • Skylake