Experiences with llvm-exegesis
On this page, I would like to record my brief experience with llvm-exegesis.
It is a small tool that is part of LLVM. According to its description page, it is a benchmarking tool for measuring host machine instruction characteristics such as latency, throughput, or port decomposition. This is the GitHub project page, and here is a set of slides that looks like the best explanation of the tool.
I am sharing my experience because I could not find any reports on using this tool.
I mainly use Ubuntu for code development, and my computers run Ubuntu 18.04. llvm-exegesis is installed automatically when LLVM is installed via apt-get. However, that binary refuses to run, reporting "LLVM ERROR: cannot initialize libpfm".
So I decided to build LLVM 8.0.0 myself. After I built it and installed it locally (under /opt in the sessions below), llvm-exegesis started working. I run it with root privileges.
# /opt/bin/llvm-exegesis -mode=latency -opcode-name=ADD64rr
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-25e0de.o
---
mode: latency
key:
  instructions:
    - 'ADD64rr RDX RDX R12'
  config: ''
  register_initial_values:
    - 'RDX=0x0'
    - 'R12=0x0'
cpu_name: skylake-avx512
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 1.0052, per_snippet_value: 1.0052 }
error: ''
info: Repeating a single implicitly serial instruction
assembled_snippet: 415448BA000000000000000049BC00000000000000004C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E2415CC3
...
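The assembled_snippet field is the hex encoding of the machine code that was assembled for the benchmark, so it can be taken apart by hand. The following Python sketch splits the latency snippet above into setup, repeated body, and teardown. The helper name `split_snippet` is my own, and the x86-64 decodings in the comments come from reading the encoding manually, not from llvm-exegesis itself.

```python
# Hex string copied from the assembled_snippet field of the ADD64rr run above.
SNIPPET = (
    "415448BA0000000000000000"   # push %r12 ; movabs $0x0,%rdx   (manual decoding)
    "49BC0000000000000000"       # movabs $0x0,%r12               (manual decoding)
    "4C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E2"
    "4C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E2"
    "415CC3"                     # pop %r12 ; ret                 (manual decoding)
)

# Encoding of the benchmarked instruction: add %r12,%rdx (ADD64rr RDX RDX R12).
BODY = "4C01E2"


def split_snippet(hex_code, hex_body):
    """Split machine code into (setup, repetition count, teardown).

    Hypothetical helper for illustration: it looks for the first occurrence
    of the body and counts how many times it repeats back to back.
    """
    code = bytes.fromhex(hex_code)
    body = bytes.fromhex(hex_body)
    start = code.find(body)
    if start < 0:
        raise ValueError("body not found in snippet")
    end, reps = start, 0
    while code.startswith(body, end):
        end += len(body)
        reps += 1
    return code[:start], reps, code[end:]


setup, reps, teardown = split_snippet(SNIPPET, BODY)
print("setup   :", setup.hex())
print("body    :", BODY.lower(), "x", reps)
print("teardown:", teardown.hex())
```

The body repeats 16 times here, far fewer than num_repetitions (10000), so the field appears to hold a shortened copy of the benchmarked code rather than the full loop. The setup part matches register_initial_values: both RDX and R12 are loaded with 0x0 before the ADD chain starts.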
# echo "vzeroupper" | /opt/bin/llvm-exegesis -mode=uops -snippets-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-433f2e.o
---
mode: uops
key:
  instructions:
    - 'VZEROUPPER'
  config: ''
  register_initial_values: []
cpu_name: skylake-avx512
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: SKXPort0, value: 0.0005, per_snippet_value: 0.0005 }
  - { key: SKXPort1, value: 0.0017, per_snippet_value: 0.0017 }
  - { key: SKXPort2, value: 0.001, per_snippet_value: 0.001 }
  - { key: SKXPort3, value: 0.0016, per_snippet_value: 0.0016 }
  - { key: SKXPort4, value: 0.0013, per_snippet_value: 0.0013 }
  - { key: SKXPort5, value: 0.001, per_snippet_value: 0.001 }
  - { key: SKXPort6, value: 0.0026, per_snippet_value: 0.0026 }
  - { key: SKXPort7, value: 0.0009, per_snippet_value: 0.0009 }
  - { key: NumMicroOps, value: 4.0084, per_snippet_value: 4.0084 }
error: ''
info: ''
assembled_snippet: C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C3
...
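In uops mode each measurement line carries one counter, so a quick way to read the result is to add up the per-port values and compare the total with NumMicroOps. The sketch below does this for the VZEROUPPER output above. `parse_measurements` is a hypothetical helper, not part of llvm-exegesis, and its regular expression only handles the flow-style lines shown on this page.

```python
import re

# Measurement lines copied from the VZEROUPPER run above.
OUTPUT = """\
- { key: SKXPort0, value: 0.0005, per_snippet_value: 0.0005 }
- { key: SKXPort1, value: 0.0017, per_snippet_value: 0.0017 }
- { key: SKXPort2, value: 0.001, per_snippet_value: 0.001 }
- { key: SKXPort3, value: 0.0016, per_snippet_value: 0.0016 }
- { key: SKXPort4, value: 0.0013, per_snippet_value: 0.0013 }
- { key: SKXPort5, value: 0.001, per_snippet_value: 0.001 }
- { key: SKXPort6, value: 0.0026, per_snippet_value: 0.0026 }
- { key: SKXPort7, value: 0.0009, per_snippet_value: 0.0009 }
- { key: NumMicroOps, value: 4.0084, per_snippet_value: 4.0084 }
"""

# Matches one "{ key: K, value: V, per_snippet_value: P }" entry.
LINE_RE = re.compile(
    r"\{\s*key:\s*([^,\s]+),\s*value:\s*([\d.]+),"
    r"\s*per_snippet_value:\s*([\d.]+)\s*\}"
)


def parse_measurements(text):
    """Return {key: (value, per_snippet_value)} for every measurement line."""
    return {
        m.group(1): (float(m.group(2)), float(m.group(3)))
        for m in LINE_RE.finditer(text)
    }


measurements = parse_measurements(OUTPUT)
# Sum the "value" column over all execution-port counters.
port_total = sum(v for key, (v, _) in measurements.items() if "Port" in key)
num_uops = measurements["NumMicroOps"][0]

print(f"sum over ports: {port_total:.4f}")
print(f"NumMicroOps:    {num_uops:.4f}")
```

For this run the port counters add up to roughly 0.01 while NumMicroOps is about 4, which suggests that the micro-ops of VZEROUPPER are not dispatched to any of the measured execution ports.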
# cat exegesistest1.txt
# LLVM-EXEGESIS-LIVEIN RDI
# LLVM-EXEGESIS-DEFREG XMM1 42
vmulps (%rdi), %xmm1, %xmm2
vhaddps %xmm2, %xmm2, %xmm3
addq $0x10, %rdi
# /opt/bin/llvm-exegesis -mode=uops -snippets-file=./exegesistest1.txt
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-6f88de.o
---
mode: uops
key:
  instructions:
    - 'VMULPSrm XMM2 XMM1 RDI i_0x1 i_0x0 '
    - 'VHADDPSrr XMM3 XMM2 XMM2'
    - 'ADD64ri8 RDI RDI i_0x10'
  config: ''
  register_initial_values:
    - 'XMM1=0x42'
cpu_name: skylake-avx512
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: SKXPort0, value: 0.4612, per_snippet_value: 1.3836 }
  - { key: SKXPort1, value: 0.4224, per_snippet_value: 1.2672 }
  - { key: SKXPort2, value: 0.169, per_snippet_value: 0.507 }
  - { key: SKXPort3, value: 0.1682, per_snippet_value: 0.5046 }
  - { key: SKXPort4, value: 0.0017, per_snippet_value: 0.0051 }
  - { key: SKXPort5, value: 0.6674, per_snippet_value: 2.0022 }
  - { key: SKXPort6, value: 0.3366, per_snippet_value: 1.0098 }
  - { key: SKXPort7, value: 0.001, per_snippet_value: 0.003 }
  - { key: NumMicroOps, value: 1.6761, per_snippet_value: 5.0283 }
error: ''
info: ''
assembled_snippet: 4883EC10C7042442000000C744240400000000C744240800000000C744240C0000000062F17E086F0C244883C410C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C3
...
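The two numeric columns in the last output are related: for this three-instruction snippet, every per_snippet_value equals value multiplied by 3. So value appears to be normalized per instruction, while per_snippet_value covers one pass through the whole snippet. The sketch below checks that reading against the numbers above. It is my interpretation of the output, not a statement taken from the llvm-exegesis documentation.

```python
import math

# Number of instructions in exegesistest1.txt (vmulps, vhaddps, addq).
NUM_INSTRUCTIONS = 3

# (key, value, per_snippet_value) copied from the uops run above.
MEASUREMENTS = [
    ("SKXPort0", 0.4612, 1.3836),
    ("SKXPort1", 0.4224, 1.2672),
    ("SKXPort2", 0.169, 0.507),
    ("SKXPort3", 0.1682, 0.5046),
    ("SKXPort4", 0.0017, 0.0051),
    ("SKXPort5", 0.6674, 2.0022),
    ("SKXPort6", 0.3366, 1.0098),
    ("SKXPort7", 0.001, 0.003),
    ("NumMicroOps", 1.6761, 5.0283),
]


def matches_per_snippet(value, per_snippet, n):
    """True if per_snippet equals value * n, up to floating-point rounding."""
    return math.isclose(value * n, per_snippet, abs_tol=1e-9)


results = {
    key: matches_per_snippet(value, per_snippet, NUM_INSTRUCTIONS)
    for key, value, per_snippet in MEASUREMENTS
}

for key, ok in results.items():
    print(f"{key:12s} value * {NUM_INSTRUCTIONS} == per_snippet_value: {ok}")
```

The same reading is consistent with the two earlier runs, where the snippet holds a single instruction and both columns are identical.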
So far, so good. Next, I want to try analyzing more practical code.