-
Notifications
You must be signed in to change notification settings - Fork 566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i3544 RV64: Enable memtrace-related tests #6506
Conversation
@abhinav92003 @derekbruening I'm trying to port memtrace-related tests, and with this PR, a simple bin64/drrun -t drcachesim -simulator_type reuse_distance -- echo hello But I have a few questions to ask:
|
Yes, all memory accesses have to be memory operands to comply with the IR conventions which all the tools rely upon. Yes, a read-modify-write instruction has both a source and destination, even if identical.
Does RISC-V typically have longer basic blocks than other ISA's (conditional moves instead of branches maybe)? All our internal uses set -max_bb_instrs to something like 256. Maybe that should be done on Github in the drcachesim launcher. |
No, there are no conditional moves at least in standard RISC-V extensions. The size of the basic blocks is not much different from other arches:
I did some investigation, and the issue seems to come from the context switch of clean calls. For example, if I duplicate the |
The max is hit on other arches with many clean calls in one large block. But I thought that was hundreds of clean calls, to reach the 64K max. Oh I see, it's just 4K on RISCV: #ifdef AARCH64
/* On AArch64, TBNZ/TBZ has a range of +/- 32 KiB. */
enum { MAX_FRAGMENT_SIZE = 0x8000 };
#elif defined(RISCV64)
/* On RISCV64, direct branch has a range of +/- 4 KiB. */
enum { MAX_FRAGMENT_SIZE = 0x1000 };
#else
enum { MAX_FRAGMENT_SIZE = USHRT_MAX };
#endif Hmm. What is the size of a (unoptimized) clean call on RISCV? How many fit in 4K? |
Roughly speaking, there are 32 gprs and 32 fprs, so save + restore is 32 * 2 * 2 * 4 = 512, so 8 clean calls will surely exceed the limit. Ah, we could use |
Some users try to put in a clean call before every app instr, though we discourage that. So that use case would require -max_bb_instrs 15 or sthg. That's not that bad: there's no big downside to a limit like that for single blocks. For traces there is more of a cost; someone putting in that many clean calls should probably just turn off traces anyway as the clean call overhead means any trace speedup is overshadowed -- but a trace limit is harder to set by default. Proposal:
|
Thank you! The proposal look good to me. I’ll close this PR since it seems more suitable to separate the codec and clean call changes into two PRs. |
) Atomic instructions from the A extension like LR/SC and AMO* access memory so they should be treated as load/store. In addition, AMO* instructions perform read-modify-write operations, they should have both a source and a destination operands to comply with DR's IR convention. We make the destination an implicit operand as they are identical to the source. Related PR: #6506 Issue: #3544
tool.reuse_distance test is enabled by performing the following changes: 1. Set -max_bb_instrs to 15 by default on RISC-V 2. Use `c.ldsp/c.sdsp/c.fldsp/c.fsdsp` to reduce the size of context switches for clean calls 3. Swap sc.w/d rd and mem operand positions so that mem becomes the first operand to be consistent with AArch64 4. Implement `drutil_insert_get_mem_addr()` Related PR: #6506 Issue: #3544
No description provided.