Optimize function layout in hps_accel project #323
To avoid the possibility of the LoadInput function conflicting with ConvPerChannel4x4 (its caller) in the instruction cache, lay them out sequentially in the final binary. Signed-off-by: Jakub Piecuch <[email protected]>
If we don't supply an explicit TARGET to make, it should be set to the default value in proj.mk, which is digilent_arty. However, this default is currently assigned _after_ the ifeq in hps_accel/Makefile, so the clock frequency isn't reduced to 75 MHz even though it should be. Signed-off-by: Jakub Piecuch <[email protected]>
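The ordering problem described in this commit message comes from GNU make evaluating `ifeq` while it parses the Makefile, so a default assigned later is not visible to the test. A minimal sketch of one possible ordering that avoids it follows; `SYS_CLK_FREQ` and the `../proj.mk` path are hypothetical names used for illustration only, and this is not necessarily how the commit itself fixes the bug.

```make
# Sketch only: SYS_CLK_FREQ and the include path are assumptions,
# not names taken from the hps_accel Makefile.

# Give TARGET its default before any conditional reads it.
# "?=" leaves a value passed on the command line untouched.
TARGET ?= digilent_arty

# ifeq is evaluated at parse time, so TARGET must already be set here.
ifeq '$(TARGET)' 'digilent_arty'
SYS_CLK_FREQ := 75000000
endif

# The shared rules are included last; a "TARGET ?=" default inside
# this file would come too late for the conditional above.
include ../proj.mk
```

With this order, a plain `make` and `make TARGET=digilent_arty` both take the 75 MHz branch, while any other explicit TARGET skips it.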
Signed-off-by: Jakub Piecuch <[email protected]>
cf8d6a4 to 45ec30f
{
_ftext = .;
*(.text.start)
*/conv_accel.o(.text.*ConvPerChannel4x4*)
It looks like this file may have mixed spaces and tabs for indents; it would be good to fix that up for clarity.
True, but first let's decide whether this linker script should be here at all.
@@ -0,0 +1,65 @@
INCLUDE output_format.ld
Can you please add the copyright header to this file? Just grab it from one of the others. "Copyright the CFU-Playground authors" etc.
I no longer think that providing custom linker scripts for each project is a viable solution, so this file will probably be removed. Please see my latest comments for an alternative approach.
*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > main_ram
/*} > rom */
Hmm why are these commented out?
Actually now that I think about it, it looks like this linker script is correct for Arty (where we run the whole CFU-Playground application from RAM) but won't work on HPS boards (where we run the application from flash)? I guess for HPS boards you have to uncomment these lines instead, right?
Is there a way we can have a single linker script that will place things in the right regions in all cases?
This linker script is essentially copy-pasted from common/ld/linker.ld, including the fragments which are commented out.
You're right, this script isn't suitable for every board. I was not aware of the platform-specific scripts in common/_hps etc.
I will write down some ideas on how to solve this in a comment outside of this conversation.
@j-piecuch @danc86 I should point out that we use an overlay system for the default linker script for HPS: https://github.com/google/CFU-Playground/tree/main/common/_hps/hps/ld . I guess here, the override LDSCRIPT might need to choose between two or more different scripts depending on the PLATFORM:TARGET.
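As an illustration of the suggestion above, a project Makefile could pick its LDSCRIPT override based on PLATFORM. This is only a sketch under stated assumptions: the `PROJ_DIR` variable, the `hps` platform value, and both script file names are hypothetical, and it presumes the shared build rules assign LDSCRIPT only when the project has not already done so.

```make
# Sketch only: PROJ_DIR, the 'hps' value and both file names are
# hypothetical; they are not taken from the CFU-Playground tree.

# PLATFORM must already have its value at this point, because ifeq
# is evaluated while the Makefile is being parsed.
ifeq '$(PLATFORM)' 'hps'
# Boards that run the application from flash.
LDSCRIPT := $(PROJ_DIR)/ld/linker_flash.ld
else
# Boards that run the whole application from RAM.
LDSCRIPT := $(PROJ_DIR)/ld/linker_ram.ld
endif

# Assumed counterpart in the shared build rules, where "?=" keeps
# the project's choice and otherwise falls back to the default:
#   LDSCRIPT ?= <path to the default linker script>
```

The drawback discussed in this thread still applies: each project would carry its own copies of the platform scripts, which then have to be kept in sync with the common ones.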
@danc86 @tcal-x apologies, I was not aware of the existing overlay system. One simple, but somewhat hacky solution is to use
In the project's Makefile, we could define a make variable that contains the overrides:
We can perform the substitution in the recipe for
A somewhat less hacky way to accomplish the same goal would be to use some macro/templating engine like jinja or the C preprocessor.
It's good to know that this approach works, but let's hold off on introducing specific hacks until the code is more settled. I think LoadInput() may be deleted soon.
This PR makes the LDSCRIPT make variable overridable from a project, and uses a project-specific linker script in hps_accel that prevents instruction cache conflicts between ConvPerChannel4x4() and LoadInput(). It also fixes a small bug in hps_accel/Makefile.