Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

20-25% performance improvement in MPPI controller using Eigen library for computation. #4621

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

Ayush1285
Copy link
Contributor

@Ayush1285 Ayush1285 commented Aug 14, 2024

Basic Info

Info Please fill out this column
Ticket(s) this addresses #4237 #3351
Primary OS tested on Ubuntu
Robotic platform tested on NA, Optimizer Benchmark
Does this PR contain AI generated software? No

Description of contribution in a few bullet points

  • Replaced xtensor with Eigen library for computation.
  • Performance increased by 20-25% with footprint and critics enabled.
  • Currently WIP

Description of documentation updates required from your changes


Future work that may be required in bullet points

For Maintainers:

  • Check that any new parameters added are updated in docs.nav2.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

Copy link
Contributor

mergify bot commented Aug 14, 2024

This pull request is in conflict. Could you fix it @Ayush1285?

Copy link
Contributor

mergify bot commented Aug 16, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Aug 19, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Aug 19, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a pretty fast review, so far from complete on each detail, but a good starting point!

nav2_mppi_controller/CMakeLists.txt Outdated Show resolved Hide resolved
nav2_mppi_controller/CMakeLists.txt Outdated Show resolved Hide resolved
nav2_mppi_controller/CMakeLists.txt Outdated Show resolved Hide resolved
nav2_mppi_controller/CMakeLists.txt Show resolved Hide resolved
nav2_mppi_controller/CMakeLists.txt Outdated Show resolved Hide resolved
nav2_mppi_controller/src/critics/constraint_critic.cpp Outdated Show resolved Hide resolved
nav2_mppi_controller/src/optimizer.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

mergify bot commented Aug 20, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Aug 23, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

@SteveMacenski
Copy link
Member

Let me know here when you want me to take a look again! I'm quite excited for this work - even if for no reason than to move to Eigen + 10% performance boost, since Eigen's release and support is much more known than xtensor's

@Ayush1285
Copy link
Contributor Author

Let me know here when you want me to take a look again! I'm quite excited for this work - even if for no reason than to move to Eigen + 10% performance boost, since Eigen's release and support is much more known than xtensor's

Sure, I'm trying a few optimizations and will push changes once I'm done.

Signed-off-by: Ayush1285 <[email protected]>
Copy link
Contributor

mergify bot commented Sep 2, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

@Ayush1285
Copy link
Contributor Author

Ayush1285 commented Sep 3, 2024

Let me know here when you want me to take a look again! I'm quite excited for this work - even if for no reason than to move to Eigen + 10% performance boost, since Eigen's release and support is much more known than xtensor's

I've completed the migration to Eigen. But we need to make sure that functionality-wise everything is correct or not. I'll run tests and ensure that all of them are passing. Meanwhile, you can take a look at the latest changes.

Copy link
Contributor

mergify bot commented Sep 3, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a first look through (didn't analyze in detail the math in the critics, but higher level programming items first), but generally looks good to me with some details to answer!

find_package(ament_cmake REQUIRED)
find_package(xsimd REQUIRED)
find_package(xtensor REQUIRED)
find_package(Eigen3 REQUIRED)
Copy link
Member

@SteveMacenski SteveMacenski Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the various check_cxx_compiler_flag useful for Eigen? Any other ones that would be good for Eigen specifically (those are ones pointed out by xtensor)?

Copy link
Contributor Author

@Ayush1285 Ayush1285 Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-mfma is necessary for fast floating point fused multiply-add operations. Answer by one of the Eigen Core maintainer, He has also mentioned enabling OpenMP for multi-threading. And it seems there is no harm in keeping ISA flags(SSE/AVX).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One interesting find, I tried removing the -fast-math flag and it speeds up the performance by a big margin.
For 2000 * 50 size:
xtensor: 11.6 ms avg.
Eigen with fast-math flag: 8.9 ms
Eigen without fast-math flag: 7.5 ms

Copy link
Member

@SteveMacenski SteveMacenski Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it seems there is no harm in keeping ISA flags(SSE/AVX).

There actually is, ARM based processors have inconsistent support for them or versions, so it makes release difficult. If they don't impact performance positively, we should remove them and removes an entire vector of potential problems (which we're currently running into in #4380 and a few other places in the past). If they're important, lets keep them but I had hoped this would be a good source of removal of problems with Eigen, but alas such is life 😆

I'd be curious about OpenMP's changes in performance! We could make that a build option if it helps and those that want to use it can!

https://stackoverflow.com/questions/56547557/basic-ways-to-speed-up-a-simple-eigen-program https://github.com/owlbarn/eigen/blob/master/README.md there are some remarks here on AVX/fast-math, mfma. Might be worth doing a bit of research on Eigen-specific compiler optimizations, it looks like a rich vein and what was good for xtensor might not be right for Eigen.

Perhaps -O3? We do that with Smac Planner since it helps so much, I hardly want people using it without a high level of optimization

For fast-math, it might be worth testing on a couple of CPUs if you have them to make sure (if not, I can also test on my side, I have a few on my benchtop). What experiment are you running to get that performance change and/or does it include all the critics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they're important, lets keep them but I had hoped this would be a good source of removal of problems with Eigen

On my CPU only -mfma and -O3 impacted performance positively(and removal of fast-math also). There was no impact of AVX and SSE flags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fast-math, it might be worth testing on a couple of CPUs if you have them to make sure (if not, I can also test on my side, I have a few on my benchtop). What experiment are you running to get that performance change and/or does it include all the critics?

Currently, I have only one CPU :(. You can try it on your CPUs if possible. I'm running optimizer_benchmark/BM_diffDrive with all these critics loaded: {{"ConstraintCritic"}, {"CostCritic"}, {"PathAlignCritic"}, {"GoalCritic"}, {"GoalAngleCritic"}, {"ObstaclesCritic"},{"PathAngleCritic"}, {"PathFollowCritic"}, {"PreferForwardCritic"}};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There actually is, ARM based processors have inconsistent support for them or versions, so it makes release difficult.

Oh, I wasn't aware of this. Once merge conflicts and build errors are resolved, then we can test on multiple CPUs with different flags.

nav2_mppi_controller/benchmark/optimizer_benchmark.cpp Outdated Show resolved Hide resolved
xt::view(control_sequence_.wz, -1) =
xt::view(control_sequence_.wz, -2);

control_sequence_.vx = utils::rollColumns(control_sequence_.vx, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the roll also handle setting the last element to the second to last element?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'll add that part. I'll have to re-implement the rollColumns method because it is not shifting the values in place. Compiler is moving the values of the temporary matrix created inside this function, and this is causing bad memory handling, I think.

nav2_mppi_controller/src/optimizer.cpp Outdated Show resolved Hide resolved
nav2_mppi_controller/src/optimizer.cpp Outdated Show resolved Hide resolved
vx_last = vx_curr;

float & wz_curr = control_sequence_.wz(i);
wz_curr = std::clamp(wz_curr, wz_last - max_delta_wz, wz_last + max_delta_wz);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth having a util for clamp that implements this instead of putting inline for each location? It is prone to copy+paste errors

nav2_mppi_controller/src/optimizer.cpp Show resolved Hide resolved
Signed-off-by: Ayush1285 <[email protected]>
Copy link
Contributor

mergify bot commented Sep 9, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Sep 10, 2024

This pull request is in conflict. Could you fix it @Ayush1285?

@SteveMacenski
Copy link
Member

SteveMacenski commented Sep 10, 2024

I think you may not have done the last rebase / pull correctly. I'm seeing changes in files that don't relate to this PR. It could be due to the conflict and Git's attempt to show me the diffs it can with that in place, but that would need to be resolved before I can review again unfortunately :(

Its a 3k line PR with 169 files, vast majority are unrelated

Copy link
Contributor

mergify bot commented Sep 10, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Contributor

mergify bot commented Sep 11, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

@Ayush1285
Copy link
Contributor Author

I think you may not have done the last rebase / pull correctly. I'm seeing changes in files that don't relate to this PR. It could be due to the conflict and Git's attempt to show me the diffs it can with that in place, but that would need to be resolved before I can review again unfortunately :(

Its a 3k line PR with 169 files, vast majority are unrelated

I've corrected the bad rebase/merge. Lint errors are also fixed.

Copy link
Contributor

mergify bot commented Sep 11, 2024

@Ayush1285, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Signed-off-by: Ayush1285 <[email protected]>
@SteveMacenski
Copy link
Member

This still appears to have a few test failures - though you're right that linting looks good now 👍

@@ -202,8 +196,8 @@ class Optimizer
* @param state fill state
*/
void integrateStateVelocities(
xt::xtensor<float, 2> & trajectories,
const xt::xtensor<float, 2> & state) const;
Eigen::Array<float, -1, 3> & trajectories,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a typedef for -1 and dynamic so that an onlooker knows this means dynamic size? I.e. Eigen::Dynamic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use Eigen::Dynamic instead of -1, that would be more self-explanatory.

nav2_mppi_controller/test/CMakeLists.txt Outdated Show resolved Hide resolved
nav2_mppi_controller/test/CMakeLists.txt Outdated Show resolved Hide resolved
@SteveMacenski
Copy link
Member

Should I have this another review? I know you're actively working and don't want to pester you mid-dev cycle :-)

@Ayush1285
Copy link
Contributor Author

Should I have this another review? I know you're actively working and don't want to pester you mid-dev cycle :-)

Actually, I am also busy in other deliverables. So, I'm not able to allocate much time. But I'll let you know in 3-4 days maybe, once I've finished fixing the test failures and other review comments.

@SteveMacenski
Copy link
Member

Please request a review or otherwise comment when you want me to take a look again :-)

Signed-off-by: Ayush1285 <[email protected]>
@Ayush1285 Ayush1285 marked this pull request as ready for review October 2, 2024 16:26
@Ayush1285
Copy link
Contributor Author

Please request a review or otherwise comment when you want me to take a look again :-)

Yes, I've requested for the review. All unit tests are passing now, but system tests are still failing.

@SteveMacenski
Copy link
Member

Retriggering the job - that could just be flaky. I do want to see coverage results before I review so I can use that in my review process. So I'll review once that is uploaded on the job success. I don't think there's an action item for you (just waiting on the build).

Copy link
Contributor

mergify bot commented Oct 3, 2024

This pull request is in conflict. Could you fix it @Ayush1285?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants