Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy load click subcommands #3883

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open

Lazy load click subcommands #3883

wants to merge 20 commits into from

Conversation

ankatiyar
Copy link
Contributor

@ankatiyar ankatiyar commented May 21, 2024

Description

Fix #1476

Development notes

Lazy Loading of subcommands docs on click: https://click.palletsprojects.com/en/8.1.x/complex/#lazily-loading-subcommands

  • Added two lazy groups: kedro/framework/cli/cli.py
    • project_commands: Point to project sub commands such as registry, catalog, package etc
    • global_commands: Point to the implementation of new and starter commands
  • Changed the order of command groups for project_groups and global_groups properties of KedroCLI -> This allows the commands from the plugins to be found first (incase a plugin has overwritten one of the Kedro core commands)
  • Implementation of LazyGroup in kedro/framework/cli/utils.py
  • For our custom CommandCollection class in kedro/framework/cli/utils.py, removed _dedupe_commands() method since the ordering of the command groups means that the commands from plugins are found first. **

** This means that the help text looks a bit different.

Before:
image
After:
Screenshot 2024-07-08 at 16 47 36

KEY DIFFERENCES:

  • Kedro project commands go to the bottom, this is because of the re-ordering of the command groups
  • Command from kedro-override which is a dummy plugin I have installed which overwrites the kedro catalog command groups shows separately in both cases. But in the first one, since de-duplication takes place, catalog help text is removed from under the "Project specific commands from Kedro" section but after this change it remains.

❓ For reviewers: ^ Is the change in the help text acceptable?

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

ankatiyar and others added 14 commits June 24, 2024 11:23
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
@ankatiyar ankatiyar marked this pull request as ready for review July 8, 2024 15:52
@ankatiyar ankatiyar requested a review from merelcht as a code owner July 8, 2024 15:52
@merelcht
Copy link
Member

merelcht commented Jul 9, 2024

Command from kedro-override which is a dummy plugin I have installed which overwrites the kedro catalog command groups shows separately in both cases. But in the first one, since de-duplication takes place, catalog help text is removed from under the "Project specific commands from Kedro" section but after this change it remains.

Is this happening because the de-duplication process would mean you can't do lazy loading? IMO it's not ideal, but it's also not the worst. I think it's becoming clear that performance gains on the CLI loading will always have some downsides. We can always add some clarification in the docs on overwriting of commands and what a user should expect to see on the CLI side.

project_group,
registry_cli,
]
assert len(kedro_cli.project_groups) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be helpful to update the test name or leave a comment explaining that the 1 command group loaded is everything that's not doing lazy loading.

@ankatiyar
Copy link
Contributor Author

Is this happening because the de-duplication process would mean you can't do lazy loading? IMO it's not ideal, but it's also not the worst. I think it's becoming clear that performance gains on the CLI loading will always have some downsides. We can always add some clarification in the docs on overwriting of commands and what a user should expect to see on the CLI side.
@merelcht

Yeah, for a LazyGroup, the group.commands is an empty {} so it'll have to be loaded for the de-duplication. It's also going to be tricky if we advocate the plugins to define their own commands as a LazyGroup. The order of the sources of commands will already determine which commands get discovered first, and if commands from plugins are prioritised, this should happen on its own.

We could add information to this part of the help text (up top) saying if there's duplicate commands, the commands from the plugins will be used :
Screenshot 2024-07-09 at 12 40 58

  • To be fair, making our own commands lazy doesn't have too much of a performance gain (atleast on my system). It would be more beneficial if the plugins implemented it, starting with kedro-viz.
  • However, I think if the plugins implemented their own LazyGroup commands and had overwriting Kedro commands in there, probably the de-duplication code we have now won't catch it anyway.

@merelcht
Copy link
Member

merelcht commented Jul 9, 2024

We could add information to this part of the help text (up top) saying if there's duplicate commands, the commands from the plugins will be used : Screenshot 2024-07-09 at 12 40 58

That's a great idea!

  • To be fair, making our own commands lazy doesn't have too much of a performance gain (atleast on my system). It would be more beneficial if the plugins implemented it, starting with kedro-viz.
  • However, I think if the plugins implemented their own LazyGroup commands and had overwriting Kedro commands in there, probably the de-duplication code we have now won't catch it anyway.

Got it! In that case, this solution is all fine with me 👍

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question regarding testing: is it possible to trigger the lazy loading in the tests to make sure the lazy loaded commands to get loaded eventually?

tests/framework/cli/test_cli.py Outdated Show resolved Hide resolved
tests/framework/cli/test_cli_hooks.py Outdated Show resolved Hide resolved
@ankatiyar ankatiyar requested a review from yetudada as a code owner July 9, 2024 14:20
Signed-off-by: Ankita Katiyar <[email protected]>
@ankatiyar
Copy link
Contributor Author

One question regarding testing: is it possible to trigger the lazy loading in the tests to make sure the lazy loaded commands to get loaded eventually?

I added a check if all the commands are being listed properly for the lazy group in 37764a7

```python
if multiprocessing.get_start_method() == "spawn" and package_name:
_bootstrap_subprocess(package_name, logging_config)
```
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The secret scan test was complaining about the link ^ I added the code snippet instead. I hope that's okay? @noklam

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this problem locally as well but it's all good now and the link seems correct and working.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add this to the release notes. Otherwise looks all good to me 👍 🎉

Signed-off-by: Ankita Katiyar <[email protected]>
Copy link
Contributor

@ElenaKhaustova ElenaKhaustova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[spike] Improve Kedro CLI startup time
3 participants