[Bug] DataGrip executes refresh command slowly #6797

Open
2 of 4 tasks
SGITLOGIN opened this issue Nov 7, 2024 · 1 comment
Labels
kind:bug This is a clearly a bug priority:major

Comments

@SGITLOGIN

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

When DataGrip runs a refresh, I can see that it performs the operations listed below. The table list and table columns operations account for most of the time. In my tests, a full refresh takes four minutes to complete.

Operations:
Routine list in database
Table list in database
Table columns in database
Keys in database.table


Affects Version(s)

1.10.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

spark.master yarn
spark.yarn.queue default
spark.executor.cores 1
spark.driver.memory 3g
spark.executor.memory 3g
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.shuffleTracking.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 10
spark.dynamicAllocation.initialExecutors 1
spark.cleaner.periodicGC.interval 5min

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
@SGITLOGIN SGITLOGIN added kind:bug This is a clearly a bug priority:major labels Nov 7, 2024
@pan3793
Member

pan3793 commented Nov 25, 2024

Looks like DBeaver is smarter in this case: it lazily loads the table list when the user expands a database, whereas DataGrip eagerly fetches all databases and tables. That produces tons of HMS (Hive Metastore) calls when the number of databases and tables is large, so it would be slow even with the optimization introduced in #6018 enabled. In that case, we can optimize further by parallelizing the table listing under each database.
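
To illustrate the parallelization idea, here is a minimal Python sketch, not Kyuubi code. It assumes a hypothetical `list_tables(db)` callable that stands in for the per-database metastore call; the function name `list_tables_parallel` and the `max_workers` default are likewise made up for illustration. The point is only that independent per-database listings can overlap instead of running one after another.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def list_tables_parallel(
    databases: Iterable[str],
    list_tables: Callable[[str], List[str]],  # hypothetical stand-in for the HMS call
    max_workers: int = 8,                     # assumed value, tune for the metastore
) -> Dict[str, List[str]]:
    """Call list_tables once per database, running the calls concurrently."""
    dbs = list(databases)
    if not dbs:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each database
        # with the table list returned for it.
        return dict(zip(dbs, pool.map(list_tables, dbs)))
```

Because each call mostly waits on a remote service, a thread pool is enough here; the pool size bounds how many concurrent requests reach the metastore. If any single listing raises, the exception propagates to the caller when the results are collected.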
