-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: update library index synchronously [FC-0062] #35367
fix: update library index synchronously [FC-0062] #35367
Conversation
Discarded components need to be removed from index in sync to make sure that users see the latest updated data.
Thanks for the pull request, @navinkarkera! What's next?Please work through the following steps to get your changes ready for engineering review: 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. 🔘 Let us know that your PR is ready for review:Who will review my changes?This repository is currently maintained by Where can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:
When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
# Currently, this is only required to make sure that removed/discarded components are removed | ||
# from the search index and displayed to user properly. If it becomes a performance bottleneck | ||
# for other update operations other than discard, we can update CONTENT_LIBRARY_UPDATED event | ||
# to include a parameter which can help us decide if the task needs to run sync or async. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, we can keep it async and create some signal to send to the frontend once the index has been fully updated. (or allow the frontend to poll the task status)
I am a little concerned about the performance of this, if it's happening when we publish changes. Discarding changes or renaming the library are rare operations, but publishing will happen frequently and the reindex could make that slow if there are a lot of blocks in the library. Maybe we just need to test it and see though.
I would prefer that we just update the actual blocks in the index that were discarded/published - I don't suppose we have that data anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we prevent the user from seeing/handling the outdated components? If this is the case, we need to block the interface, and it doesn't make much difference if the process is sync or async + signal frontend-wise, right?
This pattern will happen across the library authoring (from adding/removing content to tagging) as we use meilisearch as the source.
I don't see a better way than running these operations sync and working on optimizing the bottlenecks.
Also, this PR is only for one event, and we need to tackle it in some other places. Should we create a task to discuss and implement the solution for this?
We could also add more granularity to the events (as discussed in the other PR) to handle better what should be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this for now.
Should we prevent the user from seeing/handling the outdated components? If this is the case, we need to block the interface, and it doesn't make much difference if the process is sync or async + signal frontend-wise, right?
Yes, you're right.
This pattern will happen across the library authoring (from adding/removing content to tagging) as we use meilisearch as the source.
No; in almost all other cases, we only synchronously update one document when there is a change. We're not re-indexing the whole course or whole library. So it's synchronous but it's extremely fast.
We could also add more granularity to the events (as discussed in the other PR) to handle better what should be done.
I think that's the best long-term solution.
Should we create a task to discuss and implement the solution for this?
Yes please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't make much difference if the process is sync or async + signal frontend-wise, right?
It does matter if we're tying up a django web worker thread for a long period of time - it could slow down the site for other people. Putting it into an async task + signal solves that. But for now I believe this is fast enough that it won't be a problem.
@navinkarkera 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future. |
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
2U Release Notice: This PR has been deployed to the edX production environment. |
1 similar comment
2U Release Notice: This PR has been deployed to the edX production environment. |
Description
Discarded components need to be removed from index in sync to make sure that users see the latest updated data.
Supporting information
Testing instructions
CELERY_ALWAYS_EAGER
toFalse
in cms/envs/devstack.py and try discard option in library without this PR. The search index query invalidation takes place before the index is updated in backend. Now checkout this PR and try again.Deadline
"None" if there's no rush, or provide a specific date or event (and reason) if there is one.
Other information
Private-ref
: https://tasks.opencraft.com/browse/FAL-3797