Implement custom storage for orgs #2093

Open

tw4l wants to merge 64 commits into main from issue-578-custom-storage

Conversation
Conversation

tw4l (Member) commented Sep 30, 2024

Fixes #578

Adds

  • API endpoints for adding and deleting custom storages on organizations (see the sketch after this list)
  • API endpoints for updating primary and/or replica storage for an org
  • API endpoint to check the progress of a background job (currently, only bucket copy jobs are supported)
  • Automated hooks to copy an organization's files from the previous S3 bucket to the new one and to update the files in the database when primary storage is changed
  • Automated hooks to replicate content from primary storage to a new replica location and to update the files in the database when a replica location is set on an org
  • New pylint disable comments on many of the backend modules so that linting passes
  • Admin documentation for adding, removing, and configuring custom storage locations on an organization
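
For a concrete sense of the workflow, here is a rough sketch of how these endpoints might be called. The paths, payload fields, and hostname are assumptions for illustration, not necessarily the PR's exact API:

```bash
# Hypothetical endpoint paths and request fields -- illustration only.

# Add a custom storage to an org
curl -X POST "https://btrix.example.com/api/orgs/$ORG_ID/custom-storage" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-storage", "bucket": "my-bucket",
       "access_key": "...", "secret_key": "...",
       "endpoint_url": "https://s3.example.com/"}'

# Point the org's primary storage at the new location
# (this is what kicks off the background bucket copy job)
curl -X POST "https://btrix.example.com/api/orgs/$ORG_ID/storage" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"storage": {"name": "my-storage", "custom": true}}'

# Check on the progress of the background copy job
curl "https://btrix.example.com/api/orgs/$ORG_ID/background-jobs/$JOB_ID" \
  -H "Authorization: Bearer $TOKEN"
```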

Notes

Currently, no delete operations happen for a bucket previously used as a primary or replica location once it is unset. Files are copied to the new bucket to ensure there are no usability issues moving forward in the app, but they are not automatically deleted from the source after the copy job. We could add that, but I wonder if it's safer, especially in the early days of testing, to perform that cleanup manually as desired.

Once we're comfortable, we can change the rclone command in the copy_job.yaml background job template from copy to move if we want it to automatically clean up files from the source location on completion. Since the same template is used both for copying files from an old primary storage to a new one and for replicating from primary storage to a new replica location, we'd want to make sure the latter still uses copy so as not to delete files from the primary storage location.
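
For reference, the difference at the rclone level, as a minimal sketch (remote and bucket names here are illustrative, not the template's actual arguments):

```bash
# Illustrative remotes and paths only -- not the exact copy_job.yaml arguments.
rclone copy oldstorage:source-bucket newstorage:dest-bucket   # source files are kept
rclone move oldstorage:source-bucket newstorage:dest-bucket   # source files deleted after transfer
```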

TODO

  • Documentation
  • Look into a progress indicator for copy jobs

tw4l force-pushed the issue-578-custom-storage branch from df5b6e9 to 10ab1b6 on October 1, 2024 15:24
tw4l force-pushed the issue-578-custom-storage branch 9 times, most recently from f226271 to eb065b6 on October 17, 2024 15:40
tw4l marked this pull request as ready for review on October 17, 2024 16:34
tw4l requested a review from ikreymer on October 17, 2024 16:34
tw4l added 28 commits on December 3, 2024 16:48

Previously, files in a default bucket were prefixed with the oid, but files in custom storages were not. This commit removes that distinction to aid in copying files between buckets, removing the need for unnecessary filepath manipulation.

The CopyBucketJob now only copies an organization's directory rather than the entire bucket, to prevent accidentally copying another organization's data.
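
To illustrate what that scoping buys at the rclone level (the bucket names and oid prefix variable are assumptions, not the job's exact arguments):

```bash
# Copying the whole bucket would also pull in other orgs' data:
rclone copy old:shared-bucket new:dest-bucket

# Scoping the copy to the org's oid-prefixed directory copies only that org:
rclone copy old:shared-bucket/$ORG_ID new:dest-bucket/$ORG_ID
```
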
Creating a bucket during the verification stage for adding custom storages, if it didn't already exist, was useful for testing but is an anti-pattern for production, so we remove it here.
tw4l force-pushed the issue-578-custom-storage branch from c5e88e3 to a867411 on December 3, 2024 21:54
Successfully merging this pull request may close these issues: Custom S3 Buckets for Orgs (#578)