Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cred): Use in built library instead of forced cred #1340

Closed
wants to merge 2 commits into from

Conversation

dark0dave
Copy link

@dark0dave dark0dave commented Nov 27, 2023

Closes #941

@dark0dave dark0dave changed the title fix(cred): Use inbuilt library instead of forced cred fix(cred): Use in built library instead of forced cred Jan 19, 2024
@dark0dave dark0dave force-pushed the master branch 3 times, most recently from addab9f to 34f4c52 Compare January 19, 2024 21:07
@dark0dave dark0dave force-pushed the master branch 2 times, most recently from 7bd0625 to c0ff345 Compare January 22, 2024 16:44
Copy link
Owner

@jschneier jschneier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, have a couple questions to start; am interested in getting this in for sure.

storages/backends/gcloud.py Outdated Show resolved Hide resolved
storages/backends/gcloud.py Outdated Show resolved Hide resolved
@jschneier
Copy link
Owner

Please also update the documentation once the design is finalized

@dark0dave dark0dave force-pushed the master branch 2 times, most recently from 6a86f86 to d5e8ee1 Compare February 19, 2024 19:22
@dark0dave
Copy link
Author

Please also update the documentation once the design is finalized

Can do.

@dark0dave dark0dave force-pushed the master branch 2 times, most recently from 13655bb to fd6702d Compare February 19, 2024 19:35
@dark0dave
Copy link
Author

@jschneier thank you for taking the time to review this it is much appreciated. We are using my fork in prod and I would prefer to use the mainline package. Is there anything further needed.

Thanks again.

@dark0dave dark0dave requested a review from jschneier February 19, 2024 19:41
@dark0dave dark0dave force-pushed the master branch 3 times, most recently from 4d9daf3 to e8be761 Compare February 19, 2024 20:00
if self.project_id is None:
self.project_id = project_id
credentials.refresh(requests.Request())
if not hasattr(credentials, "service_account_email"):
Copy link
Owner

@jschneier jschneier Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What determines if the hasattr check fails? Also, if it fails is it an error not to set sa_email?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some situations (workload identity) if you try to use the default service account the service account email can be missing. In this situation its best to provide the sa_email so you can sign urls.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, you are saying that this must be set in some situation? Can you add that to the documentation.

@dark0dave dark0dave force-pushed the master branch 3 times, most recently from 793debd to 3d2a913 Compare February 20, 2024 09:45
@dark0dave dark0dave requested a review from jschneier February 20, 2024 09:47
@dark0dave
Copy link
Author

@jschneier I have updated

@@ -54,7 +54,7 @@ dropbox = [
"dropbox>=7.2.1; python_version<'3.12'",
]
google = [
"google-cloud-storage>=1.27",
"google-cloud-storage>=2.14",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link to why we need to bump this requirement?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Buts generally good to bump the library.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s good to bump your application requirements but as a library I’m trying to maintain as much flexibility for users as possible. If we need a new feature then bumping is worthwhile.

A major version bump in particular could induce work and compat issues to users.

storages/backends/gcloud.py Outdated Show resolved Hide resolved
Comment on lines -322 to +337
default_params = {
"bucket_bound_hostname": self.custom_endpoint,
params = {
"service_account_email": self.credentials.service_account_email,
"access_token": self.credentials.token,
"credentials": self.credentials,
"expiration": self.expiration,
"version": "v4",
}
params = parameters or {}

for key, value in default_params.items():
if value and key not in params:
params[key] = value

if self.custom_endpoint:
params["api_access_endpoint"] = self.custom_endpoint
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This set of changes reverts using the parameters bit of url(). Sorry if there is no test currently covering this.

if self.project_id is None:
self.project_id = project_id
credentials.refresh(requests.Request())
if not hasattr(credentials, "service_account_email"):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, you are saying that this must be set in some situation? Can you add that to the documentation.

@dark0dave dark0dave force-pushed the master branch 3 times, most recently from db8f377 to e016da1 Compare February 22, 2024 16:46
@jschneier
Copy link
Owner

Any news on this? I am also trying to implement this. The current state of the PR looks sound. Is there anything missing that needs to be added/reviewed? Would love to help with any new asks if any. Currently our code is in MVP stage and we have a class that inherits from GoogleCloudStorage with the logic in this PR

Yeah I actually have been working on this one a bit myself but could very much use the help, especially for testing with Google Cloud. There are a few things:

  1. As mentioned in my comment this change breaks the parameters support for url
  2. We shouldn't bump the library required version for no reason
  3. The parameters passed in url() were changed, I'd like confirmation that this changes NOTHING about backwards compat
  4. I don't feel we have enough documentation about e.g when SA_EMAIL must be set, it'd be good to get some comprehensive testing using Google Cloud
  5. We should change the variable name to GA_SA_EMAIL to be consistent with sa_email

@yohannes15
Copy link

yohannes15 commented Jul 6, 2024

@jschneier what is the protocol here when someone has made PR already and I want to add some of the finishing touches. Do I fork dark0dave:master and PR to that branch after adding my changes. Or should I fork django-storages:master and add this logic and rest of implementation there as a new separate PR?

Btw prev comment was from my work account. Going to be deleting that comment and communicating using this.

@yohannes15
Copy link

yohannes15 commented Jul 6, 2024

1. As mentioned in my comment this change breaks the `parameters` support for `url`

2. We shouldn't bump the library required version for no reason

3. The parameters passed in `url()` were changed, I'd like confirmation that this changes NOTHING about backwards compat

4. I don't feel we have enough documentation about e.g when SA_EMAIL must be set, it'd be good to get some comprehensive testing using Google Cloud

5. We should change the variable name to `GA_SA_EMAIL` to be consistent with `sa_email`
  1. Agree the parameters support shouldn't be removed. I still see the value in it even with the current state of the PR. There are kwargs that aren't in the GoogleCloudStorage settings that are needed such as content_type/api_access_endpoint... I think I have a solution that is backwards compatible with that.
  2. Agree
  3. Agree
  4. According to the generate_signed_url functions docs and tests, my thinking is that the GS_SA_EMAIL (sa_email) will be what service_account_email would be if not explicility provided by the env (a basic example of this is running a development server locally using UserCredentials instead of ServiceAccount/ImpersonatedServiceAccount credentails. In that case the Credentails doesn't have a service_account_email attr. Looks like the recommended way to check if it is available is like in the code here, call the metadata server/refresh creds if not TokenState.FRESH. Check if current creds has a service_account_email attr (meaning a service account is being used for authorization). There might be other cases and that might be the thing we want to research. I think the GS_SA_EMAIL is basically a substitute for that to impersonate a service account in order to BlobSign. without that a service account key is explicitly needed to be passed
    Here is the source code i'm referring to:
  1. Agree

@yohannes15
Copy link

yohannes15 commented Jul 6, 2024

googleapis/google-auth-library-python#50

This issue which has been open for ages is really the concern I think here that is making current url() logic ugly. There are different complaints about why google doesn't automatically infer credentials when signing urls. googleapis/google-auth-library-python#50 (comment)

"bucket_bound_hostname": self.custom_endpoint,
params = {
"service_account_email": self.credentials.service_account_email,
"access_token": self.credentials.token,
Copy link

@yohannes15 yohannes15 Jul 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently self.credentials throws a attribute error here as well since self.credentials hasn't been changed. the credentials that were set in the client @Property doesn't flow over here. Might have to either set the self.credentials in there, or use self._client._credentials. 1st one seems ugly in setting a instance variable in an other property, 2nd option uses private attribute from Client.. There is probably a 3rd option but that might involve some additional clean up

@jschneier
Copy link
Owner

@jschneier what is the protocol here when someone has made PR already and I want to add some of the finishing touches. Do I fork dark0dave:master and PR to that branch after adding my changes. Or should I fork django-storages:master and add this logic and rest of implementation there as a new separate PR?

Generally a new PR. As the maintainer I can push to it but generally only do that for minor things.

For (1) above, I added tests in #1408 so that when this is rebased, they will break.

For the rest I am happy to accept the findings, just mentioning what I think are the issues with the current version & why it hasn't merged. I trust you to make sure it's all buttoned up in the end!

@jschneier
Copy link
Owner

Superseded by #1427.

@jschneier jschneier closed this Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

signed urls do not work in Cloud Run with django-storages
3 participants