Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editions language feature: (pb.go).struct_tag #1619

Closed
bmon opened this issue Jun 3, 2024 · 11 comments
Closed

Editions language feature: (pb.go).struct_tag #1619

bmon opened this issue Jun 3, 2024 · 11 comments

Comments

@bmon
Copy link

bmon commented Jun 3, 2024

This request relates to the previous discussion on #52

Some of the software I maintain would significantly benefit from being able to set custom struct tags for some proto message types when compiled to go. Similar to the many voices in #52, we dream of being able to do this with protoc-gen-go. I noted one of the problems this feature posed was that the proto field meta would only be meaningful only for generated go code.

Lately I found out about protobuf editions and saw that there is explicit support for language-specific features. For example the string_type option for c++. https://protobuf.dev/editions/features/#string_type

I was wondering if the advent of this support for language specific features could possibly allow for a reconsideration of support for setting golang struct tags in protobuf files?

Describe the solution you'd like
Perhaps something like

edition = "2023";

import "google/protobuf/go_features.proto";

message Foo {
  string bar = 6;
  string baz = 7 [features.(pb.go).tags = 'my_tag_name:"my_tag_value"'];
}
@lfolger
Copy link
Contributor

lfolger commented Jun 11, 2024

Could you explain your use case in more detail?

@bmon
Copy link
Author

bmon commented Jun 13, 2024

Sure. I have two use-cases:

We have a gRPC interceptor which will automatically log request content under certain conditions (request failure, latency, etc). We very much like this approach because it's on by default for all our applications (a developer can't forget logging), and it saves us time. However, it's crucial for us that PII or otherwise sensitive information does not get logged. We use struct-tags added to the generated request messages to flag specific fields for masking or to be omitted entirely from logging.

For example, for a given method and request message:

service UserService {
    rpc ProvisionNewUser(ProvisionUserRequest) returns (User);
}

message ProvisionUserRequest {
    string email = 1 [features.(pb.go).tags = 'log:"mask=email"'];
    string org_id = 2;
    string name = 3 [features.(pb.go).tags = 'log:"mask=all"'];
    string password = 5 [features.(pb.go).tags = 'log:"-"'];
}

May produce a log looking something like:

{"msg":"UNKNOWN error while handling request", "grpc.request.content":{"email":"f**@b**.com", "org_id": "1234-5678", "name": "**********"}}

We have another, use case which is pretty similar, but stems from an internal auditing component. The system will automatically parse and record request data along with other metadata about the request, in order to produce a searchable audit-trail of actions users take while using our system.

Similar to the way I described the logging interceptor, it is on by default, and is intended to be highly automated so it is reliable but also does not add extra development burden. It has similar PII concerns to the example above, but also has additional requirements regarding flagging particular fields as identifiers, so different requests which are relevant to each other can be associated.


For both of the examples I gave above there isn't really a way to achieve the developer experience we want without changing the generated code. Our current best solution is to generate the protos, then have another application manipulate the generated code to add the struct tags.

I think it's worth noting that these are just the use-cases I have - and there are many, many more that others have raised in #52 including simplifing data access layers (db/sqlx/datastore tags), implementing custom request validation, and more.

@lfolger
Copy link
Contributor

lfolger commented Jun 13, 2024

For both of the examples I gave above there isn't really a way to achieve the developer experience we want without changing the generated code.

I don't think is true. You can use custom field/message option for this. This also allows you to annotate your proto once and extract the information in all languages not only Go and you can be sure that the information is consistent across languages. Please see https://protobuf.dev/programming-guides/proto3/#options and let me know if this works for you.

@bmon
Copy link
Author

bmon commented Jun 18, 2024

Thanks for the reply. I had a stab at implementing what you described, but I ran into some issues. We implement a wrapper around a logging implementation to support the masking, and I guess it would be possible to change our logger implementation to both scan for struct tags and proto reflect. However it had the consequences of our logger now having to type assert everything we log in order to attempt to protoreflect, in addition to scanning struct tags.

Unfortunately I can't think of a way to get the solution you described working for external packages that I don't implement, for example sqlx's db tags and cloud datastore's datastore tags.

Do you know if there's some way to workaround these issues? RE: the original request, would it be possible or appropriate for protoc-gen-go to add a language feature like the ability to set struct tags? If the maintainers are at all open to the idea I would be happy to try and implement it.

@lfolger
Copy link
Contributor

lfolger commented Jun 18, 2024

In general we moved away from struct tags in favor of protoreflect API. In the deprecated protobuf module we used struct tags to build limited reflection mechanisms. However, this turned out to be insufficient (complex options cannot be represented in struct tags) which is why we moved to the protoreflect API which we have now.

I think if you have other third-party packages that operate on your data, you should either make sure they understand protocol buffers or use an abstraction that you control (e.g. a struct type which you convert your generated protobuf type from and to). The problem is if you don't have an abstraction and these third-party libraries don't understand protobuf, we would need to add more and more features to protobuf to generated struct that are understood by these third-party tools. Today this might only be struct tags in the future these tools might require additional fields or additional interfaces to implement. This is not sustainable.

I want to emphasis that protocol buffers are a language interoperability tools and language specific features should be used sparingly or they might conflict with features that are added to the protobuf spec in the future.

In other words, for now the maintainers are not open to accept such a feature request.

@bmon
Copy link
Author

bmon commented Jun 21, 2024

In general we moved away from struct tags in favor of protoreflect API.

Thanks for elaborating on this. I was wondering if it would have been possible to implement the reflection via struct tags instead of methods, but I suspected the problems you described may have cropped up.

Today this might only be struct tags in the future these tools might require additional fields or additional interfaces to implement.

I think we could probably argue that struct tags in go are quite stable and cover almost all use-cases. I understand, though, that what you're describing extends beyond go. It would be unfortunate for proto definitions to be filled with varying language specific definitions of the "same" metadata.

At the same time, I'm left with this desire for protoc-gen-go to support more flexible interoperability with the rest of the go ecosystem. I had hoped editions maybe provided a path to providing something along those lines. Thank you for taking the time to highlight the issues that remain.

@lfolger
Copy link
Contributor

lfolger commented Jun 27, 2024

Also I realized that edition features are not designed for this. While they can certainly be (ab)used for this it is not the intention to use them this way.

From the documentation:

Features are designed to provide a mechanism for ratcheting down bad behavior over time, on edition boundaries. While the timeline for actually removing a feature may be years (or decades) in the future, the desired goal of any feature should be eventual removal....

@bmon
Copy link
Author

bmon commented Jun 27, 2024

Unfortunately I can't open your link - it appears to be google internal. The excerpt you posted makes sense though. Would love to be able to read more of that documentation, if it is able to be published.

@lfolger
Copy link
Contributor

lfolger commented Jun 27, 2024

Oops I thought it was the public documentation. Sorry about that. I expect this to reach the public docs eventually. I'll keep the posted snippet for now but removed the link.

@lfolger
Copy link
Contributor

lfolger commented Jul 5, 2024

FYI: the documentation is now available publicly: https://protobuf.dev/editions/implementation/

@aktau
Copy link

aktau commented Jul 9, 2024

Will close this now as it seems to have reached a conclusion. If anything is left to do or say, please re-open.

@aktau aktau closed this as completed Jul 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants