Validate bytes based on ser_json_bytes #1308
base: main
please review
CC @jcharum
I'm also interested in accepting both standard and URL-safe base64 encoding. If this PR is acceptable, I plan to send a separate one for that: d8e-ai/pydantic-core@validate-base64...d8e-ai:pydantic-core:validate-base64-any
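For illustration only, here is a minimal Python sketch of what accepting both alphabets could look like. The function name `decode_base64_any` is hypothetical and this is not the code from either branch; it assumes that mapping the URL-safe characters onto the standard alphabet and restoring stripped padding is acceptable.

```python
import base64
import binascii


def decode_base64_any(text: str) -> bytes:
    """Decode base64 text written in either the standard or the URL-safe alphabet.

    Hypothetical helper for illustration; not part of pydantic-core.
    """
    # Map the URL-safe characters onto the standard alphabet so that a single
    # strict decoder handles both spellings.
    normalized = text.replace("-", "+").replace("_", "/")
    # Restore any padding that was stripped from the input.
    normalized += "=" * (-len(normalized) % 4)
    try:
        return base64.b64decode(normalized, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid base64 input: {exc}") from exc
```

One consequence of this approach is that input mixing both alphabets is also accepted; a stricter variant could reject that.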
4c44225 to f8addc1
CodSpeed Performance Report
Merging #1308 will not alter performance
Thanks for the PR! I fully support the use case and think this makes sense. That said, I worry about silently breaking user code by changing the meaning of the existing option in this way. How about adding a new flag, e.g. Or we could add
Either option works for me. I don't currently know of a use case where someone would want base64 encoding only and not decoding, so I'd lean towards the new bidirectional encoding flag. But maybe that use case exists, or maybe there's a pattern in other Pydantic config to follow (that I'm not familiar with)? I'm happy to make changes for whichever you recommend! (And please feel free to make changes yourself, too, if you prefer.)
Having discussed with @sydney-runkle and @samuelcolvin, I think we would prefer to go with a new I agree that the bidirectional encoding flag is probably what most people actually need, so if you strongly want that I'd probably accept it; with the individual flags it should be an easy layer on top which just sets both of them (and we can error if it is set as well as either individual flag). It's a shame to have the complexity of both forms, but hey, it's where we've ended up. It's possible we could consider deprecating the individual flags in V3.
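To make the "easy layer on top" idea concrete, here is a rough Python sketch. Every name in it (`resolve_bytes_modes`, `ser_mode`, `val_mode`, `both_mode`) is hypothetical, and the `"utf8"` default is an assumption; it only illustrates a combined flag that sets both individual flags and errors when it is given alongside either of them.

```python
from typing import Optional, Tuple

# Assumed default mode for this sketch only.
DEFAULT_MODE = "utf8"


def resolve_bytes_modes(
    ser_mode: Optional[str] = None,
    val_mode: Optional[str] = None,
    both_mode: Optional[str] = None,
) -> Tuple[str, str]:
    """Return the (serialization, validation) bytes modes to use.

    All parameter names are hypothetical stand-ins for config keys.
    """
    if both_mode is not None:
        # The combined flag conflicts with either individual flag.
        if ser_mode is not None or val_mode is not None:
            raise ValueError(
                "set either the combined flag or the individual flags, not both"
            )
        return both_mode, both_mode
    # Otherwise each direction falls back to the default independently.
    return ser_mode or DEFAULT_MODE, val_mode or DEFAULT_MODE
```

With this shape, `resolve_bytes_modes(both_mode="base64")` configures both directions, while setting only one individual flag leaves the other direction at its default.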
6e7fc01 to f7ca32f
f7ca32f to 1e71ab7
Sounds good, I've switched to a new I'm guessing there's some more work to do with the new config key (docs, Pydantic's |
Thanks, looks good! I'd like to see a couple of changes for consistency with the rest of the library...
As for updating the Python code in pydantic, yes that's a good idea. I'd suggest opening a PR already; you can test it all locally and then add pytest.xfail markers in the PR, which we can then remove when this support gets released in pydantic-core.
tests/test_json.py
Outdated
with pytest.raises(ValueError):
    v.validate_json('"wrong!"')
I think we probably need to expect a ValidationError here, where the input was bytes but the wrong format. It might mean adding a new error type, e.g. bytes_format.
Done. (I think; let me know if I did the right thing with the error type.)
src/validators/config.rs
Outdated
        Ok(bytes) => Ok(EitherBytes::from(bytes)),
        Err(err) => Err(PyValueError::new_err(format!("Base64 decode error: {err}"))),
    },
    BytesMode::Hex => Err(PyValueError::new_err("Hex deserialization is not supported")),
I think it would be very desirable to add hex support at the same time as adding this flag. The hex crate is a common standard in the Rust ecosystem and we could use it trivially.
Done.
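As a rough Python analogue of the per-mode dispatch in the snippet above, with the hex arm filled in, something like the following could serve as a mental model. The function name `decode_bytes` and the mode strings are assumptions for illustration, the standard base64 alphabet is assumed, and this is not the Rust implementation from the PR.

```python
import base64
import binascii


def decode_bytes(text: str, mode: str) -> bytes:
    """Turn a JSON string into bytes according to the configured mode.

    Hypothetical helper; mode names are assumptions for this sketch.
    """
    if mode == "utf8":
        # No transformation: the string's UTF-8 encoding is the value.
        return text.encode("utf-8")
    if mode == "base64":
        try:
            return base64.b64decode(text, validate=True)
        except binascii.Error as exc:
            raise ValueError(f"Base64 decode error: {exc}") from exc
    if mode == "hex":
        try:
            return bytes.fromhex(text)
        except ValueError as exc:
            raise ValueError(f"Hex decode error: {exc}") from exc
    raise ValueError(f"unknown bytes mode: {mode!r}")
```

Note that `bytes.fromhex` tolerates whitespace between byte pairs, which a stricter decoder might reject.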
I've created pydantic/pydantic#9770. I'm guessing it needs to wait until after the pydantic-core upgrade, otherwise a user may see this new option and be confused about why it doesn't work?
I've created pydantic/pydantic#9772 to address the test-pydantic-integration failure.
Well, I think I'm stuck 😅. pydantic/pydantic#9772 has test failures saying the new documented error type isn't in pydantic-core, and this PR has a CI failure saying the new error type isn't documented there.
(I forgot to publish these comments earlier.)
Change Summary
ser_json_bytes transforms values (with base64 encoding) during serialization. But validation doesn't do a complementary base64 decode, so a serialization round-trip into the same model type yields an unequal object.
Related issue number
None for this directly. Other users have mentioned base64 decoding, though: pydantic/pydantic#7000 (comment)
Checklist
pydantic-core (except for expected changes)
Selected Reviewer: @davidhewitt
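The round-trip mismatch described in the Change Summary can be reproduced in miniature with only the Python standard library. This sketch does not use pydantic or pydantic-core; it assumes the standard base64 alphabet and simply contrasts reading the serialized string back as UTF-8 text with applying the complementary decode.

```python
import base64
import json

original = b"\x00\xffbinary"

# Serialization side: the bytes are base64-encoded into a JSON string.
payload = json.dumps({"data": base64.b64encode(original).decode("ascii")})

# Validation side without a complementary decode: the JSON string is taken
# as plain text, so the result is the base64 text itself, not the original.
naive = json.loads(payload)["data"].encode("utf-8")

# Validation side with the complementary decode: the round trip is lossless.
decoded = base64.b64decode(json.loads(payload)["data"], validate=True)

print(naive == original)    # the undecoded value differs from the input
print(decoded == original)  # the decoded value matches the input
```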