-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement pydantic models as dataclasses #210
Implement pydantic models as dataclasses #210
Conversation
This removes our pydantic dependency by reimplementing them with dataclasses. There are a few breaking changes: 1. The classes won't automatically cast the argumetns to the declared type. IMO, that's the preferable behavior. Some backwards compatability shims have been added for np.dtype, but perhpas we want to remove that too. 2. The models won't have any of the methods they previously inherited from pydantic.BaseModel. This is probably good for user-facing objects, we now have full control over the public API. 3. We had to reorder some of the fields on ZArray, since dataclasses is stricter about positional arguments. I've aligned the order with `zarr.create`. The tests did need to be updated in a few places (we compare some strings, and the order changed, and we don't automatically cast lists to tuples in the init for shape and size).
virtualizarr/zarr.py
Outdated
order: Optional[Literal["C"] | Literal["F"]] = None, | ||
shape: Optional[tuple[int, ...]] = None, | ||
zarr_format: Optional[Literal[2] | Literal[3]] = None, | ||
**kwargs: Any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want the list of fields that can be replaced here in the signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As opposed to letting users call dataclasses.replace? I think ZArray
shouldn't have fields that aren't replaceable, are there any fields like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if dataclasses.replace
exists already then explicitly writing the list of fields isn't a good enough reason to override the default method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no wait dataclasses.replace
is a module-level function, not a method on the dataclass type. In that case I change my mind - we should list out the arguments and call dataclasses.replace
from within the ZArray.replace
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ZArray shouldn't have fields that aren't replaceable, are there any fields like that?
Just to be clear, the behavior here (and on main
) is to return a new instance of ZArray
with the same values as the old instance, but new values specified in the keyword arguments.
we should list out the arguments
Sounds good. I think we'll be able to do that (we'll need to remember to keep the two signatures in step though).
This will cause a conflict with #206, so waiting on that to go first. |
@@ -94,7 +94,7 @@ def test_accessor_to_kerchunk_dict(self): | |||
"refs": { | |||
".zgroup": '{"zarr_format":2}', | |||
".zattrs": "{}", | |||
"a/.zarray": '{"chunks":[2,3],"compressor":null,"dtype":"<i8","fill_value":null,"filters":null,"order":"C","shape":[2,3],"zarr_format":2}', | |||
"a/.zarray": '{"shape":[2,3],"chunks":[2,3],"dtype":"<i8","fill_value":null,"order":"C","compressor":null,"filters":null,"zarr_format":2}', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So order of fields in the kerchunk references matters now? Did it matter before? This should be fine but it's also the sort of thing that kerchunk might not be consistent about...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it still didn't matter before and still doesn't, as far as using Kerhcunk goes :) This reason this matters for the test is because we're doing string comparison on the "a/.zarray"
field. We could also parse that JSON into a dict before doing the comparision, and I wouldn't have needed to update the expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great @TomAugspurger !
34d493a
to
ea15b8e
Compare
Planning to merge this tonight. |
This removes our pydantic dependency by reimplementing them with dataclasses. There are a few breaking changes:
zarr.create
.The tests did need to be updated in a few places (we compare some strings, and the order changed, and we don't automatically cast lists to tuples in the init for shape and size).
__post_init__
vs pydantic@model_validator(mode="after")
#204docs/releases.rst