Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zarr v3 #206

Closed
wants to merge 51 commits into from
Closed

Zarr v3 #206

wants to merge 51 commits into from

Conversation

normanrz
Copy link
Contributor

@normanrz normanrz commented Jul 20, 2023

This PR proposes to adopt the version 3 of the Zarr format for OME-Zarr.

Main changes:

  • Only Zarr v3 is allowed for upcoming versions of OME-Zarr.
  • Zarr v3 stores its metadata in zarr.json files. This is the same for arrays and groups. Attributes are now stored in a attributes object within the zarr.json files.
  • The chunk_key_encoding of the Zarr arrays are respected instead of mandating the / dimension separator. Basically all Zarr features are available in OME-Zarr as they get approved through the ZEP process.
  • Add a ome namespace within the zarr.json attributes.
  • Move version fields from multiscale, plate, well etc. up into the ome object.
  • Move the registration of labels from the labels group into the multiscale group.
  • Rename the document to OME-Zarr specification

This PR is currently in a draft status and builds upon the #138 PR by @bogovicj.
Check this link for a more convenient review: https://github.com/ome/ngff/pull/206/files/b92f540dc95440f2d6b7012185b09c2b862aa744..HEAD

bogovicj and others added 30 commits February 1, 2022 13:50
* define pixel center coordinate system
* add input/output_axes
* add transform inverse flag
* add affine transform type
* flesh out array space
* define array indexing
* add more transformation types
* start transformation details section and examples
* update example
* use "input" and "output" rather than '*Space' and '*Axes'
* reorder details
* clean up table
* add rotation
* details for sequence
* describe inverses
* wrap examples
* rephrase matrix storage
* change to "coordinates", removing "Field"
* change to "displacements", removing "Field"
* add details for transformation types
* (identity, inverseOf, bijection)
* describe inputAxes and outputAxes
* add new examples
* add mapIndex, mapAXis
* add examples
* affine stored as flat array only
* sequence does not have by-dimension behavior
* flesh out some examples
@github-actions
Copy link
Contributor

Automated Review URLs

@d-v-b
Copy link
Contributor

d-v-b commented Jul 20, 2023

  • Add a ome namespace within the zarr.json attributes.

  • Move version fields from multiscale, plate, well etc. up into the ome object.

Thank you for this!

@d-v-b d-v-b mentioned this pull request Jul 24, 2023
@normanrz normanrz mentioned this pull request Aug 8, 2023
@imagesc-bot
Copy link

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/adopt-zarr-v3-in-ome-zarr/84786/1

@joshmoore
Copy link
Member

After a quick read, @normanrz, here are my initial thoughts (where not simply a big 👍):

Add a ome namespace within the zarr.json attributes.
Move version fields from multiscale, plate, well etc. up into the ome object.

These two together make me wonder whether or not the key couldn't be something more sophisticated, like the version itself:

"https://ngff.openmicroscopy.org/0.5": {
   ...
}

It's somewhat non standard to use a more complicated key like that, but it's a value that we will have in memory anyway to load the schema files, etc.

Move the registration of labels from the labels group into the multiscale group.

The labels "object" was definitely poorly conceived originally and so improvements welcome, but I'll note there are a handful of open issues as well as integration with the upcoming SpatialData work that might be worth considering. If it becomes more of an issue, I'd suggest splitting this out of your proposal here (if possible).

Rename the document to OME-Zarr specification

I know this is a constant source of confusion/frustration, but I think we might want to reconsider this.

@normanrz
Copy link
Contributor Author

normanrz commented Oct 4, 2023

Add a ome namespace within the zarr.json attributes.
Move version fields from multiscale, plate, well etc. up into the ome object.

These two together make me wonder whether or not the key couldn't be something more sophisticated, like the version itself:

"https://ngff.openmicroscopy.org/0.5": {
   ...
}

It's somewhat non standard to use a more complicated key like that, but it's a value that we will have in memory anyway to load the schema files, etc.

Fine with me. It is a bit more verbose, but we could get rid of the additional "version" attribute.

Move the registration of labels from the labels group into the multiscale group.

The labels "object" was definitely poorly conceived originally and so improvements welcome, but I'll note there are a handful of open issues as well as integration with the upcoming SpatialData work that might be worth considering. If it becomes more of an issue, I'd suggest splitting this out of your proposal here (if possible).

Also, fine with me to split it out. But I really think label images in OME-Zarr need to be reworked.

Rename the document to OME-Zarr specification

I know this is a constant source of confusion/frustration, but I think we might want to reconsider this.

That is certainly up for debate. But my preference would be to go with OME-Zarr as title of the document, because that is what it is about.

### Array coordinate systems

Every array has a default coordinate system whose parameters need not be explicitly defined. Its name is the path to the array
in the container, its axes have `"type":"array"`, are unitless, and have default "name"s. The ith axis has `"name":"dim_i"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default names be those specified by the zarr v3 dimension_names property?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is kind of addressed in the paragraph above, but it still might be helpful to clarify here.

}
}
```
</div>

"bioformats2raw.layout" (transitional) {#bf2raw}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this section should be moved to the end since it seems like it may not be of interest to most ome-zarr and is not an ideal starting point for the specification given how specialized it is.

@@ -331,24 +338,1010 @@ Conforming readers:
- MAY ignore other groups or arrays under the root of the hierarchy.


"coordinateSystems" metadata {#coord-sys-md}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would help to clarify exactly where this is specified within the zarr attribute structure. The example could include the top-level "ome" key to clarify. If this isn't intended as a top-level member of the "ome" object, then perhaps it should be described in a later section, after describing its "parent" object, in order to make the specification clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be addressed in #138, which this PR is based on.

@normanrz normanrz mentioned this pull request Feb 15, 2024
@normanrz
Copy link
Contributor Author

Closing this in favor of #227. Discussion will continue there.

@normanrz normanrz closed this Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants