added motivation for dropping Zarr v2 support

ome · Feb 21, 2024 · db2ae2f
1 parent b336847
Showing 1 changed file with 31 additions and 19 deletions.
diff --git a/rfc/2/index.md b/rfc/2/index.md
@@ -63,11 +63,41 @@ Support for other languages is under active development, including C, Java and P
 Libraries will likely prioritize support for v3 over previous versions in the near future.
 OME-Zarr should therefore adopt the new version for future-proofing.
 
+### Sharding
+
+One of the features that become available through the adoption of Zarr v3 is sharding.
+Sharding provides a mechanism where multiple chunks can be stored in a single file/object.
+This can greatly reduce the number of files (i.e. inodes) or objects that are required to store large OME-Zarr images.
+Storing many files/objects can be prohibitive on several storage backends.
+Therefore, sharding (or a similar solution) is a requirement to scale OME-Zarr to petascale images.
+
+The sharding mechanism of Zarr v3 is specified in the [sharding codec](https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/v1.0.html).
+
+![Illustration of a sharded array](https://zarr-specs.readthedocs.io/en/latest/_images/sharding.png)
+
+Each shard contains an index that references the inner chunks stored within that shard.
+Inner chunks are compressed individually, if such a codec is specified.
+Implementations can read inner chunks individually.
+Depending on the choice of codecs and the underlying storage backends, it may be possible to write inner chunks individually.
+However, in the general case, writing is limited to entire shards.
+
 ## Proposal
 
 This RFC proposes to adopt version 3 of the Zarr format for OME-Zarr.
 Version 2 will no longer be supported.
 
+The motivation for making this hard cut is to reduce the burden of complexity for implementations.
+Currently, many Zarr library implementations support both versions.
+However, in the future they might deprecate support for version 2 or deprioritize it in terms of features and performance.
+Additionally, there are OME-Zarr implementations that have their own integrated Zarr stack.
+With this hard cut, implementations that only support OME-Zarr versions > 0.5 (TODO: update assigned version number) will not need to implement Zarr version 2 as well.
+
+From an OME-Zarr user perspective, the hard cut also makes things simpler: ≤ 0.5 => Zarr version 2 and > 0.5 => Zarr version 3 (TODO: update assigned version number).
+If users wish to upgrade their data from one OME-Zarr version to another, it would be easy to also migrate the core Zarr metadata to version 3.
+This is a fairly cheap operation because only JSON metadata files are touched.
+Zarr version 2 and 3 metadata could even live side-by-side in the same hierarchy.
+There are [scripts available](https://github.com/scalableminds/zarrita/blob/8155761/zarrita/array_v2.py#L452-L559) that can migrate the metadata automatically.
+
 ### Notable changes in Zarr v3
 
 There are a few notable changes that Zarr v3 brings for OME-Zarr:
@@ -100,24 +130,6 @@ While the adoption of Zarr v3 does not strictly require changes to the OME-Zarr
 
 Finally, this proposal changes the title of the OME-Zarr specification document to "OME-Zarr specification".
 
-### Sharding
-
-One of the features that become available through the adoption of Zarr v3 is sharding.
-Sharding provides a mechanism where multiple chunks can be stored in a single file/object.
-This can greatly reduce the number of files (i.e. inodes) or objects that are required to store large OME-Zarr images.
-Storing many files/objects can be prohibitive on several storage backends.
-Therefore, sharding (or similar solutions) are a requirement to scale OME-Zarr to peta-scale images.
-
-The sharding mechanism of Zarr v3 is specified in the [sharding codec](https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/v1.0.html).
-
-![Illustration of a sharded array](https://zarr-specs.readthedocs.io/en/latest/_images/sharding.png)
-
-Each shard contains an index that contains references to the inner chunks that are stored within a shard.
-Inner chunks are compressed individually, if such a codec is specified.
-Implementations can read inner chunks individually.
-Depending on the choice of codecs and the underlying storage backends, it may be possible to write inner chunks individually.
-However, in the general case, writing is limited to entire shards.
-
 ## Requirements
 
 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
@@ -200,7 +212,7 @@ It is RECOMMENDED that implementations of OME-Zarr specify the version of the OM
 It is RECOMMENDED that implementations of OME-Zarr that support both v2 and v3-based OME-Zarr versions auto-detect the underlying Zarr version.
 
 While the metadata of Zarr v3 is not backwards compatible, the chunk data is largely backwards compatible, only depending on compressor configuration.
-[There are scripts available](https://github.com/scalableminds/zarrita/blob/async/zarrita/array_v2.py#L452-L559) to migrate Zarr v2 metadata to Zarr v3.
+[There are scripts available](https://github.com/scalableminds/zarrita/blob/8155761/zarrita/array_v2.py#L452-L559) to migrate Zarr v2 metadata to Zarr v3.
 This is generally a light-weight operation.
 Zarr v3 and v2 metadata can exist side-by-side within a Zarr hierarchy.
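
As a rough illustration of the auto-detection recommended in the last hunk, the following sketch checks which metadata documents are present at a hierarchy node. It relies on the file names used by the two formats (`zarr.json` for Zarr v3, `.zarray`/`.zgroup` for Zarr v2). The function name `detect_zarr_versions` is hypothetical, and the sketch assumes a hierarchy on a local filesystem; it is not taken from the RFC or from any library.

```python
from pathlib import Path

# Metadata document names defined by the two Zarr formats.
V2_MARKERS = (".zarray", ".zgroup")
V3_MARKER = "zarr.json"


def detect_zarr_versions(node: Path) -> set[int]:
    """Return the Zarr format versions whose metadata is present at `node`.

    Hypothetical helper: assumes `node` is a directory on a local filesystem.
    Both versions may be reported, since v2 and v3 metadata can coexist.
    """
    versions: set[int] = set()
    if (node / V3_MARKER).is_file():
        versions.add(3)
    if any((node / name).is_file() for name in V2_MARKERS):
        versions.add(2)
    return versions
```

Returning a set rather than a single number keeps the side-by-side case explicit, so a caller can decide which version to prefer when both are found.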