bioformats2raw.layout #112

Merged
merged 32 commits into from
Sep 28, 2022
d7c0556
First draft of bioformats2raw.layout
joshmoore Apr 7, 2022
8ee02d1
Add layout example
joshmoore Apr 8, 2022
ee15455
Improve wording of "SHOULD parse multiple images", thanks to Ilan
joshmoore Apr 20, 2022
c8e98e4
Clarify contents of METADATA.ome.xml, thanks to Melissa
joshmoore Apr 20, 2022
6403d88
Add schema & test for bf2raw metadata
joshmoore Apr 21, 2022
890e8a2
Add missing schema file
joshmoore Apr 21, 2022
b099281
Merge 'origin/main' into bf2raw
joshmoore May 3, 2022
2b67a36
Add applicable versions statement
joshmoore May 11, 2022
31d34c3
Update text with suggestions
joshmoore May 30, 2022
3173303
Merge 'origin/main' into bf2raw
joshmoore May 30, 2022
d44a066
Add bf2raw examples config
joshmoore May 30, 2022
7574d22
Add test to find missing configs
joshmoore May 30, 2022
ff5d7ec
Split subitems to pass linting
joshmoore May 30, 2022
e2119e5
Add graphical layout representation
joshmoore May 30, 2022
4fcc045
Fix missing whitespace
joshmoore May 30, 2022
57acc23
Add "series" attribute in "OME" group under bioformats2raw.layout
melissalinkert Jul 5, 2022
5eca169
Update the MUST/SHOULD semantics
joshmoore Jul 18, 2022
cf4e04c
Fix doubly indented bullets
joshmoore Sep 14, 2022
89c322d
Make minimum spec a link
joshmoore Sep 14, 2022
3691437
Add no-toc sections
joshmoore Sep 14, 2022
499caee
Re-arrange and add more text
joshmoore Sep 14, 2022
c7582e5
Make changes based on feedback from Will
joshmoore Sep 15, 2022
399b70b
Add bf2raw plate example
joshmoore Sep 15, 2022
740a11b
Add schema for ome series
joshmoore Sep 15, 2022
1f91482
Apply suggestions from code review
joshmoore Sep 19, 2022
861227b
Apply to v0.4 only
joshmoore Sep 22, 2022
8e612ef
Re-iterate plate precedence without OME/METADATA.ome.xml
joshmoore Sep 22, 2022
a0919be
Backport latest/bf2raw to 0.4
joshmoore Sep 22, 2022
5971927
Re-word the 'series' section
joshmoore Sep 22, 2022
88fd042
Make plate/series link a "SHOULD"
joshmoore Sep 22, 2022
7b7c43f
Add 'transitional' to 'omero' spec
joshmoore Sep 26, 2022
106c301
Add 0.4.1 changelog
joshmoore Sep 26, 2022
3 changes: 3 additions & 0 deletions 0.4/examples/bf2raw/.config.json
@@ -0,0 +1,3 @@
{
  "schema": "schemas/bf2raw.schema"
}
3 changes: 3 additions & 0 deletions 0.4/examples/bf2raw/image.json
@@ -0,0 +1,3 @@
{
  "bioformats2raw.layout" : 3
}
22 changes: 22 additions & 0 deletions 0.4/examples/bf2raw/plate.json
@@ -0,0 +1,22 @@
{
  "bioformats2raw.layout" : 3,
  "plate" : {
    "columns" : [ {
      "name" : "1"
    } ],
    "name" : "Plate Name 0",
    "wells" : [ {
      "path" : "A/1",
      "rowIndex" : 0,
      "columnIndex" : 0
    } ],
    "field_count" : 1,
    "rows" : [ {
      "name" : "A"
    } ],
    "acquisitions" : [ {
      "id" : 0
    } ],
    "version" : "0.4"
  }
}
3 changes: 3 additions & 0 deletions 0.4/examples/ome/.config.json
@@ -0,0 +1,3 @@
{
  "schema": "schemas/ome.schema"
}
3 changes: 3 additions & 0 deletions 0.4/examples/ome/series-2.json
@@ -0,0 +1,3 @@
{
  "series" : [ "0", "1" ]
}
98 changes: 95 additions & 3 deletions 0.4/index.bs
@@ -99,6 +99,14 @@ The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL
“RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in
[RFC 2119](https://tools.ietf.org/html/rfc2119).

<p>
<dfn>Transitional</dfn> metadata is added to the specification with the
intention of removing it in the future. Implementations may be expected (MUST) or
encouraged (SHOULD) to support the reading of the data, but writing will usually
be optional (MAY). Examples of transitional metadata include custom additions by
implementations that are later submitted as a formal specification. (See [[#bf2raw]])
</p>

Some of the JSON examples in this document include comments. These are for
clarity only; comments MUST NOT be included in JSON objects.

@@ -240,6 +248,85 @@ keys as specified below for discovering certain types of data, especially images

If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data.

"bioformats2raw.layout" (transitional) {#bf2raw}
------------------------------------------------

[=Transitional=] "bioformats2raw.layout" metadata identifies a group which implicitly describes a series of images.
The need for such a collection stems from the common "multi-image file" scenario in microscopy. Parsers like Bio-Formats
define a strict, stable ordering of the images in a single container, which other tools can use to refer to them.

To capture that ordering within an OME-NGFF dataset, `bioformats2raw` internally introduced a wrapping layer.
The bioformats2raw layout has been added to v0.4 as a transitional specification to describe filesets that already exist
in the wild. An upcoming NGFF specification will replace this layout with explicit metadata.

<h4 id="bf2raw-layout" class="no-toc">Layout</h4>

Typical Zarr layout produced by running `bioformats2raw` on a fileset that contains more than one image (series > 1):

<pre>
series.ome.zarr               # One converted fileset from bioformats2raw
├── .zgroup
├── .zattrs                   # Contains "bioformats2raw.layout" metadata
├── OME                       # Special group for containing OME metadata
│   ├── .zgroup
│   ├── .zattrs               # Contains "series" metadata
│   └── METADATA.ome.xml      # OME-XML file stored within the Zarr fileset
├── 0                         # First image in the collection
├── 1                         # Second image in the collection
└── ...
</pre>
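
As a rough illustration of the layout above, the following sketch writes only the group metadata files for a two-image fileset using the Python standard library. The helper name `write_bf2raw_skeleton` is hypothetical, the image groups are left empty (real ones would carry "multiscales" metadata and arrays), `METADATA.ome.xml` is omitted, and the `.zgroup` content assumes Zarr v2.

```python
import json
import tempfile
from pathlib import Path


def write_bf2raw_skeleton(root: Path, n_images: int) -> None:
    """Write only the group metadata of a bioformats2raw-style hierarchy."""

    def make_group(path: Path, attrs: dict) -> None:
        path.mkdir(parents=True, exist_ok=True)
        # Zarr v2 marks a directory as a group with a .zgroup file.
        (path / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
        (path / ".zattrs").write_text(json.dumps(attrs))

    # Top-level group carries the layout marker.
    make_group(root, {"bioformats2raw.layout": 3})
    # The OME group lists the image groups in series order.
    make_group(root / "OME", {"series": [str(i) for i in range(n_images)]})
    for i in range(n_images):
        # Placeholder image groups; real ones hold "multiscales" metadata.
        make_group(root / str(i), {})


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "series.ome.zarr"
    write_bf2raw_skeleton(root, n_images=2)
    # Collect every created path relative to the root of the fileset.
    layout = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))
    print(layout)
```

The printed list should contain the same entries as the tree above, minus the OME-XML file and the array data.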

<h4 id="bf2raw-attributes" class="no-toc">Attributes</h4>

The top-level `.zattrs` file MUST contain the `bioformats2raw.layout` key:
<pre class=include-code>
path: examples/bf2raw/image.json
highlight: json
</pre>

If the top-level group represents a plate, the `bioformats2raw.layout` metadata will be present, but
the "plate" key MUST also be present and takes precedence; parsing of such datasets should follow [[#plate-md]]. It is not
possible to mix collections of images with plates at present.

<pre class=include-code>
path: examples/bf2raw/plate.json
highlight: json
</pre>

The `.zattrs` file within the OME group MAY contain the "series" key:

<pre class=include-code>
path: examples/ome/series-2.json
highlight: json
</pre>

<h4 id="bf2raw-details" class="no-toc">Details</h4>

Conforming groups:

- MUST have the value "3" for the "bioformats2raw.layout" key in their `.zattrs` metadata at the top of the hierarchy;
- SHOULD have OME metadata representing the entire collection of images in a file named "OME/METADATA.ome.xml" which:
  - MUST adhere to the OME-XML specification but
  - MUST use `<MetadataOnly/>` elements as opposed to `<BinData/>`, `<BinaryOnly/>` or `<TiffData/>`;
  - MAY make use of the [minimum specification](https://docs.openmicroscopy.org/ome-model/6.2.2/specifications/minimum.html).
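
The `<MetadataOnly/>` requirement can be checked mechanically. The sketch below is one possible approach using only the standard library; the function names `forbidden_elements` and `image_ids` and the sample document are illustrative assumptions, and the sample is a reduced fragment rather than a schema-valid OME-XML file.

```python
import xml.etree.ElementTree as ET

# Elements that the text above rules out in favour of <MetadataOnly/>.
FORBIDDEN = {"BinData", "BinaryOnly", "TiffData"}


def local_name(tag: str) -> str:
    """Strip an XML namespace such as '{http://...}Image' down to 'Image'."""
    return tag.rsplit("}", 1)[-1]


def forbidden_elements(xml_text: str) -> list:
    """Return the sorted names of forbidden elements found in the document."""
    root = ET.fromstring(xml_text)
    names = {local_name(el.tag) for el in root.iter()}
    return sorted(FORBIDDEN & names)


def image_ids(xml_text: str) -> list:
    """Return the ID of every Image element, in document order."""
    root = ET.fromstring(xml_text)
    return [el.get("ID") for el in root.iter() if local_name(el.tag) == "Image"]


# Illustrative fragment only; required Pixels attributes are left out.
SAMPLE = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0"><Pixels ID="Pixels:0"><MetadataOnly/></Pixels></Image>
  <Image ID="Image:1"><Pixels ID="Pixels:1"><MetadataOnly/></Pixels></Image>
</OME>"""

print(forbidden_elements(SAMPLE))
print(image_ids(SAMPLE))
```

The document-order list of image IDs is what the ordering of image groups is compared against.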

Additionally, the Zarr group for each image is located as follows:

- If "plate" metadata is present, images MUST be located at the defined location.
  - Matching "series" metadata (as described next) SHOULD be provided for tools that are unaware of the "plate" specification.
- If the "OME" Zarr group exists, it:
  - MAY contain a "series" attribute. If so:
    - "series" MUST be a list of string objects, each of which is a path to an image group.
    - The order of the paths MUST match the order of the "Image" elements in "OME/METADATA.ome.xml" if provided.
- If the "series" attribute does not exist and no "plate" is present:
  - separate "multiscales" images MUST be stored in consecutively numbered groups starting from 0 (i.e. "0/", "1/", "2/", "3/", ...).
- Every "multiscales" group MUST represent exactly one OME-XML "Image" in the same order as either the series index or the group numbers.

Conforming readers:
- SHOULD make users aware of the presence of more than one image (i.e. SHOULD NOT default to only opening the first image);
- MAY use the "series" attribute in the "OME" group to determine a list of valid groups to display;
- MAY choose to show all images within the collection or offer the user a choice of images, as with <dfn export="true"><abbr title="High-content screening">HCS</abbr></dfn> plates;
- MAY ignore other groups or arrays under the root of the hierarchy.
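
The lookup order described above could be sketched for a reader as follows. This is a minimal illustration, not a reference implementation: the function name `resolve_images` and its arguments are hypothetical, the attributes are assumed to be already parsed into dictionaries, and for plates only the well paths are returned, with further resolution left to the plate specification.

```python
def resolve_images(root_attrs, ome_attrs, child_groups):
    """Return (kind, paths) following the precedence plate > series > numbered.

    root_attrs   -- parsed top-level .zattrs (dict)
    ome_attrs    -- parsed OME/.zattrs (dict), or None if the OME group is absent
    child_groups -- names of the groups directly under the root
    """
    if root_attrs.get("bioformats2raw.layout") != 3:
        raise ValueError("not a bioformats2raw.layout 3 hierarchy")

    # 1. "plate" metadata takes precedence over everything else.
    if "plate" in root_attrs:
        wells = root_attrs["plate"].get("wells", [])
        return "plate", [well["path"] for well in wells]

    # 2. An explicit "series" list in the OME group names the image groups.
    if ome_attrs is not None and "series" in ome_attrs:
        series = ome_attrs["series"]
        if not all(isinstance(path, str) for path in series):
            raise ValueError('"series" must be a list of strings')
        return "series", list(series)

    # 3. Otherwise fall back to consecutively numbered groups starting at 0.
    names = set(child_groups)
    paths = []
    index = 0
    while str(index) in names:
        paths.append(str(index))
        index += 1
    return "numbered", paths


print(resolve_images({"bioformats2raw.layout": 3}, {"series": ["0", "1"]}, ["OME", "0", "1"]))
```

Returning every resolved path, rather than only the first, leaves it to the caller to present the full collection or offer a choice.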

"coordinateTransformations" metadata {#trafo-md}
-------------------------------------
@@ -315,10 +402,10 @@ if not datasets:
datasets = [x["path"] for x in multiscales[0]["datasets"]]
```

"omero" metadata {#omero-md}
----------------------------
"omero" metadata (transitional) {#omero-md}
-------------------------------------------

Information specific to the channels of an image and how to render it
[=Transitional=] information specific to the channels of an image and how to render it
can be found under the "omero" key in the group-level metadata:

```json
@@ -607,6 +694,11 @@ Version History {#history}
<td>Description</td>
</tr>
</thead>
<tr>
<td>0.4.1</td>
<td>2022-09-26</td>
<td>transitional metadata for image collections ("bioformats2raw.layout")</td>
</tr>
<tr>
<td>0.4.0</td>
<td>2022-02-08</td>
14 changes: 14 additions & 0 deletions 0.4/schemas/bf2raw.schema
@@ -0,0 +1,14 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://ngff.openmicroscopy.org/latest/schemas/bf2raw.schema",
  "title": "NGFF container produced by bioformats2raw",
  "description": "JSON from OME-NGFF .zattrs",
  "type": "object",
  "properties": {
    "bioformats2raw.layout": {
      "description": "The top-level identifier metadata added by bioformats2raw",
      "type": "number",
      "enum": [3]
    }
  }
}
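
For readers without a JSON Schema validator at hand, the constraint in this schema can be approximated in a few lines. The sketch below is an assumption-laden stand-in, not a replacement for real schema validation; the function name `has_bf2raw_layout` is hypothetical. The schema as shown lists the property without a `required` keyword, so the sketch follows the specification text and treats a missing key as non-conforming.

```python
import json


def has_bf2raw_layout(zattrs_text: str) -> bool:
    """Check that top-level .zattrs JSON carries "bioformats2raw.layout": 3."""
    attrs = json.loads(zattrs_text)
    if not isinstance(attrs, dict):
        return False
    value = attrs.get("bioformats2raw.layout")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    # The schema restricts the number to the single value 3.
    return value == 3


print(has_bf2raw_layout('{"bioformats2raw.layout": 3}'))
```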
17 changes: 17 additions & 0 deletions 0.4/schemas/ome.schema
@@ -0,0 +1,17 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://ngff.openmicroscopy.org/latest/schemas/ome.schema",
  "title": "NGFF group produced by bioformats2raw to contain OME metadata",
  "description": "JSON from OME-NGFF OME/.zattrs linked to an OME-XML file",
  "type": "object",
  "properties": {
    "series": {
      "description": "An array of the same length and the same order as the images defined in the OME-XML",
      "type": "array",
      "items": {
        "type": "string"
      },
      "minContains": 1
    }
  }
}
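
A hand-written check of the "series" attribute might look like the sketch below. The function name `validate_series` and the optional `image_count` argument (the number of "Image" elements in the OME-XML, if known) are hypothetical. In JSON Schema 2020-12, `minContains` only has an effect alongside `contains`, so the sketch tests for an empty list explicitly, on the assumption that a non-empty list is what `"minContains": 1` intends.

```python
def validate_series(ome_attrs, image_count=None):
    """Return a list of problems with the "series" attribute (empty if none)."""
    series = ome_attrs.get("series")
    if series is None:
        # The attribute is optional, so its absence is not a problem.
        return []
    if not isinstance(series, list):
        return ['"series" must be a list']

    problems = []
    if not series:
        # Assumed intent of "minContains": 1 in the schema.
        problems.append('"series" is empty')
    for index, path in enumerate(series):
        if not isinstance(path, str):
            problems.append(f"series[{index}] is not a string")
    # The schema description asks for the same length as the OME-XML images.
    if image_count is not None and len(series) != image_count:
        problems.append(f"expected {image_count} entries, found {len(series)}")
    return problems


print(validate_series({"series": ["0", "1"]}, image_count=2))
```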
3 changes: 3 additions & 0 deletions latest/examples/bf2raw/.config.json
@@ -0,0 +1,3 @@
{
  "schema": "schemas/bf2raw.schema"
}
3 changes: 3 additions & 0 deletions latest/examples/bf2raw/image.json
@@ -0,0 +1,3 @@
{
  "bioformats2raw.layout" : 3
}
22 changes: 22 additions & 0 deletions latest/examples/bf2raw/plate.json
@@ -0,0 +1,22 @@
{
  "bioformats2raw.layout" : 3,
  "plate" : {
    "columns" : [ {
      "name" : "1"
    } ],
    "name" : "Plate Name 0",
    "wells" : [ {
      "path" : "A/1",
      "rowIndex" : 0,
      "columnIndex" : 0
    } ],
    "field_count" : 1,
    "rows" : [ {
      "name" : "A"
    } ],
    "acquisitions" : [ {
      "id" : 0
    } ],
    "version" : "0.4"
  }
}
3 changes: 3 additions & 0 deletions latest/examples/ome/.config.json
@@ -0,0 +1,3 @@
{
  "schema": "schemas/ome.schema"
}
3 changes: 3 additions & 0 deletions latest/examples/ome/series-2.json
@@ -0,0 +1,3 @@
{
  "series" : [ "0", "1" ]
}
101 changes: 97 additions & 4 deletions latest/index.bs
@@ -101,6 +101,14 @@ The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL
“RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in
[RFC 2119](https://tools.ietf.org/html/rfc2119).

<p>
<dfn>Transitional</dfn> metadata is added to the specification with the
intention of removing it in the future. Implementations may be expected (MUST) or
encouraged (SHOULD) to support the reading of the data, but writing will usually
be optional (MAY). Examples of transitional metadata include custom additions by
implementations that are later submitted as a formal specification. (See [[#bf2raw]])
</p>

Some of the JSON examples in this document include comments. These are for
clarity only; comments MUST NOT be included in JSON objects.

@@ -242,9 +250,89 @@ keys as specified below for discovering certain types of data, especially images

If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data.

"bioformats2raw.layout" (transitional) {#bf2raw}
------------------------------------------------

[=Transitional=] "bioformats2raw.layout" metadata identifies a group which implicitly describes a series of images.
The need for such a collection stems from the common "multi-image file" scenario in microscopy. Parsers like Bio-Formats
define a strict, stable ordering of the images in a single container, which other tools can use to refer to them.

To capture that ordering within an OME-NGFF dataset, `bioformats2raw` internally introduced a wrapping layer.
The bioformats2raw layout has been added to v0.4 as a transitional specification to describe filesets that already exist
in the wild. An upcoming NGFF specification will replace this layout with explicit metadata.

<h4 id="bf2raw-layout" class="no-toc">Layout</h4>

Typical Zarr layout produced by running `bioformats2raw` on a fileset that contains more than one image (series > 1):

<pre>
series.ome.zarr               # One converted fileset from bioformats2raw
├── .zgroup
├── .zattrs                   # Contains "bioformats2raw.layout" metadata
├── OME                       # Special group for containing OME metadata
│   ├── .zgroup
│   ├── .zattrs               # Contains "series" metadata
│   └── METADATA.ome.xml      # OME-XML file stored within the Zarr fileset
├── 0                         # First image in the collection
├── 1                         # Second image in the collection
└── ...
</pre>

<h4 id="bf2raw-attributes" class="no-toc">Attributes</h4>

The top-level `.zattrs` file MUST contain the `bioformats2raw.layout` key:
<pre class=include-code>
path: examples/bf2raw/image.json
highlight: json
</pre>

If the top-level group represents a plate, the `bioformats2raw.layout` metadata will be present, but
the "plate" key MUST also be present and takes precedence; parsing of such datasets should follow [[#plate-md]]. It is not
possible to mix collections of images with plates at present.

<pre class=include-code>
path: examples/bf2raw/plate.json
highlight: json
</pre>

The `.zattrs` file within the OME group MAY contain the "series" key:

<pre class=include-code>
path: examples/ome/series-2.json
highlight: json
</pre>

<h4 id="bf2raw-details" class="no-toc">Details</h4>

Conforming groups:

- MUST have the value "3" for the "bioformats2raw.layout" key in their `.zattrs` metadata at the top of the hierarchy;
- SHOULD have OME metadata representing the entire collection of images in a file named "OME/METADATA.ome.xml" which:
  - MUST adhere to the OME-XML specification but
  - MUST use `<MetadataOnly/>` elements as opposed to `<BinData/>`, `<BinaryOnly/>` or `<TiffData/>`;
  - MAY make use of the [minimum specification](https://docs.openmicroscopy.org/ome-model/6.2.2/specifications/minimum.html).

Additionally, the Zarr group for each image is located as follows:

- If "plate" metadata is present, images MUST be located at the defined location.
  - Matching "series" metadata (as described next) SHOULD be provided for tools that are unaware of the "plate" specification.
- If the "OME" Zarr group exists, it:
  - MAY contain a "series" attribute. If so:
    - "series" MUST be a list of string objects, each of which is a path to an image group.
    - The order of the paths MUST match the order of the "Image" elements in "OME/METADATA.ome.xml" if provided.
- If the "series" attribute does not exist and no "plate" is present:
  - separate "multiscales" images MUST be stored in consecutively numbered groups starting from 0 (i.e. "0/", "1/", "2/", "3/", ...).
- Every "multiscales" group MUST represent exactly one OME-XML "Image" in the same order as either the series index or the group numbers.

Conforming readers:
- SHOULD make users aware of the presence of more than one image (i.e. SHOULD NOT default to only opening the first image);
- MAY use the "series" attribute in the "OME" group to determine a list of valid groups to display;
- MAY choose to show all images within the collection or offer the user a choice of images, as with <dfn export="true"><abbr title="High-content screening">HCS</abbr></dfn> plates;
- MAY ignore other groups or arrays under the root of the hierarchy.


"coordinateTransformations" metadata {#trafo-md}
-------------------------------------
------------------------------------------------

"coordinateTransformations" describe a series of transformations that map between two coordinate spaces (defined by "axes").
For example, to map a discrete data space of an array to the corresponding physical space.
@@ -317,10 +405,10 @@ if not datasets:
datasets = [x["path"] for x in multiscales[0]["datasets"]]
```

"omero" metadata {#omero-md}
----------------------------
"omero" metadata (transitional) {#omero-md}
-------------------------------------------

Information specific to the channels of an image and how to render it
[=Transitional=] information specific to the channels of an image and how to render it
can be found under the "omero" key in the group-level metadata:

```json
@@ -609,6 +697,11 @@ Version History {#history}
<td>Description</td>
</tr>
</thead>
<tr>
<td>0.4.1</td>
<td>2022-09-26</td>
<td>transitional metadata for image collections ("bioformats2raw.layout")</td>
</tr>
<tr>
<td>0.4.0</td>
<td>2022-02-08</td>
14 changes: 14 additions & 0 deletions latest/schemas/bf2raw.schema
@@ -0,0 +1,14 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://ngff.openmicroscopy.org/latest/schemas/bf2raw.schema",
  "title": "NGFF container produced by bioformats2raw",
  "description": "JSON from OME-NGFF .zattrs",
  "type": "object",
  "properties": {
    "bioformats2raw.layout": {
      "description": "The top-level identifier metadata added by bioformats2raw",
      "type": "number",
      "enum": [3]
    }
  }
}