Yaozarrs Guide to OME-Zarr#

What you'll learn

This guide demystifies the OME-Zarr specification and shows you how to work with it using yaozarrs. It is designed to answer questions and clear up points of confusion that come up often in the community.

Quicklinks#

I have images

Any data with 5 or fewer dimensions, typically [T][C][Z]YX.

Go to Images

I have plate data

Multi-well plates and high-content screening (HCS) experiments

Go to Plates

I have image annotations

Segmentation masks, annotation labels, and regions of interest (ROIs)

Go to Labels

I have multiple images

Collections of related images (multi-FOV, stage positions, split files)

Go to Collections

What is OME-Zarr?#

The official OME-Zarr specification can be found at https://ngff.openmicroscopy.org/. In case of any discrepancies between this guide and the official spec, the official spec takes precedence!

OME-Zarr is a file format specification used by the bioimaging community for storing multi-dimensional data. It is a "meta-specification", based on the pre-existing Zarr format, which is designed for the storage of chunked, compressed, N-dimensional arrays. OME-Zarr extends Zarr by adding metadata conventions specific to bioimaging, making it easier to store and share complex imaging datasets.

But what is it?

To resolve a somewhat common confusion...

OME-Zarr is "just" Zarr (a file format used in many domains). The "OME" part is a specification layered on top of the Zarr format that additionally defines:

How domain specific metadata should be stored.
The details are version-specific, but this generally defines the exact form of the data inside of the .zattrs or zarr.json files that accompany the zarr groups.
How datasets are organized.
Beyond metadata, the OME-Zarr specification also defines how datasets should be organized. For example: it defines how the images collected across a multi-well plate experiment should be organized in a single Zarr directory, or how the different resolutions of a multi-scale (pyramidal) image should be stored.

Working with Images#

An Image is the fundamental building block of OME-Zarr.

As of v0.5, a single image must have between 2 and 5 dimensions, and may store multiple resolution levels.

Spatial dimensions: X, Y, optionally Z
Time: T (temporal axis)
Channels: C (fluorescence channels, RGB, etc.)

What if I have more than 5 dimensions?

While it is common to have datasets with more than 5 dimensions (e.g., different stage positions in a shared coordinate space, angles in light sheet microscopy, etc.), there is currently no formal specification for more than 5 dimensions in OME-Zarr. You may use the transitional bioformats2raw.layout to store multiple images in a single Zarr group. See Working with Collections.

See also: an RFC ("request for comments") proposing a relaxation of this restriction: RFC-3

What if I have both RGB and optical channels?

As of v0.5, there is no formal specification for mixing the concepts of RGB image components and conventional "channels" (like optical configurations). You will need to either create a custom group layout or flatten them all into a single channel dimension.

Directory Structure#

OME-Zarr v0.5 (Zarr v3) is shown first, followed by OME-Zarr v0.4 (Zarr v2):

image.zarr/
├── zarr.json            # {"zarr_format": 3} group, with attributes.ome.multiscales
├── 0/                   # Full resolution array  
│   ├── zarr.json        # Array metadata (standard zarr schema)
│   └── c/0/1/2/3        # Chunk files
├── 1/                   # downsampled level 1
│   └── ...
└── 2/                   # downsampled level 2 
    └── ...

image.zarr/
├── .zgroup              # {"zarr_format": 2} group
├── .zattrs              # Contains "multiscales"
├── 0/                   # Full resolution array
│   ├── .zarray          # Array metadata (standard zarr schema)
│   └── t/c/z/y/x        # Chunk files with "/" separator
├── 1/                   # downsampled level 1
│   └── ...
└── 2/                   # downsampled level 2 
    └── ...

Key difference

Most of the structural changes between v0.4 and v0.5 relate to the transition from Zarr v2 to Zarr v3.

<=v0.4: "multiscales" metadata directly in root of .zattrs files
>=v0.5: "multiscales" metadata in zarr.json under attributes.ome namespace
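
To make the difference concrete, here is a minimal sketch of reading the OME metadata from a group directory on a local filesystem, whichever layout it uses. The helper name `read_ome_metadata` is hypothetical (it is not part of yaozarrs), and it uses only the standard library, so it does not cover remote stores.

```python
import json
from pathlib import Path


def read_ome_metadata(group_dir):
    """Return the OME metadata dict of a Zarr group directory (hypothetical helper)."""
    group_dir = Path(group_dir)

    v3_file = group_dir / "zarr.json"
    if v3_file.exists():
        # >= v0.5: metadata lives in zarr.json under attributes.ome
        document = json.loads(v3_file.read_text())
        return document.get("attributes", {}).get("ome", {})

    v2_file = group_dir / ".zattrs"
    if v2_file.exists():
        # <= v0.4: keys such as "multiscales" sit at the root of .zattrs
        return json.loads(v2_file.read_text())

    raise FileNotFoundError(f"no zarr.json or .zattrs found in {group_dir}")
```

With either layout, the returned dict is expected to contain the same top-level keys (for example "multiscales"), so downstream code can stay the same for both.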

Axes#

Axes define the dimensions of your image data. As of v0.4, each axis is an object with a required name and an optional type and/or unit.

Axis Constraints

Constraints for image axes in OME-Zarr are the same in v0.4 and v0.5:

MUST have 2-5 dimensions total
MUST have 2-3 spatial axes
MAY have 0-1 time axis
MAY have 0-1 channel axis
Ordering enforced: time → channel/custom → space

In practice, this limits valid axis combinations to: [T][C][Z] Y X
(though no explicit restriction is placed on naming conventions)
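
The constraints above can be sketched as a small check on plain axis dicts. This is an illustration only, not the validation yaozarrs performs: `validate_axes` is a hypothetical name, and it assumes that any axis whose type is neither "time" nor "space" occupies the single channel/custom slot.

```python
def validate_axes(axes):
    """Raise ValueError if a list of axis dicts breaks the constraints above.

    Each axis is a dict such as {"name": "y", "type": "space"}.
    """
    if not 2 <= len(axes) <= 5:
        raise ValueError("axes must have 2-5 entries")

    types = [axis.get("type") for axis in axes]
    n_space = types.count("space")
    n_time = types.count("time")
    # Assumption: everything else (channel or custom) shares one slot.
    n_other = len(axes) - n_space - n_time

    if not 2 <= n_space <= 3:
        raise ValueError("axes must include 2 or 3 space axes")
    if n_time > 1:
        raise ValueError("axes may include at most one time axis")
    if n_other > 1:
        raise ValueError("axes may include at most one channel/custom axis")

    # Ordering: time first, then channel/custom, then space.
    rank = {"time": 0, "space": 2}
    ranks = [rank.get(axis_type, 1) for axis_type in types]
    if ranks != sorted(ranks):
        raise ValueError("axes must be ordered time, channel/custom, space")


# A CZYX image passes silently.
validate_axes([
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space"},
    {"name": "y", "type": "space"},
    {"name": "x", "type": "space"},
])
```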

v0.4 is shown first, followed by v0.5:

Spec JSON:

{
  // found in a "multiscales" object
  "axes": [
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space", "unit": "micrometer"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"}
  ]
}

yaozarrs Code:

from yaozarrs import v04

axes = [
    v04.ChannelAxis(name="c"),
    v04.SpaceAxis(name="z", unit="micrometer"),
    v04.SpaceAxis(name="y", unit="micrometer"),
    v04.SpaceAxis(name="x", unit="micrometer"),
]

Breaking change from v0.3

In v0.3, axes were simple strings: ["c", "z", "y", "x"]. In v0.4+, they must be objects with explicit types.
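
As an illustration of that change, the sketch below upgrades a v0.3-style list of strings to v0.4-style axis objects. The helper `upgrade_axes` and its `space_unit` parameter are hypothetical, and the mapping assumes the conventional axis names t, c, z, y, and x; any other name raises a `KeyError`.

```python
# Assumed mapping from conventional axis names to v0.4 axis types.
AXIS_TYPES = {"t": "time", "c": "channel", "z": "space", "y": "space", "x": "space"}


def upgrade_axes(names, space_unit=None):
    """Convert v0.3 axis strings to v0.4 axis objects (hypothetical helper)."""
    axes = []
    for name in names:
        axis = {"name": name, "type": AXIS_TYPES[name]}
        # Units are optional; only attach one to spatial axes when given.
        if space_unit is not None and axis["type"] == "space":
            axis["unit"] = space_unit
        axes.append(axis)
    return axes


axes = upgrade_axes(["c", "z", "y", "x"], space_unit="micrometer")
```

The result for `["c", "z", "y", "x"]` with `space_unit="micrometer"` has the same shape as the Spec JSON shown above.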

Spec JSON (v0.5, same axis objects as v0.4):

{
  // found in a "multiscales" object
  "axes": [
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space", "unit": "micrometer"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"}
  ]
}

yaozarrs Code:

from yaozarrs import v05

axes = [
    v05.ChannelAxis(name="c"),
    v05.SpaceAxis(name="z", unit="micrometer"),
    v05.SpaceAxis(name="y", unit="micrometer"),
    v05.SpaceAxis(name="x", unit="micrometer"),
]

Coordinate Transformations#

Starting in v0.4, every dataset MUST include coordinate transformations that map data coordinates to physical coordinates. Coordinate transforms are where you would specify physical units (micrometers, seconds), multi-resolution scales, as well as stage positions and spatial offsets for registration.

v0.4 is shown first, followed by v0.5:

Scale Transformation (REQUIRED):

Maps array indices to physical coordinates. Scale values represent the physical size per pixel for each dimension.

Spec JSON:

{
  "datasets": [{
    "path": "0",
    "coordinateTransformations": [
      {"type": "scale", "scale": [1.0, 0.5, 0.1, 0.1]}
    ]
  }]
}

yaozarrs Code:

from yaozarrs import v04

dataset = v04.Dataset(
    path="0",
    coordinateTransformations=[
        v04.ScaleTransformation(scale=[1.0, 0.5, 0.1, 0.1])
    ]
)
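
Because each resolution level is its own dataset, each level carries its own scale. The sketch below builds plain-dict dataset entries for a pyramid under the assumption that only the last two axes (Y and X) are downsampled, by a factor of 2 per level. `pyramid_datasets` and its parameters are hypothetical, and real pyramids may use other factors or also downsample Z.

```python
def pyramid_datasets(base_scale, n_levels, spatial_axes=(-2, -1), factor=2):
    """Build one dataset entry per pyramid level (hypothetical helper).

    base_scale: physical size per pixel at full resolution, one value per axis.
    spatial_axes: indices of the axes that are downsampled (assumed: Y and X).
    """
    datasets = []
    for level in range(n_levels):
        scale = list(base_scale)
        for axis in spatial_axes:
            # Each downsampling step makes a pixel `factor` times larger.
            scale[axis] = base_scale[axis] * factor ** level
        datasets.append({
            "path": str(level),
            "coordinateTransformations": [{"type": "scale", "scale": scale}],
        })
    return datasets


levels = pyramid_datasets([1.0, 0.5, 0.1, 0.1], n_levels=3)
```

Level "0" keeps the base scale, while the Y and X entries of later levels grow by the downsampling factor at each step.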

Translation Transformation (OPTIONAL):

Adds a spatial offset. Must come after scale.

Spec JSON:

{
  "coordinateTransformations": [
    {"type": "scale", "scale": [1.0, 0.5, 0.1, 0.1]},
    {"type": "translation", "translation": [0.0, 0.0, 100.0, 200.0]}
  ]
}

yaozarrs Code:

dataset = v04.Dataset(
    path="0",
    coordinateTransformations=[
        v04.ScaleTransformation(scale=[1.0, 0.5, 0.1, 0.1]),
        v04.TranslationTransformation(translation=[0.0, 0.0, 100.0, 200.0])
    ]
)

Transformation Rules

MUST have exactly one scale transformation per dataset
MAY have at most one translation transformation
If translation exists, it MUST come after scale
Transformation length MUST match number of axes
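
To show what these rules mean in practice, here is a minimal sketch that checks them and then maps an array index to physical coordinates as `index * scale + translation`. It works on plain dicts shaped like the Spec JSON above. The function name `to_physical` is hypothetical and is not a yaozarrs API.

```python
def to_physical(index, transforms):
    """Map an array index to physical coordinates (hypothetical helper)."""
    scales = [t for t in transforms if t["type"] == "scale"]
    translations = [t for t in transforms if t["type"] == "translation"]

    if len(scales) != 1:
        raise ValueError("exactly one scale transformation is required")
    if len(translations) > 1:
        raise ValueError("at most one translation transformation is allowed")
    if translations and transforms.index(translations[0]) < transforms.index(scales[0]):
        raise ValueError("translation must come after scale")

    scale = scales[0]["scale"]
    offset = translations[0]["translation"] if translations else [0.0] * len(index)
    if len(scale) != len(index) or len(offset) != len(index):
        raise ValueError("transformation length must match the number of axes")

    # Scale first, then translate.
    return [i * s + o for i, s, o in zip(index, scale, offset)]


position = to_physical(
    [0, 2, 10, 10],
    [
        {"type": "scale", "scale": [1.0, 0.5, 0.1, 0.1]},
        {"type": "translation", "translation": [0.0, 0.0, 100.0, 200.0]},
    ],
)
```

Here the pixel at Y index 10 lands about 1 micrometer past the 100 micrometer offset, and the pixel at X index 10 about 1 micrometer past the 200 micrometer offset.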

v0.5: identical to v0.4, just stored under the attributes.ome namespace.

yaozarrs Code (same as v0.4):

from yaozarrs import v05

dataset = v05.Dataset(
    path="0",
    coordinateTransformations=[
        v05.ScaleTransformation(scale=[1.0, 0.5, 0.1, 0.1]),
        v05.TranslationTransformation(translation=[0.0, 0.0, 100.0, 200.0])
    ]
)

v0.5 Additional Requirement

In v0.5, each array's zarr.json MUST include dimension_names matching the axes:

{
  "dimension_names": ["c", "z", "y", "x"]
}
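
A reader or writer could verify this requirement with a check along the following lines. This is a sketch on plain dicts: `dimension_names_match` is a hypothetical helper, `array_meta` stands for the parsed contents of an array's zarr.json, and `axes` for the axes list of the parent multiscales metadata.

```python
def dimension_names_match(array_meta, axes):
    """Return True if an array's dimension_names equal the axis names, in order."""
    axis_names = [axis["name"] for axis in axes]
    return array_meta.get("dimension_names") == axis_names


axes = [
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space"},
    {"name": "y", "type": "space"},
    {"name": "x", "type": "space"},
]
ok = dimension_names_match({"dimension_names": ["c", "z", "y", "x"]}, axes)
```

A missing `dimension_names` key, or names in a different order, makes the check return False.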

Labels (Segmentation Masks)#

Labels are specialized images with integer dtype representing segmentation masks (nuclei, cells, regions of interest, etc.).

They are represented as a special group named "labels/" within an image group.

Careful

This is one of the few places in the specification where the name of the group itself is normative: it MUST be named labels/.

Conforming readers must search for the labels/ group within an image group to discover it: its presence is not indicated in the parent image metadata.
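
The discovery step described above might look like the following on a local filesystem. This is a minimal sketch using only the standard library: `find_labels` is a hypothetical helper, and it reads the "labels" list from either zarr.json (under attributes.ome) or .zattrs.

```python
import json
from pathlib import Path


def find_labels(image_dir):
    """Return the label names listed in an image's labels/ group, if any."""
    labels_dir = Path(image_dir) / "labels"
    if not labels_dir.is_dir():
        # No labels/ group: the image simply has no labels.
        return []

    v3_file = labels_dir / "zarr.json"
    v2_file = labels_dir / ".zattrs"
    if v3_file.exists():
        meta = json.loads(v3_file.read_text()).get("attributes", {}).get("ome", {})
    elif v2_file.exists():
        meta = json.loads(v2_file.read_text())
    else:
        meta = {}
    return list(meta.get("labels", []))
```

Each returned name is the path of a label image group relative to labels/, for example labels/nuclei.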

Label Structure and Code

OME-Zarr v0.5 (Zarr v3) is shown first, followed by OME-Zarr v0.4 (Zarr v2):

Directory Structure:

image.zarr/
├── zarr.json            # Image metadata ("attributes.ome.multiscales")
├── 0/                   # Full resolution image
├── 1/                   # Downsampled level 1
├── ...                  # Additional downsampled levels
└── labels/
    ├── zarr.json        # Labels group metadata ("attributes.ome.labels")
    ├── nuclei/          # Label image (integer dtype)
    │   ├── zarr.json    # Label metadata ("attributes.ome.multiscales", "attributes.ome.image-label")
    │   ├── 0/           # Full resolution labels
    │   └── 1/           # Downsampled labels
    └── cells/
        └── ...

Labels Group Metadata (labels/zarr.json):

{
  "zarr_format": 3,
  "node_type": "group",
  "attributes": {
    "ome": {
      "labels": ["nuclei", "cells"]
    }
  }
}

Label Image Metadata (labels/nuclei/zarr.json):

{
  "zarr_format": 3,
  "node_type": "group",
  "attributes": {
    "ome": {
      "multiscales": [...],
      "image-label": {
        "version": "0.5",
        "colors": [
          {"label-value": 1, "rgba": [255, 0, 0, 255]},
          {"label-value": 2, "rgba": [0, 255, 0, 255]}
        ],
        "source": {
          "image": "../../"
        }
      }
    }
  }
}

yaozarrs Code:

from yaozarrs import v05

# Label metadata stored at labels/nuclei/zarr.json
label_image = v05.LabelImage(
    multiscales=[...],  # Same structure as regular image
    image_label=v05.ImageLabel(
        colors=[
            v05.LabelColor(label_value=1, rgba=[255, 0, 0, 255]),
            v05.LabelColor(label_value=2, rgba=[0, 255, 0, 255])
        ],
        source=v05.LabelSource(image="../../")
    )
)

Directory Structure:

image.zarr/
├── .zgroup
├── .zattrs              # Image metadata ("multiscales")
├── 0/                   # Full resolution image
├── 1/                   # Downsampled level 1
├── ...                  # Additional downsampled levels
└── labels/
    ├── .zgroup
    ├── .zattrs          # Labels group metadata ("labels")
    ├── nuclei/          # Label image (integer dtype)
    │   ├── .zgroup
    │   ├── .zattrs      # Label metadata ("multiscales", "image-label")
    │   ├── 0/           # Full resolution labels
    │   └── 1/           # Downsampled labels
    └── cells/
        └── ...

Labels Group Metadata (labels/.zattrs):

{
  "labels": ["nuclei", "cells"]
}

Label Image Metadata (labels/nuclei/.zattrs):

{
  "multiscales": [...],
  "image-label": {
    "version": "0.4",
    "colors": [
      {"label-value": 1, "rgba": [255, 0, 0, 255]},
      {"label-value": 2, "rgba": [0, 255, 0, 255]}
    ],
    "source": {
      "image": "../../"
    }
  }
}

yaozarrs Code:

from yaozarrs import v04

# Label metadata stored at labels/nuclei/.zattrs
label_image = v04.LabelImage(
    multiscales=[...],  # Same structure as regular image
    image_label=v04.ImageLabel(
        colors=[
            v04.LabelColor(label_value=1, rgba=[255, 0, 0, 255]),
            v04.LabelColor(label_value=2, rgba=[0, 255, 0, 255])
        ],
        source=v04.LabelSource(image="../../")
    )
)

Labels must use integer dtype

Validation will fail if label arrays use float dtypes. Use uint8, uint16, uint32, or int32.
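
As a rough illustration of such a check (not the validation yaozarrs itself runs), the sketch below inspects parsed array metadata. It assumes Zarr v3 metadata names the type in `data_type` (for example "uint16") and Zarr v2 metadata uses a NumPy-style `dtype` string (for example "<u2"). The helper `is_integer_dtype` is hypothetical, and it accepts any integer width, which is broader than the four types recommended above.

```python
# Integer data type names as they appear in Zarr v3 array metadata.
V3_INTEGER_TYPES = {
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
}


def is_integer_dtype(array_meta):
    """Return True if parsed array metadata describes an integer array."""
    if "data_type" in array_meta:
        # Zarr v3: zarr.json stores a type name.
        return array_meta["data_type"] in V3_INTEGER_TYPES
    # Zarr v2: .zarray stores a string like "<u2" (byte order, kind, size).
    dtype = array_meta.get("dtype")
    return isinstance(dtype, str) and len(dtype) >= 2 and dtype[1] in "iu"
```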

Working with Plates#

A Plate represents multi-well plate data from high-content screening (HCS) experiments. The hierarchy is:

Plate → Rows/Columns → Wells → Fields of View (Images)

Each well can contain multiple fields of view (FOVs) across multiple acquisitions (timepoints).

Directory Structure#

OME-Zarr v0.5 (Zarr v3) is shown first, followed by OME-Zarr v0.4 (Zarr v2):

plate.zarr/
├── zarr.json              # contains Plate metadata ("attributes.ome.plate")
├── A/                     # Row A
│   ├── 1/                 # Well A1
│   │   ├── zarr.json      # contains Well metadata ("attributes.ome.well")
│   │   ├── 0/             # Field 0 (Standard multiscales image)
│   │   │   ├── zarr.json  # contains Image metadata ("attributes.ome.multiscales")
│   │   │   ├── 0/         # Full resolution
│   │   │   ├── 1/         # Downsampled
│   │   │   └── labels/    # Optional labels group (see above)
│   │   └── 1/             # Field 1
│   └── 2/                 # Well A2
└── B/                     # Row B
    └── 1/

plate.zarr/
├── .zgroup
├── .zattrs              # contains Plate metadata ("plate")
├── A/                   # Row A
│   ├── 1/               # Well A1
│   │   ├── .zgroup
│   │   ├── .zattrs      # contains Well metadata ("well")
│   │   ├── 0/           # Field 0 (Standard multiscales image)
│   │   │   ├── .zgroup
│   │   │   ├── .zattrs  # contains Image metadata ("multiscales")
│   │   │   ├── 0/       # Full resolution
│   │   │   ├── 1/       # Downsampled
│   │   │   └── labels/  # Optional labels group (see above)
│   │   └── 1/           # Field 1
│   └── 2/               # Well A2
└── B/                   # Row B
    └── 1/

Three-level hierarchy

Three groups MUST exist above images: plate → row → well

Plate Metadata#

v0.4 is shown first, followed by v0.5:

Spec JSON (.zattrs at plate root):

{
  "plate": {
    "version": "0.4",
    "name": "HCS Experiment",
    "columns": [
      {"name": "1"},
      {"name": "2"},
      {"name": "3"}
    ],
    "rows": [
      {"name": "A"},
      {"name": "B"}
    ],
    "wells": [
      {"path": "A/1", "rowIndex": 0, "columnIndex": 0},
      {"path": "A/2", "rowIndex": 0, "columnIndex": 1},
      {"path": "B/1", "rowIndex": 1, "columnIndex": 0}
    ],
    "acquisitions": [
      {"id": 0, "name": "Initial", "maximumfieldcount": 4},
      {"id": 1, "name": "24h", "maximumfieldcount": 4}
    ]
  }
}

yaozarrs Code:

from yaozarrs import v04

plate_def = v04.PlateDef(
    name="HCS Experiment",
    columns=[
        v04.Column(name="1"),
        v04.Column(name="2"),
        v04.Column(name="3")
    ],
    rows=[
        v04.Row(name="A"),
        v04.Row(name="B")
    ],
    wells=[
        v04.PlateWell(path="A/1", rowIndex=0, columnIndex=0),
        v04.PlateWell(path="A/2", rowIndex=0, columnIndex=1),
        v04.PlateWell(path="B/1", rowIndex=1, columnIndex=0),
    ],
    acquisitions=[
        v04.Acquisition(id=0, name="Initial", maximumfieldcount=4),
        v04.Acquisition(id=1, name="24h", maximumfieldcount=4),
    ]
)

plate = v04.Plate(plate=plate_def)

Breaking change from v0.3

In v0.4, rowIndex and columnIndex became required for all wells. This enables efficient sparse plate handling without path parsing.
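
For a sparse plate, the indices can be derived from the row and column lists rather than written by hand. The following is a minimal sketch on plain dicts. `make_wells` is a hypothetical helper, and it assumes well paths follow the "row/column" pattern used in this guide.

```python
def make_wells(rows, columns, present):
    """Build well entries with rowIndex/columnIndex for the wells that exist.

    rows, columns: ordered lists of row and column names.
    present: iterable of (row_name, column_name) pairs that were imaged.
    """
    row_index = {name: i for i, name in enumerate(rows)}
    column_index = {name: i for i, name in enumerate(columns)}
    return [
        {
            "path": f"{row}/{column}",
            "rowIndex": row_index[row],
            "columnIndex": column_index[column],
        }
        for row, column in present
    ]


wells = make_wells(
    rows=["A", "B"],
    columns=["1", "2", "3"],
    present=[("A", "1"), ("A", "2"), ("B", "1")],
)
```

The three entries produced here match the "wells" list in the Spec JSON above.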

Same structure as v0.4, stored under attributes.ome in zarr.json:

Spec JSON (zarr.json at plate root):

{
  "attributes": {
    "ome": {
      "plate": {
        "columns": [
          { "name": "1" },
          { "name": "2" },
          { "name": "3" }
        ],
        "rows": [
          { "name": "A" },
          { "name": "B" }
        ],
        "wells": [
          { "path": "A/1", "rowIndex": 0, "columnIndex": 0 },
          { "path": "A/2", "rowIndex": 0, "columnIndex": 1 },
          { "path": "B/1", "rowIndex": 1, "columnIndex": 0 }
        ],
        "acquisitions": [
          { "id": 0, "maximumfieldcount": 4, "name": "Initial" },
          { "id": 1, "maximumfieldcount": 4, "name": "24h" }
        ],
        "field_count": 4,
        "name": "HCS Experiment"
      }
    }
  }
}

yaozarrs Code:

from yaozarrs import v05

plate_def = v05.PlateDef(
    name="HCS Experiment",
    columns=[  # must have at least 1 column
        v05.Column(name="1"),
        v05.Column(name="2"),
        v05.Column(name="3")
    ],
    rows=[  # must have at least 1 row
        v05.Row(name="A"),
        v05.Row(name="B")
    ],
    wells=[  # must have at least 1 well, paths match tree structure
        v05.PlateWell(path="A/1", rowIndex=0, columnIndex=0),
        v05.PlateWell(path="A/2", rowIndex=0, columnIndex=1),
        v05.PlateWell(path="B/1", rowIndex=1, columnIndex=0),
    ],
    acquisitions=[  # optional 
        v05.Acquisition(id=0, name="Initial", maximumfieldcount=4),
        v05.Acquisition(id=1, name="24h", maximumfieldcount=4),
    ],
    field_count=4  # max FOV per well
)

plate = v05.Plate(plate=plate_def)

# Create full zarr.json
zarr_json = v05.OMEZarrGroupJSON(attributes={"ome": plate})
json_str = zarr_json.model_dump_json(indent=2, exclude_unset=True)

Well Metadata#

Wells list the fields of view (images) they contain:

Spec JSON (.zattrs in well directory):

{
  "well": {
    "version": "0.4",
    "images": [
      {"path": "0", "acquisition": 0},
      {"path": "1", "acquisition": 0},
      {"path": "2", "acquisition": 1}
    ]
  }
}

yaozarrs Code:

from yaozarrs import v04

well_def = v04.WellDef(
    images=[
        v04.FieldOfView(path="0", acquisition=0),
        v04.FieldOfView(path="1", acquisition=0),
        v04.FieldOfView(path="2", acquisition=1),
    ]
)

well = v04.Well(well=well_def)

Field	Requirement	Description
`images`	MUST	List of field of view objects
`images[].path`	MUST	Path to image group
`images[].acquisition`	MUST (if multiple acquisitions exist)	Links to plate acquisition ID
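
The link between a well's images and the plate's acquisitions can be checked with a few lines of plain Python. This is a sketch only: `check_well_acquisitions` is a hypothetical helper, `plate` is the inner "plate" dict, and `well` is the inner "well" dict from the JSON examples above.

```python
def check_well_acquisitions(plate, well):
    """Return a list of problems with a well's acquisition references."""
    known_ids = {acq["id"] for acq in plate.get("acquisitions", [])}
    problems = []
    for image in well["images"]:
        path = image["path"]
        if "acquisition" not in image:
            # Only required when the plate defines multiple acquisitions.
            if len(known_ids) > 1:
                problems.append(f"{path}: missing 'acquisition'")
        elif image["acquisition"] not in known_ids:
            problems.append(
                f"{path}: acquisition {image['acquisition']} is not defined on the plate"
            )
    return problems


plate = {"acquisitions": [{"id": 0, "name": "Initial"}, {"id": 1, "name": "24h"}]}
well = {"images": [
    {"path": "0", "acquisition": 0},
    {"path": "1", "acquisition": 0},
    {"path": "2", "acquisition": 1},
]}
problems = check_well_acquisitions(plate, well)
```

An empty list means every field of view points at an acquisition that the plate declares.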

Working with Collections#

OME-Zarr does not currently have an official specification for collections of images.

By Collections of images, we mean groups of related images, usually sharing a coordinate space, that do not fit into the plate model. Examples include:

Multiple stage positions on single coverslip
Multiple angles in light sheet microscopy
Tomographic tilt series
Jagged or otherwise irregular sets of related images that don't fit the multiscales model

Status

There is a long-standing GitHub issue that discusses potential future standards for collections, and a (currently pending) pull request for RFC-8, which covers this topic. But as of v0.5 and January 2026, there is no official spec.

The "bioformats2raw" layout is a transitional solution, used internally by the bioformats2raw tool when dumping multiple series (commonly found in image formats supported by Bio-Formats) into a single Zarr hierarchy.

The bioformats2raw layout, as described in the NGFF spec, is shown below:

Directory Structure#

v0.4 is shown first, followed by v0.5:

series.ome.zarr               # One converted fileset from bioformats2raw
    ├── .zgroup
    ├── .zattrs               # Contains "bioformats2raw.layout" metadata
    ├── OME                   # Special group for containing OME metadata
    │   ├── .zgroup
    │   ├── .zattrs           # Contains "series" metadata
    │   └── METADATA.ome.xml  # OME-XML file stored within the Zarr fileset
    ├── 0                     # First image in the collection
    ├── 1                     # Second image in the collection
    └── ...

series.ome.zarr               # One converted fileset from bioformats2raw
    ├── zarr.json             # Contains "bioformats2raw.layout" metadata
    ├── OME                   # Special group for containing OME metadata
    │   ├── zarr.json         # Contains "series" metadata
    │   └── METADATA.ome.xml  # OME-XML file stored within the Zarr fileset
    ├── 0                     # First image in the collection
    ├── 1                     # Second image in the collection
    └── ...

Metadata#

v0.4 is shown first, followed by v0.5:

Spec JSON (.zattrs at root):

{
  "bioformats2raw.layout": 3
}

yaozarrs Code:

from yaozarrs import v04

# Root .zattrs
bf2raw = v04.Bf2Raw()  # layout defaults to 3

# OME/.zattrs
series = v04.Series(series=["0", "1", "2", "3"])

yaozarrs Code:

from yaozarrs import v05

# Root zarr.json
root_zarr_json = v05.OMEZarrGroupJSON(
  attributes={"ome": v05.Bf2Raw()}
)

# OME/zarr.json
ome_zarr_json = v05.OMEZarrGroupJSON(
    attributes={"ome": v05.Series(series=["0", "1", "2", "3"])}
)

Image Location Rules

If plate metadata exists → use plate structure
If series attribute exists in OME/.zattrs → paths must match OME-XML Image element order
Otherwise → consecutively numbered groups: 0/, 1/, 2/...
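
The three rules above can be read as a simple decision procedure. The sketch below works on already-parsed metadata so that it stays independent of how the store is accessed. `locate_images` is a hypothetical helper, and its three arguments (the root group's OME metadata, the OME group's metadata, and the names of the root's child groups) are assumptions about how a caller would gather that information.

```python
def locate_images(root_meta, ome_meta, child_names):
    """Return (rule, paths) describing where images live in a bioformats2raw layout."""
    if "plate" in root_meta:
        # Rule 1: images are reached through the plate -> well hierarchy.
        return "plate", [well["path"] for well in root_meta["plate"]["wells"]]

    if ome_meta and "series" in ome_meta:
        # Rule 2: the series list gives the image paths, in OME-XML Image order.
        return "series", list(ome_meta["series"])

    # Rule 3: consecutively numbered groups 0, 1, 2, ... (stop at the first gap).
    paths = []
    while str(len(paths)) in child_names:
        paths.append(str(len(paths)))
    return "numbered", paths


rule, paths = locate_images(
    root_meta={"bioformats2raw.layout": 3},
    ome_meta={"series": ["0", "1", "2", "3"]},
    child_names=["OME", "0", "1", "2", "3"],
)
```

For the plate case the returned paths are well paths, so a caller would still need to read each well's metadata to reach the individual fields of view.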

When to Use Collections vs. Plates#

Scenario	Use Collection	Use Plate
Multiple FOVs on coverslip	Yes	No
Irregular stage positions	Yes	No
Time-lapse split across files	Yes	No
Multi-well HCS experiment	No	Yes
Regular grid with well labels	No	Yes

Rule of thumb

If your data has rows and columns (like A1, B2, etc.), use a Plate. If it's just multiple related images, use a Collection.

Reference#

Version Comparison Matrix#

Feature	v0.2	v0.3	v0.4	v0.5
Zarr version	v2	v2	v2	v3
Axes format	Implicit TCZYX	Strings	Objects	Objects
Axis type field	N/A	N/A	SHOULD	SHOULD
Axis unit field	N/A	N/A	SHOULD	SHOULD
Coordinate transforms	N/A	N/A	MUST	MUST
Metadata location	`.zattrs`	`.zattrs`	`.zattrs`	`zarr.json`
OME namespace	N/A	N/A	N/A	`attributes.ome`
`dimension_names`	N/A	N/A	N/A	MUST
Plate indices	Optional	Optional	MUST	MUST

Breaking Changes Quick Reference#

Migration	Key Breaking Change	Impact
v0.2 → v0.3	Axes must be explicit strings	Moderate - add `axes` field
v0.3 → v0.4	Axes become objects + coordinate transforms required	Major - restructure metadata
v0.4 → v0.5	Zarr v3 file structure + OME namespace	Critical - completely different storage
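
To illustrate only the namespace part of the v0.4 → v0.5 change, the sketch below wraps the contents of a v0.4 group's .zattrs into the shape of a v0.5 group zarr.json. `wrap_as_v05_group` is a hypothetical helper. It deliberately does not touch version fields, array metadata, or chunk storage, all of which a real migration would also need to handle.

```python
import json


def wrap_as_v05_group(zattrs):
    """Place v0.4-style group attributes under attributes.ome (metadata shape only)."""
    return {
        "zarr_format": 3,
        "node_type": "group",
        # Copy so the caller's dict is not shared with the result.
        "attributes": {"ome": dict(zattrs)},
    }


group_json = wrap_as_v05_group({"labels": ["nuclei", "cells"]})
text = json.dumps(group_json, indent=2)
```

Applied to the v0.4 labels group metadata shown earlier, this produces the same structure as the v0.5 labels/zarr.json example.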

Additional Resources#

OME-Zarr Specification - Official specification
yaozarrs API Documentation - Complete API reference
Zarr Format Specification - Zarr v2 and v3 specs
OME Data Model - Full OME-XML specification
GitHub Repository - Source code and issues

You're ready!

You now understand the OME-Zarr specification and how to work with it using yaozarrs. Happy imaging!