This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

Support BIOMASS dataset in Zarr Data Store #981

Open
TonioF opened this issue Apr 29, 2021 · 5 comments
Labels: ds, in_progress, pilot (Issues regarding Pilot Data Store)

@TonioF
Contributor

TonioF commented Apr 29, 2021

The Zarr Data Store contains data from the ODP that has been converted into the Zarr format. It currently holds one BIOMASS dataset. For the purposes of this issue, a dataset counts as supported when

  1. it can be opened in cate
  2. it can be opened in cate with a spatial subset
  3. its content can be written to disk
  4. its data can be displayed in cate

The BIOMASS dataset cannot be opened with a spatial subset. The traceback is:

[2021-04-29 08:49:29] Request:
open_dataset(datasetid=ESACCI-BIOMASS-L4-AGB-MERGED-100m-2010-2018-fv2.0.zarr, time_range=('2017-01-01', '2017-01-01'), var_names=['agb', 'agb_se'], region=[123.5265, 60.20374, 123.52827, 60.20552])

Traceback (most recent call last):
  File "test_cci_data_support.py", line 327, in test_open_ds
    dataset, _ = open_dataset(dataset_id=data_id,
  File "/home/users/tfincke/Projects/cate/cate/core/ds.py", line 432, in open_dataset
    dataset = select_subset(dataset, **subset_args)
  File "/home/users/tfincke/Projects/xcube/xcube/core/select.py", line 37, in select_subset
    dataset = select_spatial_subset(dataset, xy_bbox=bbox)
  File "/home/users/tfincke/Projects/xcube/xcube/core/select.py", line 85, in select_spatial_subset
    geo_coding = geo_coding if geo_coding is not None else GeoCoding.from_dataset(dataset, xy_names=xy_names)
  File "/home/users/tfincke/Projects/xcube/xcube/core/geocoding.py", line 132, in from_dataset
    return cls.from_xy((x, y), xy_names=(x_name, y_name))
  File "/home/users/tfincke/Projects/xcube/xcube/core/geocoding.py", line 169, in from_xy
    x, is_lon_normalized = _maybe_normalise_2d_lon(x)
  File "/home/users/tfincke/Projects/xcube/xcube/core/geocoding.py", line 462, in _maybe_normalise_2d_lon
    if _is_crossing_antimeridian(lon_var):
  File "/home/users/tfincke/Projects/xcube/xcube/core/geocoding.py", line 457, in _is_crossing_antimeridian
    return abs(lon_var.diff(dim=dim_x)).max() > 180.0 or
  File "/home/users/tfincke/miniconda3/envs/xcube/lib/python3.8/site-packages/xarray/core/dataarray.py", line 3107, in diff
    ds = self._to_temp_dataset().diff(n=n, dim=dim, label=label)
  File "/home/users/tfincke/miniconda3/envs/xcube/lib/python3.8/site-packages/xarray/core/dataset.py", line 5489, in diff
    variables[name] = var.isel(**kwargs_end) - var.isel(**kwargs_start)
  File "/home/users/tfincke/miniconda3/envs/xcube/lib/python3.8/site-packages/xarray/core/variable.py", line 2301, in func
    f(self_data, other_data)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 475. GiB for an array with shape (157500, 404999) and data type float64
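
The size in the error message can be checked by hand. The antimeridian test takes the difference of a full 2-D longitude array along x, so the result has one column fewer than the dataset. The sketch below only redoes that arithmetic from the shape and dtype in the message. The comparison with a 1-D longitude vector is an assumption for illustration (it presumes a regular grid) and is not taken from the xcube code.

```python
# Shape and dtype are copied from the ArrayMemoryError message above.
shape = (157500, 404999)
itemsize = 8  # bytes per float64 value

# Memory needed for the 2-D difference array that diff() tries to allocate.
n_bytes = shape[0] * shape[1] * itemsize
gib = n_bytes / 2**30

# Assumption (not from the issue): on a regular grid, longitude could be
# held as a 1-D vector with one more element than the differenced axis.
lon_1d_bytes = (shape[1] + 1) * itemsize
mib_1d = lon_1d_bytes / 2**20

print(f"2-D difference array: {gib:.2f} GiB")
print(f"1-D longitude vector: {mib_1d:.2f} MiB")
```

The first figure matches the "475. GiB" reported by NumPy, while the 1-D vector needs only a few MiB.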

TonioF added the ds and pilot labels on Apr 29, 2021
TonioF added this to the 3.0 milestone on Apr 29, 2021
TonioF changed the title from "Support BIOMASS dataset in Pilot Data Store" to "Support BIOMASS dataset in Zarr Data Store" on Apr 29, 2021
forman self-assigned this on Apr 29, 2021
@forman
Member

forman commented Apr 30, 2021

Should be fixed in cate 3.0 by xcube-dev/xcube#442

pont-us closed this as completed on May 17, 2021
TonioF reopened this on Jun 21, 2021
AliceBalfanz assigned TonioF and unassigned forman on Jun 28, 2021
@TonioF
Contributor Author

TonioF commented Jul 19, 2021

Viewing the dataset results in a DeveloperError: Width must be less than or equal to the maximum texture size (16384). Check maximumTextureSize. This error is probably caused by the massive size of the dataset (157500 * 405000).
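
To get a feel for the gap between the dataset and the texture limit, here is a rough calculation of the smallest integer stride that would bring both dimensions under 16384. It assumes the 405000 dimension is the image width and that the whole dataset is shown as a single texture; how the viewer actually tiles or subsamples the data is not described in this issue, and `min_stride` is a hypothetical helper.

```python
import math

MAX_TEXTURE_SIZE = 16384        # limit quoted in the DeveloperError
height, width = 157500, 405000  # dataset size quoted in the comment above


def min_stride(size: int, limit: int = MAX_TEXTURE_SIZE) -> int:
    """Smallest integer subsampling step so that ceil(size / step) <= limit."""
    return math.ceil(size / limit)


# One common stride for both axes keeps the pixel aspect ratio unchanged.
stride = max(min_stride(width), min_stride(height))
reduced_width = math.ceil(width / stride)
reduced_height = math.ceil(height / stride)

print(stride, reduced_width, reduced_height)
```

Under these assumptions the width is the limiting dimension, and every 25th pixel (or a 25-fold coarser resolution level) would be needed for a single texture.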

@AliceBalfanz
Contributor

AliceBalfanz commented Jul 20, 2021

This comment is invalid due to wrong url:

I see different errors:
All three approaches result in the same error message (using zarr, xarray and xcube) with anonymous access:

ClientConnectorError: Cannot connect to host cci-ke-o.s3.jc.rl.ac.uk:80 ssl:default [Connect call failed ('172.17.2.151', 80)]

Or is that cube not publicly accessible yet?

@AliceBalfanz
Contributor

When opening BIOMASS with the newest xcube, open_dataset works fine:

from xcube.core.dsio import open_dataset, open_cube
ds = open_dataset("https://cci-ke-o.s3-ext.jc.rl.ac.uk:8443/esacci/ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED-1978-2020-fv05.3.zarr", s3_kwargs=dict(anon=True))

image

When opening it with open_cube an error occurs:

image

@forman
Copy link
Member

forman commented Mar 28, 2022

image
