(feat): read_lazy for xarray reading + read_elem_as_dask -> read_elem_lazy #1247

Open
wants to merge 330 commits into
base: main
from

Conversation

ilan-gold
Contributor

@ilan-gold ilan-gold commented Nov 30, 2023

This PR is a lighter-weight version of #947 that uses the original AnnData class to hold obs and var as xr.Dataset objects.


codecov bot commented Dec 7, 2023

Codecov Report

Attention: Patch coverage is 90.46455% with 39 lines in your changes missing coverage. Please review.

Project coverage is 84.81%. Comparing base (8e9eb88) to head (9b2d9a3).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/anndata/experimental/backed/_lazy_arrays.py | 87.14% | 9 Missing ⚠️ |
| src/anndata/_core/merge.py | 91.04% | 6 Missing ⚠️ |
| src/anndata/_core/storage.py | 37.50% | 5 Missing ⚠️ |
| src/anndata/_io/specs/lazy_methods.py | 93.54% | 4 Missing ⚠️ |
| src/anndata/experimental/backed/_compat.py | 86.20% | 4 Missing ⚠️ |
| src/anndata/experimental/backed/_io.py | 90.47% | 4 Missing ⚠️ |
| src/anndata/tests/helpers.py | 85.00% | 3 Missing ⚠️ |
| src/anndata/_io/specs/registry.py | 90.90% | 2 Missing ⚠️ |
| src/anndata/_core/aligned_df.py | 80.00% | 1 Missing ⚠️ |
| src/anndata/_core/index.py | 80.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1247      +/-   ##
==========================================
- Coverage   86.99%   84.81%   -2.18%     
==========================================
  Files          40       45       +5     
  Lines        6045     6416     +371     
==========================================
+ Hits         5259     5442     +183     
- Misses        786      974     +188     
| Files with missing lines | Coverage Δ |
|---|---|
| src/anndata/_core/anndata.py | 83.77% <100.00%> (+0.04%) ⬆️ |
| src/anndata/_core/views.py | 85.71% <100.00%> (-5.40%) ⬇️ |
| src/anndata/_io/specs/__init__.py | 100.00% <ø> (ø) |
| src/anndata/_io/zarr.py | 83.75% <100.00%> (+0.20%) ⬆️ |
| src/anndata/_types.py | 85.29% <100.00%> (ø) |
| src/anndata/experimental/__init__.py | 100.00% <100.00%> (ø) |
| src/anndata/experimental/backed/__init__.py | 100.00% <100.00%> (ø) |
| src/anndata/experimental/backed/_xarray.py | 100.00% <100.00%> (ø) |
| src/anndata/_core/aligned_df.py | 96.00% <80.00%> (-1.78%) ⬇️ |
| src/anndata/_core/index.py | 93.37% <80.00%> (+0.04%) ⬆️ |
... and 8 more

... and 4 files with indirect coverage changes

@ilan-gold
Contributor Author

ilan-gold commented Sep 30, 2024

Discussion point: creating dummy indices to speed up first-load time. The real index will simply be stored in xxx_names
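One possible reading of this proposal, as a minimal sketch rather than the PR's actual implementation: the table gets a cheap placeholder index that needs no I/O, and the real labels are fetched only when someone asks for them. `LazyIndexedTable`, `read_obs_names`, and the label values are hypothetical names invented for this illustration.

```python
from typing import Callable, List, Optional, Sequence


class LazyIndexedTable:
    """Hypothetical stand-in for a lazily loaded obs/var table."""

    def __init__(self, n_rows: int, load_names: Callable[[], Sequence[str]]):
        # Placeholder ("dummy") index: building a range needs no I/O.
        self.index = range(n_rows)
        self._load_names = load_names
        self._names: Optional[List[str]] = None  # real labels, not read yet

    @property
    def names(self) -> List[str]:
        # The real index is read on first access and cached afterwards.
        if self._names is None:
            self._names = list(self._load_names())
        return self._names


calls = []


def read_obs_names() -> Sequence[str]:
    # Stands in for an expensive read from disk or a remote store.
    calls.append("read")
    return ["cell_a", "cell_b", "cell_c"]


table = LazyIndexedTable(3, read_obs_names)
first_positions = list(table.index)  # available without touching storage
reads_before = len(calls)
labels = table.names                 # triggers the single read
labels_again = table.names           # served from the cache
```

The trade-off in this sketch is that positional access is immediate while label-based access pays the read cost once, on first use.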

Comment on lines +730 to +737
if any(
    isinstance(el, (sparse.spmatrix, SpArray))
    or (
        isinstance(el, DaskArray)
        and isinstance(el._meta, (sparse.spmatrix, SpArray))
    )
    for el in els
):
Contributor Author

Is this a bug, @ivirshup or @flying-sheep? Does it need its own PR? Before this change, this was spitting out NaNs for dask-sparse.
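To make the condition in the diff easier to reason about in isolation, here is a small sketch of the same predicate using stand-in classes, so it runs without scipy or dask. `FakeSparse`, `FakeDask`, and `any_sparse` are hypothetical names for this illustration only; in the PR the types are `sparse.spmatrix`, `SpArray`, and `DaskArray`, and the only assumption carried over is that a dask array exposes the type of its chunks through `_meta`.

```python
class FakeSparse:
    """Stand-in for sparse.spmatrix / SpArray (hypothetical)."""


class FakeDask:
    """Stand-in for a dask array, which describes its chunk type via `_meta`."""

    def __init__(self, meta):
        self._meta = meta


def any_sparse(els, sparse_types=(FakeSparse,), dask_type=FakeDask):
    # An element counts as sparse if it is sparse itself, or if it is a
    # dask array whose chunks (as described by `_meta`) are sparse.
    return any(
        isinstance(el, sparse_types)
        or (isinstance(el, dask_type) and isinstance(el._meta, sparse_types))
        for el in els
    )


dense_only = any_sparse([[1, 2], FakeDask([0])])
plain_sparse = any_sparse([[1, 2], FakeSparse()])
dask_sparse = any_sparse([[1, 2], FakeDask(FakeSparse())])
```

Without the second clause, the `dask_sparse` case would be treated as dense, which matches the situation the comment describes for dask arrays with sparse chunks.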

@ilan-gold changed the title from "(feat): xarray with experimental backed reading" to "(feat): read_lazy for xarray reading + read_elem_as_dask -> read_lazy_elem" on Oct 1, 2024
@ilan-gold changed the title from "(feat): read_lazy for xarray reading + read_elem_as_dask -> read_lazy_elem" to "(feat): read_lazy for xarray reading + read_elem_as_dask -> read_elem_lazy" on Oct 1, 2024
@ilan-gold
Contributor Author

> Discussion point: creating dummy indices to speed up first-load time. The real index will simply be stored in xxx_names

Done, and tested with anndata.concat.

5 participants