ZeroDivisionError specific to snapatac2 pipeline or incompatibility issue with scvi-tools? #338

yojetsharma · 2024-09-10T05:14:07Z

query= snap.pp.make_gene_matrix(atac, snap.genome.hg38)
query
AnnData object with n_obs × n_vars = 58534 × 60606
    obs: 'sample', 'leiden'
reference=snap.read("GEX.h5ad", backed=None)
AnnData object with n_obs × n_vars = 187285 × 2000
    obs: 'sample', 'cell_type'
    var: 'highly_variable'
query.obs['cell_type']=pd.NA
data = ad.concat(
    [reference, query],
    join='inner',
    label='batch',
    keys=["reference", "query"],
    index_unique='_',
)
data
AnnData object with n_obs × n_vars = 245819 × 1397
    obs: 'sample', 'cell_type', 'batch'
sc.pp.filter_genes(data, min_cells=5)
sc.pp.highly_variable_genes(
    data,
    n_top_genes = 3000,
    flavor="seurat_v3",
    batch_key="batch",
    subset=True
)
scvi.model.SCVI.setup_anndata(data, batch_key="batch")
vae = scvi.model.SCVI(
    data,
    n_layers=2,
    n_latent=30,
    gene_likelihood="nb",
    dispersion="gene-batch",
)
vae.train(max_epochs=1000, early_stopping=True)
INFO: GPU available: True (cuda), used: True
2024-09-10 00:49:45 - INFO - GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
2024-09-10 00:49:45 - INFO - TPU available: False, using: 0 TPU cores
INFO: IPU available: False, using: 0 IPUs
2024-09-10 00:49:45 - INFO - IPU available: False, using: 0 IPUs
INFO: HPU available: False, using: 0 HPUs
2024-09-10 00:49:45 - INFO - HPU available: False, using: 0 HPUs
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
2024-09-10 00:49:45 - INFO - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/home/user/miniconda3/envs/scvi-env/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=31` in the `DataLoader` to improve performance.
/home/user/miniconda3/envs/scvi-env/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=31` in the `DataLoader` to improve performance.

Epoch 984/1000:  98%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉  | 984/1000 [1:50:50<01:48,  6.76s/it, v_num=1, train_loss_step=399, train_loss_epoch=428]
Monitored metric elbo_validation did not improve in the last 45 records. Best score: 425.723. Signaling Trainer to stop.

ax = vae.history['elbo_train'][1:].plot()
vae.history['elbo_validation'].plot(ax=ax)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[22], line 2
      1 ax = vae.history['elbo_train'][1:].plot()
----> 2 vae.history['elbo_validation'].plot(ax=ax)

File ~/.local/lib/python3.10/site-packages/pandas/plotting/_core.py:1000, in PlotAccessor.__call__(self, *args, **kwargs)
    997             label_name = label_kw or data.columns
    998             data.columns = label_name
-> 1000 return plot_backend.plot(data, kind=kind, **kwargs)

File ~/.local/lib/python3.10/site-packages/pandas/plotting/_matplotlib/__init__.py:71, in plot(data, kind, **kwargs)
     69         kwargs["ax"] = getattr(ax, "left_ax", ax)
     70 plot_obj = PLOT_CLASSES[kind](data, **kwargs)
---> 71 plot_obj.generate()
     72 plot_obj.draw()
     73 return plot_obj.result

File ~/.local/lib/python3.10/site-packages/pandas/plotting/_matplotlib/core.py:454, in MPLPlot.generate(self)
    452 self._make_plot()
    453 self._add_table()
--> 454 self._make_legend()
    455 self._adorn_subplots()
    457 for ax in self.axes:

File ~/.local/lib/python3.10/site-packages/pandas/plotting/_matplotlib/core.py:792, in MPLPlot._make_legend(self)
    790     title = leg.get_title().get_text()
    791     # Replace leg.LegendHandles because it misses marker info
--> 792     handles = leg.legendHandles
    793     labels = [x.get_text() for x in leg.get_texts()]
    795 if self.legend:

AttributeError: 'Legend' object has no attribute 'legendHandles'

data.obs["celltype_scanvi"] = 'Unknown'
ref_idx = data.obs['batch'] == "reference"
data.obs["celltype_scanvi"][ref_idx] = data.obs['cell_type'][ref_idx]
/tmp/ipykernel_2619671/134013430.py:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data.obs["celltype_scanvi"][ref_idx] = data.obs['cell_type'][ref_idx]

lvae = scvi.model.SCANVI.from_scvi_model(
    vae,
    adata=data,
    labels_key="celltype_scanvi",
    unlabeled_category="Unknown",
)

File ~/miniconda3/envs/scvi-env/lib/python3.10/site-packages/scvi/module/_scanvae.py:170, in SCANVAE.__init__(self, n_input, n_batch, n_labels, n_hidden, n_latent, n_layers, n_continuous_cov, n_cats_per_cov, dropout_rate, dispersion, log_variational, gene_likelihood, y_prior, labels_groups, use_labels_groups, linear_classifier, classifier_parameters, use_batch_norm, use_layer_norm, **vae_kwargs)
    147 self.encoder_z2_z1 = Encoder(
    148     n_latent,
    149     n_latent,
   (...)
    156     return_dist=True,
    157 )
    159 self.decoder_z1_z2 = Decoder(
    160     n_latent,
    161     n_latent,
   (...)
    166     use_layer_norm=use_layer_norm_decoder,
    167 )
    169 self.y_prior = torch.nn.Parameter(
--> 170     y_prior if y_prior is not None else (1 / n_labels) * torch.ones(1, n_labels),
    171     requires_grad=False,
    172 )
    173 self.use_labels_groups = use_labels_groups
    174 self.labels_groups = np.array(labels_groups) if labels_groups is not None else None

ZeroDivisionError: division by zero

I posted this issue on the scvi-tools forum and got the following response:

Hi, we have never tested transferring to gene activation scores and it doesn’t sound right to me. However, to fix your issue: you are setting all cell types to None. The way you then update them with the cell types is incorrect and it won’t update your matrix (see the pandas warning). You should do:

`data.obs.loc[ref_idx, "celltype_scanvi"] = data.obs.loc[ref_idx, 'cell_type']`
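
For illustration, here is a minimal sketch of that `.loc` assignment on a toy table. The `obs` DataFrame, its index, and its values are hypothetical stand-ins for `data.obs` after `ad.concat`; only the column names and the assignment pattern come from this thread.

```python
import pandas as pd

# Hypothetical stand-in for data.obs (values invented for illustration)
obs = pd.DataFrame(
    {
        "batch": ["reference", "reference", "query", "query"],
        "cell_type": ["T cell", "B cell", None, None],
    },
    index=["r1_reference", "r2_reference", "q1_query", "q2_query"],
)

# Start with every cell unlabeled
obs["celltype_scanvi"] = "Unknown"

# Boolean mask selecting the reference cells
ref_idx = obs["batch"] == "reference"

# Single .loc call: rows and column are selected in one indexing step,
# so the assignment writes into obs itself rather than a temporary copy
obs.loc[ref_idx, "celltype_scanvi"] = obs.loc[ref_idx, "cell_type"]

# Check that the reference labels were actually copied over
print(obs["celltype_scanvi"].value_counts())
```

If the counts still show only `Unknown` on the real data, the reference labels did not make it into the column.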

emidalla · 2024-09-10T12:55:51Z

I have the same issue. Using the whole reference (i.e. scRNA-seq data from four donors), everything worked. When I subset the reference to use only data from the donor of the scATAC-seq data, I got the 'division by zero' error.

yojetsharma · 2024-09-10T13:12:41Z

It gives this error if you subset it and not otherwise? Because I have been subsetting the data in all the runs. I will try without subsetting and update.

emidalla · 2024-09-10T13:13:53Z

> It gives this error if you subset it and not otherwise? Because I have been subsetting the data in all the runs. I will try without subsetting and update.

Exactly, only upon subsetting, i.e. (I think) when you end up with some empty cell group.
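
The traceback divides by `n_labels`, which is consistent with no labeled category being left once the placeholder is excluded. A rough pre-check could catch that before building the SCANVI model. This is only a sketch: `labeled_category_counts` and `check_labels` are hypothetical helpers, not part of scvi-tools, and they assume the labels are passed as a plain list of Python strings (with `None` or float NaN for missing values).

```python
import math
from collections import Counter


def labeled_category_counts(labels, unlabeled_category="Unknown"):
    """Count cells per real label, skipping the placeholder and missing values."""
    counts = Counter()
    for label in labels:
        # Skip missing entries and the "unlabeled" placeholder
        if label is None or label == unlabeled_category:
            continue
        if isinstance(label, float) and math.isnan(label):
            continue
        counts[label] += 1
    return counts


def check_labels(labels, unlabeled_category="Unknown"):
    """Raise a readable error when no labeled category is left."""
    counts = labeled_category_counts(labels, unlabeled_category)
    if not counts:
        raise ValueError(
            "No labeled cells found: every entry is missing or equals "
            f"{unlabeled_category!r}, so the number of labels would be 0."
        )
    return counts


# Toy labels (hypothetical values)
print(check_labels(["T cell", "B cell", "Unknown", None, "T cell"]))
```

Running the check on the label column right after assigning the reference labels would show whether the subset left any labeled cells at all.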
