Accessing and plotting QC metrics after merging to AnnDataSet: more documentation needed #300

Open
jeremymsimon opened this issue Apr 26, 2024 · 1 comment
Labels
documentation 📖 Improvements or additions to documentation

Comments

@jeremymsimon

Hi @kaizhang -
This is mainly a request for more documentation: I didn't see this explicitly covered anywhere (though perhaps I missed it), and it is likely to be useful to others.

I've followed your various vignettes for processing and integrating multiple objects, eventually resulting in an AnnDataSet object. I discovered that the QC metrics calculated at the beginning of the pipeline (e.g. n_fragment, frac_dup) are retained after merging the AnnData objects into the AnnDataSet, and are accessible under dataset.adatas.obs:

>>> dataset.adatas
Stacked AnnData objects:
    obs: 'n_fragment', 'frac_dup', 'frac_mito', 'tsse', 'doublet_probability', 'doublet_score'
    obsm: 'fragment_paired'

>>> dataset.adatas.obs['n_fragment']
shape: (67_815,)
Series: 'n_fragment' [u64]
[
	9487
	19803
	9662
	11500
	3783
	…
	11122
	5495
	5248
	7645
	3948
]

I can use this to plot a UMAP colored by these variables with:

snap.pl.umap(dataset, color=dataset.adatas.obs['n_fragment'], interactive=False)
snap.pl.umap(dataset, color=dataset.adatas.obs['tsse'], interactive=False)
# etc

However, it's worth noting that I can't directly specify color='n_fragment'; doing so raises RuntimeError: not found: n_fragment, since these columns are not in dataset.obs itself.
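One possible workaround, sketched here under the assumption that dataset.obs accepts column assignment and that the AnnDataSet was opened in a writable mode, is to copy the stacked columns into dataset.obs so that they can be referenced by name. The helper name copy_qc_to_obs is hypothetical and is not part of SnapATAC2.

```python
def copy_qc_to_obs(dataset, columns):
    """Copy per-cell columns from dataset.adatas.obs into dataset.obs.

    Assumption: `dataset.obs[name] = ...` is supported and the object is
    writable. `copy_qc_to_obs` is a hypothetical helper, not a SnapATAC2 API.
    """
    copied = []
    for name in columns:
        # dataset.adatas.obs[name] is the stacked per-cell column shown above.
        dataset.obs[name] = dataset.adatas.obs[name]
        copied.append(name)
    return copied


# Intended usage (not executed here):
#   copy_qc_to_obs(dataset, ["n_fragment", "tsse"])
#   snap.pl.umap(dataset, color="n_fragment", interactive=False)
```

If the assignment is not supported in a given version, passing dataset.adatas.obs['n_fragment'] directly as color, as in the calls above, remains the route known to work.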

Similarly, I would like to make a violin plot of these variables, grouped by cluster. It seems that scanpy.pl.violin doesn't accept an AnnDataSet as input, so I'm wondering whether this is possible without converting the object back to AnnData format. Or perhaps you plan to implement a violin plot function of your own?

>>> sc.pl.violin(dataset, dataset.adatas.obs['tsse'], groupby='leiden_final')
AttributeError: 'builtins.AnnDataSet' object has no attribute '_sanitize'
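In the meantime, a grouped violin plot can probably be drawn without scanpy by pulling the two columns out and passing them to matplotlib directly. The sketch below assumes that dataset.adatas.obs['tsse'] and dataset.obs['leiden_final'] are equal-length, iterable per-cell columns in the same cell order; group_by_cluster and violin_by_cluster are hypothetical helper names, not SnapATAC2 or scanpy functions.

```python
from collections import defaultdict


def group_by_cluster(values, clusters):
    """Group per-cell metric values by cluster label."""
    if len(values) != len(clusters):
        raise ValueError("values and clusters must have the same length")
    groups = defaultdict(list)
    for value, cluster in zip(values, clusters):
        groups[str(cluster)].append(float(value))
    # Labels are sorted as strings, so "10" sorts before "2".
    return dict(sorted(groups.items()))


def violin_by_cluster(values, clusters, ylabel):
    """Draw one violin per cluster with matplotlib and return the figure."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    groups = group_by_cluster(values, clusters)
    fig, ax = plt.subplots()
    ax.violinplot(list(groups.values()), showmedians=True)
    # violinplot places the violins at x = 1..n by default.
    ax.set_xticks(range(1, len(groups) + 1))
    ax.set_xticklabels(list(groups.keys()))
    ax.set_xlabel("cluster")
    ax.set_ylabel(ylabel)
    return fig


# Intended usage (not executed here):
#   fig = violin_by_cluster(dataset.adatas.obs["tsse"],
#                           dataset.obs["leiden_final"], "tsse")
#   fig.savefig("tsse_by_cluster.png")
```

This sidesteps the _sanitize call entirely because no AnnData-like object is handed to the plotting function, only the two columns.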

Thanks!

@kaizhang kaizhang added the documentation 📖 Improvements or additions to documentation label May 7, 2024
@kaizhang
Owner

kaizhang commented May 7, 2024

Thanks for reporting. We'll add more documentation regarding this shortly.


2 participants