Use of .obs features in bivariate statistics #152

Open
peterpdu opened this issue Nov 22, 2024 · 3 comments

@peterpdu

Hi,

Is there any plan to add the option to use features in .obs for li.mt.bivariate? For example, if I calculate two gene set scores and want to check their spatial relationship?

peterpdu added the "enhancement" (New feature or request) label on Nov 22, 2024
@dbdimitrov
Collaborator

Hi @peterpdu,

Apologies for the delayed response. You could transfer variables (categorical or otherwise) from .obs to adata.X, but for that you would need to create a new AnnData object. From then on, you can apply the bivariate scores as usual.

Essentially, in relatively realistic pseudocode, what you need to do is:

import pandas as pd
import liana as li
from anndata import AnnData

# Define the new X from your variables of interest: one row per observation
# (adata.n_obs rows) and as many columns as you need.
# Say the columns of interest are in adata.obs:
cols = ['column1', 'column2', 'column3']
X = adata.obs[cols].to_numpy(dtype=float)
var = pd.DataFrame(index=cols)  # the column names become the var_names
obs = adata.obs  # I guess you still might need this, e.g. for visualizations
obsp = adata.obsp  # this is where the spatial connectivities are stored
uns = adata.uns  # you typically need this for the image data
obsm = adata.obsm

adata2 = AnnData(X=X, var=var, obs=obs, obsp=obsp, uns=uns, obsm=obsm)

# Then you just need to provide a list of tuples for your interactions of interest:
li.mt.bivariate(
    adata2,
    # ... other arguments as usual
    interactions=[('column1', 'column2'),
                  ('column2', 'column3')],  # and so on
)


@dbdimitrov
Collaborator

Hope this helps!

@dbdimitrov
Collaborator

PS. For categorical variables (e.g. cell types), you would need to apply dummy (one-hot) encoding, e.g. via pd.get_dummies. You would then need to use a bivariate score that applies the spatial weights before computing correlations, such as Moran's R or the product-based scores.
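
For illustration, the one-hot encoding step could look roughly like the sketch below. It uses a toy DataFrame as a stand-in for adata.obs, and the column name 'cell_type' and its category labels are made up for the example. The resulting X and var could then be passed to a new AnnData object in the same way as the numeric columns above.

```python
import pandas as pd

# Toy stand-in for adata.obs; 'cell_type' and its labels are hypothetical.
obs = pd.DataFrame(
    {"cell_type": pd.Categorical(["T cell", "B cell", "T cell", "Myeloid"])},
    index=["spot1", "spot2", "spot3", "spot4"],
)

# One indicator column per category, cast to float so it can serve as a numeric X.
dummies = pd.get_dummies(obs["cell_type"]).astype(float)

X = dummies.to_numpy()  # shape: (n_obs, n_categories)
# The category labels become the variable names to reference in `interactions`.
var = pd.DataFrame(index=dummies.columns.astype(str))
```

Each row of X contains a single 1 marking that observation's category, so pairs such as ('T cell', 'B cell') can be listed as interactions once the new object is built.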
