Refactor #10

Merged
merged 104 commits into from
May 1, 2024


BeGeiger
Collaborator

@BeGeiger BeGeiger commented May 1, 2024

Refactor Salamander using the AnnData and MuData data structures.

Plan:
- make Salamander compatible with anndata
- reimplement NMF models one at a time

Remove all algorithms except KL-NMF. The goal is to avoid broken software during the refactoring.
The plotting module is now implemented around AnnData objects.

Computation and plotting are separated into the tools and plot modules.

One major change is that samples now correspond to rows instead of columns of the mutation count matrix.
KL-NMF is not refactored yet.
Samples now correspond to rows. The variable names are adapted to fit the new plotting module.
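The change of orientation can be illustrated with a minimal NumPy sketch. The array names and toy counts below are illustrative assumptions and are not taken from the Salamander code base; the point is only that, with samples as rows, the factorization reads counts ≈ exposures @ signatures.

```python
import numpy as np

# Toy mutation counts in the old layout: rows are mutation types, columns are samples.
counts_old = np.array(
    [
        [12, 0, 3],
        [5, 7, 1],
        [0, 2, 9],
        [4, 4, 4],
    ]
)  # shape (n_features=4, n_samples=3)

# New layout: samples are rows, matching the AnnData convention (obs x var).
counts = counts_old.T  # shape (n_samples=3, n_features=4)

n_samples, n_features = counts.shape
n_signatures = 2  # hypothetical number of signatures

# Random non-negative factors, only to show the shapes involved.
rng = np.random.default_rng(0)
exposures = rng.random((n_samples, n_signatures))
signatures = rng.random((n_signatures, n_features))
signatures /= signatures.sum(axis=1, keepdims=True)  # each signature row sums to 1

# With samples as rows, the reconstruction has the same shape as the counts.
reconstruction = exposures @ signatures
```

In this orientation each row of `exposures` belongs to one sample and each row of `signatures` is one signature over the mutation types.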
Function names now include the names of the input data type if there are multiple versions, e.g., an implementation for numpy arrays and a wrapper around this implementation for AnnData inputs.
n_components is now an explicit argument of the dimensionality reduction.
The motivation was to improve the transparency of the code.
A joint dimensionality reduction of multidimensional observation annotations of multiple anndata objects is useful for models with shared embedding spaces, e.g. (multimodal) correlated NMF.
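As a rough sketch of what a joint reduction with an explicit `n_components` could look like, the example below stacks several embedding arrays, fits one shared PCA via SVD, and splits the result back. PCA is chosen here only for illustration; the function name `joint_pca` and the plain NumPy arrays standing in for `.obsm` entries are assumptions, not the PR's actual API.

```python
import numpy as np


def joint_pca(embeddings: list[np.ndarray], n_components: int) -> list[np.ndarray]:
    """Reduce several (n_obs_i, n_dims) arrays with one shared PCA (hypothetical helper)."""
    sizes = [emb.shape[0] for emb in embeddings]
    stacked = np.vstack(embeddings)
    centered = stacked - stacked.mean(axis=0)
    # The right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    # Split the shared projection back into one block per input array.
    split_points = np.cumsum(sizes)[:-1]
    return np.split(reduced, split_points, axis=0)


# Stand-ins for multidimensional annotations of two objects, e.g. sample and
# signature embeddings that live in the same embedding space.
rng = np.random.default_rng(0)
sample_embeddings = rng.normal(size=(5, 4))
signature_embeddings = rng.normal(size=(3, 4))

reduced_samples, reduced_signatures = joint_pca(
    [sample_embeddings, signature_embeddings], n_components=2
)
```

Fitting the projection on the stacked data keeps both groups of points in one coordinate system, which is what makes them comparable in a single plot.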
NMF algorithms are now built around AnnData: both the input data and the signatures are AnnData objects.
The .fit() method is now more abstract and implemented in the root class SignatureNMF.
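One way to read this is as a template-method pattern: the root class owns the iteration loop and subclasses supply the model-specific steps. The sketch below is a guess at that structure. Only the names `SignatureNMF` and `fit` appear in the PR text; the hooks `_initialize`, `_update_parameters`, `objective_function`, the constructor arguments, and the toy subclass are hypothetical.

```python
from abc import ABC, abstractmethod


class SignatureNMF(ABC):
    """Hypothetical skeleton; only the class name and fit() come from the PR text."""

    def __init__(self, n_signatures: int, max_iterations: int = 100, tol: float = 1e-7):
        self.n_signatures = n_signatures
        self.max_iterations = max_iterations
        self.tol = tol
        self.history: list[float] = []

    @abstractmethod
    def _initialize(self, data) -> None: ...

    @abstractmethod
    def _update_parameters(self) -> None: ...

    @abstractmethod
    def objective_function(self) -> float: ...

    def fit(self, data):
        """Generic loop: initialize, then update until the objective stalls."""
        self._initialize(data)
        previous = self.objective_function()
        self.history = [previous]
        for _ in range(self.max_iterations):
            self._update_parameters()
            current = self.objective_function()
            self.history.append(current)
            # Stop once the relative change of the objective is below tol.
            if abs(previous - current) <= self.tol * max(abs(previous), 1.0):
                break
            previous = current
        return self


class HalvingModel(SignatureNMF):
    """Toy subclass that halves a number, used only to exercise fit()."""

    def _initialize(self, data) -> None:
        self.value = float(data)

    def _update_parameters(self) -> None:
        self.value /= 2

    def objective_function(self) -> float:
        return self.value


model = HalvingModel(n_signatures=1, max_iterations=5).fit(8.0)
```

With this split, a new algorithm only needs to implement the three hooks, and the stopping logic lives in a single place.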
no longer support
The KL-NMF implementation now only supports the faster joint update rules.
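For readers unfamiliar with the term, the sketch below shows one plausible form of a joint multiplicative update for KL-NMF in the row-oriented layout (counts ≈ exposures @ signatures). It assumes that "joint" means both factors are updated from the same ratio X / (E S), computed once per iteration, with the signatures renormalized to sum to one. This is an assumption about the technique and not a copy of the Salamander implementation; all names are illustrative.

```python
import numpy as np

EPS = 1e-16


def kl_divergence(X: np.ndarray, E: np.ndarray, S: np.ndarray) -> float:
    """Generalized KL divergence D(X || E @ S)."""
    R = E @ S
    mask = X > 0
    return float(np.sum(X[mask] * np.log(X[mask] / R[mask])) - X.sum() + R.sum())


def joint_update(X: np.ndarray, E: np.ndarray, S: np.ndarray):
    """One multiplicative step; assumes every row of S sums to 1 on input."""
    # The ratio is computed once and shared by both updates.
    ratio = X / np.clip(E @ S, EPS, None)
    # Exposure update; its denominator S.sum(axis=1) equals 1 by assumption.
    E_new = np.clip(E * (ratio @ S.T), EPS, None)
    # Signature update from the old exposures, followed by renormalization.
    S_new = np.clip(S * (E.T @ ratio), EPS, None)
    S_new /= S_new.sum(axis=1, keepdims=True)
    return E_new, S_new


rng = np.random.default_rng(0)
X = rng.poisson(lam=20, size=(6, 8)).astype(float)  # 6 samples, 8 mutation types
E = rng.random((6, 2)) + 0.1
S = rng.random((2, 8)) + 0.1
S /= S.sum(axis=1, keepdims=True)

for _ in range(50):
    E, S = joint_update(X, E, S)

final_objective = kl_divergence(X, E, S)
```

Sharing the ratio saves one matrix product per iteration compared with alternating updates, at the cost of the monotonicity guarantee that the alternating rules have.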
Update the KL-NMF tests to the AnnData-based model structure.
An implementation of minimum-volume NMF based on the SignatureNMF structure with mutation counts and signatures as anndata objects.
mypy complains unless the np.argmax return values are explicitly cast to integers.
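A small illustration of the cast, assuming a function annotated to return a built-in `int`; the function name is hypothetical. `np.argmax` returns a NumPy integer scalar (`np.intp`), so wrapping it in `int(...)` makes the return value match the annotation.

```python
import numpy as np


def index_of_largest(values: np.ndarray) -> int:
    """Return the position of the largest entry as a built-in int."""
    # np.argmax yields np.intp; the explicit cast matches the annotated return type.
    return int(np.argmax(values))


best = index_of_largest(np.array([1.0, 5.0, 2.0]))
```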
The function 'get_obs_order' is necessary for the multimodal exposure plot and should therefore not be private.
The new implementation of multimodal correlated NMF doesn't use the 'add_penalty' parameter.
This reverts commit 5081d62.
The scatter and embedding plot now have color and zorder options. This is useful for plotting multiple subgroups of points, e.g., the signature and sample embeddings in correlated NMF.
Add functions to plot .obs and .obsm attributes of multiple AnnData and MuData objects.
Add color and zorder options to the embedding plot.
I also removed the abstraction of the embedding plot in signature_nmf to improve the readability of the code.
Implemented modality-specific updates to improve the indentation and readability.
@BeGeiger BeGeiger merged commit b570218 into main May 1, 2024
9 checks passed