NNMF-for-scRNAseq

Non-negative matrix factorization (NMF) for oligodendrocyte-lineage scRNA-seq data

Analysis Pipeline for scRNA-seq Data Using NMF

This repository contains an R script (nnMF.R) for analyzing single-cell RNA sequencing (scRNA-seq) data using non-negative matrix factorization (NMF) with the Seurat and RcppML packages. The steps below describe how to reproduce the analysis.

Steps to Reproduce

Load Required Libraries
- Seurat: library(Seurat)
- dplyr: library(dplyr)
- RcppML: library(RcppML)
Set Seed for Reproducibility
- set.seed(200)
Load Seurat Object
- Replace /data/nasser/Manuscript/processedobject/ODC35_woClus8_subclust3_res0.15_NK with your own path.
- ol <- readRDS("/data/nasser/Manuscript/processedobject/ODC35_woClus8_subclust3_res0.15_NK")
Set Cell Type Identities
- Idents(ol) <- "CellType"
Subset the Data
- Keep only the cell types of interest (iODC, iOPC, iPPC_0, iPPC_1, iPPC_2).
- If subsetting by iCEP is required, uncomment and modify the corresponding line in the script.
- pd <- subset(ol, idents = c("iODC", "iOPC", "iPPC_0", "iPPC_1", "iPPC_2"))
Optional: Visualize Using DimPlot
- Uncomment DimPlot(pd) to visualize the subset data.
Clean the Cluster (Optional)
- Keep only the cells whose UMAP coordinates fall inside the region of interest (umap1 > -2 & umap2 > 1).
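
The coordinate filter above can be sketched on a toy embedding matrix. The matrix `emb`, its column names, and its values are assumptions made for illustration only. With a Seurat object, the coordinates would typically come from `Embeddings(pd, reduction = "umap")`, and the retained cell names could be passed to `subset(pd, cells = keep)`.

```r
# Toy UMAP embedding: one row per cell, two columns (hypothetical values)
emb <- matrix(
  c(-3.0, 2.0,
     0.5, 1.5,
     1.0, 0.2,
    -1.0, 4.0),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("umap_1", "umap_2"))
)

# Keep cells with umap1 > -2 and umap2 > 1
keep <- rownames(emb)[emb[, "umap_1"] > -2 & emb[, "umap_2"] > 1]
print(keep)
```

Only the cells that satisfy both conditions are kept; the thresholds should be adjusted to the layout of your own UMAP.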
Extract Expression Matrix
- Extract the normalized RNA expression matrix (the "data" layer) from the subsetted Seurat object. LayerData requires Seurat v5 or later.
- expression_matrix <- LayerData(object = pd, assay = "RNA", layer = "data")
- Remove genes (rows) that contain NA or null values, and genes whose expression is zero in every cell.
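
The row filtering is described only in words here, so the following is a minimal sketch of one way to do it, not necessarily how nnMF.R does it. It uses a small dense toy matrix with made-up values. For the sparse matrix returned by `LayerData()`, the `rowSums()` method from the Matrix package would be needed instead of the base one.

```r
# Toy expression matrix: genes in rows, cells in columns (hypothetical values)
expression_matrix <- matrix(
  c(1, 0, 2,
    0, 0, 0,
    3, NA, 1,
    0, 4, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Drop genes that contain any NA value
has_na <- rowSums(is.na(expression_matrix)) > 0
expression_matrix <- expression_matrix[!has_na, , drop = FALSE]

# Drop genes that are zero in every cell
all_zero <- rowSums(expression_matrix != 0) == 0
expression_matrix <- expression_matrix[!all_zero, , drop = FALSE]

print(rownames(expression_matrix))
```

Using `drop = FALSE` keeps the result a matrix even if only one gene survives the filter.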
Set Number of Clusters
- Set the NMF rank (k, the number of factors) to the number of unique cell types in the subset.
- num_clusters <- length(unique(pd$CellType))
Perform NMF
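- Factorize the expression matrix with RcppML's nmf(), using rank k = num_clusters, a convergence tolerance of 1e-4, and at most 500 iterations.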
- nmf_result <- nmf(expression_matrix, k = num_clusters, tol = 1e-4, maxit = 500)
Extract Basis (W) and Coefficient (H) Matrices
- Retrieve the basis matrix W (genes x k) and the coefficient matrix H (k x cells) from the NMF result.
- W <- nmf_result$w
- H <- nmf_result$h
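
To make the roles of W and H concrete, here is a base-R sketch of NMF on a toy matrix using the classic multiplicative update rules. This is not the solver that RcppML's `nmf()` uses; it is a swapped-in textbook method that only illustrates the factorization A ≈ W H. The toy data, the rank `k = 2`, and the iteration count are arbitrary assumptions.

```r
# Toy non-negative matrix: 6 genes x 8 cells (hypothetical data)
set.seed(200)
A <- matrix(runif(6 * 8), nrow = 6,
            dimnames = list(paste0("gene", 1:6), paste0("cell", 1:8)))
k <- 2  # number of factors (arbitrary for this sketch)

# Random non-negative starting values
W <- matrix(runif(nrow(A) * k), nrow = nrow(A),
            dimnames = list(rownames(A), NULL))   # genes x k
H <- matrix(runif(k * ncol(A)), nrow = k,
            dimnames = list(NULL, colnames(A)))   # k x cells

err_start <- norm(A - W %*% H, type = "F")

# Multiplicative updates; eps guards against division by zero
eps <- 1e-9
for (i in seq_len(200)) {
  H <- H * (t(W) %*% A) / (t(W) %*% W %*% H + eps)
  W <- W * (A %*% t(H)) / (W %*% H %*% t(H) + eps)
}

err_end <- norm(A - W %*% H, type = "F")
cat("Reconstruction error before:", err_start, "after:", err_end, "\n")
```

Each column of W holds gene weights for one factor, and each row of H holds that factor's loading across cells.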
Identify Most Influential Genes
- Determine the top influential genes for each cluster (NMF factor).
- Store the results in the influential_genes list.
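
One plausible way to pick influential genes is to rank the genes in each column of W by weight and keep the top few. The sketch below assumes that W carries gene names as row names; the toy values, the factor names, and `top_n` are hypothetical and may differ from what nnMF.R does.

```r
# Toy basis matrix W: genes in rows, NMF factors in columns (hypothetical values)
W <- matrix(
  c(0.9, 0.1,
    0.2, 0.8,
    0.7, 0.0,
    0.1, 0.5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("factor1", "factor2"))
)

top_n <- 2  # number of genes to keep per factor (arbitrary choice)

# For each factor, sort genes by decreasing weight and keep the first top_n names
influential_genes <- lapply(colnames(W), function(f) {
  names(sort(W[, f], decreasing = TRUE))[seq_len(top_n)]
})
names(influential_genes) <- colnames(W)

print(influential_genes)
```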
Convert Results to Data Frame
- Convert the list of influential genes to a tidy data frame (influential_genes_df).
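
A list of per-cluster gene vectors can be flattened into a long data frame with one row per cluster and gene. The sketch below uses base R only; the column names `cluster`, `rank`, and `gene` and the toy list are assumptions, not necessarily the layout produced by the script.

```r
# Toy list of influential genes per factor (hypothetical values)
influential_genes <- list(
  factor1 = c("geneA", "geneC"),
  factor2 = c("geneB", "geneD")
)

# One row per (cluster, gene) pair, with the gene's rank within its cluster
influential_genes_df <- data.frame(
  cluster = rep(names(influential_genes), lengths(influential_genes)),
  rank    = unlist(lapply(influential_genes, seq_along), use.names = FALSE),
  gene    = unlist(influential_genes, use.names = FALSE),
  stringsAsFactors = FALSE
)

print(influential_genes_df)
```

The long format makes it straightforward to filter or group by cluster, for example with dplyr.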
Example Visualization
- Generate feature plots for each cluster using its influential genes.

Notes

Adjust paths (readRDS) and parameters (subset, nmf) based on your specific data and analysis requirements.
Ensure all necessary libraries (Seurat, dplyr, RcppML) are installed and loaded before running the script.

Repository: mode1990/NNMF-for-scRNAseq
