Understanding the barcodes command #215

Open
Thapeachydude opened this issue Dec 18, 2023 · 1 comment

Comments

@Thapeachydude

Hi,

I'm trying to demultiplex a series of pools (9-11 donors) with known genotypes based on WES and WGS data. Some of the pools work really well, while others struggle a bit. One issue I see is that the doublet rate in some of the pools is relatively high (there are quite obvious cell blobs in the middle of the UMAP). This seems to "confuse" souporcell during clustering, as some SNP clusters are assigned only to these blobs.

Since these are quite obvious doublets, I would like to remove them before running souporcell.
Is there a way to "subset" the barcodes souporcell runs on? That is, does the provided barcodes.tsv file already restrict which barcodes are processed, or would I have to manually subset the BAM file before providing it to souporcell?

Cheers and many thanks,
M

@wheaton5
Owner

Yeah, just make a new barcodes.tsv and it will only run on those.
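
For anyone wanting a starting point, here is a minimal sketch of building such a filtered file. It assumes the suspected doublets were already flagged by some other tool and saved as a plain text file with one barcode per line. The helper names (`read_barcodes`, `write_filtered_barcodes`) and any file names are hypothetical and not part of souporcell.

```python
import gzip
from pathlib import Path


def read_barcodes(path):
    """Read one barcode per line from a plain or gzipped text file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Skip blank lines and strip trailing newlines/whitespace.
        return [line.strip() for line in handle if line.strip()]


def write_filtered_barcodes(all_path, doublet_path, out_path):
    """Write every barcode from all_path that is not listed in doublet_path.

    Barcodes are compared as exact strings, so both files must use the
    same format (for example, both with or both without a "-1" suffix).
    Returns the list of barcodes that were kept, in their original order.
    """
    doublets = set(read_barcodes(doublet_path))
    kept = [bc for bc in read_barcodes(all_path) if bc not in doublets]
    Path(out_path).write_text("".join(bc + "\n" for bc in kept))
    return kept
```

Calling something like `write_filtered_barcodes("barcodes.tsv", "doublets.txt", "barcodes_filtered.tsv")` and then giving the new file to souporcell in place of the original barcodes.tsv should restrict the run to the kept barcodes, without subsetting the BAM.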
