Include VCF and GZ endings to .gitattributes #231

Open
TCLamnidis opened this issue Dec 18, 2024 · 4 comments

Comments

@TCLamnidis
Member

TCLamnidis commented Dec 18, 2024

With the recent changes to trident to allow gzipped genotype files as well as VCF, the git LFS attributes need to be updated to also apply to files ending in gz and vcf.

This also applies to the other archives.
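
For illustration, the requested change might look like the lines below in `.gitattributes`. This is only a sketch: it assumes the archive tracks files by extension using the attribute line that `git lfs track` writes. The repository's existing patterns are not shown in this thread, so these two entries would be appended to whatever is already there.

```
# Assumed additions; existing LFS patterns in the archive stay unchanged.
# Gzipped genotype files (any file ending in .gz, including .vcf.gz)
*.gz filter=lfs diff=lfs merge=lfs -text
# Uncompressed VCF files
*.vcf filter=lfs diff=lfs merge=lfs -text
```

Running `git lfs track "*.gz" "*.vcf"` in the repository root should produce equivalent entries.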

@nevrome
Member

nevrome commented Dec 18, 2024

Right! For the moment I think we should stick to the established PLINK file format in the community archive, though.

Changing the entire archive to .gz with new package versions would mean duplicating the data at all storage locations, because we keep old versions around.

And also for new packages I would stick to PLINK. Having two different formats in the archive doesn't sound very tidy (or does it?).

@TCLamnidis
Member Author

Yes, I'm not arguing for changing all packages right now, just to have the option set up for the future.

I think eventually offering the gzipped bed files is a good idea to lower network traffic and free up storage on the server, though.

@stschiff
Member

What do you think about going to gzipped bed and bim files from some point onwards, and whenever the genotype data of an existing package gets updated? Or do we want to avoid having mixed formats? I think I would lean towards allowing heterogeneous formats (after all, that's one of the nice features of trident) and allowing gzipped bed and bim files for new or updated packages, while keeping the old unzipped ones.

@nevrome
Member

nevrome commented Dec 20, 2024

OK - maybe that is indeed the best use of the new feature. Then we should also update the submission checklist to recommend .gz uploads.
