Update saganird_usage.md #132

Open
wants to merge 1 commit into
base: master
11 changes: 9 additions & 2 deletions saganird_usage.md
@@ -33,6 +33,9 @@ On NIRD you have access to two different directories:
* /projects/NS9305K: the main storage directory for the Institute. The main directory to know about is the one named datasets.
* /projects/NS9305K/datasets: This is the main raw data directory on NIRD. This should be the master storage for your sequenced reads.

<!--
The wgs, transcriptomics and metagenomics folder structure within the datasets directory above is not explained.
-->

### Saga

@@ -46,11 +49,11 @@ Saga is our main compute node. Computations should be done using slurm. On Saga

## Working on Saga

Each user should have a directory in the active directory on Saga which is named after your username. This is where each user should do their work. This lets the administrators see how much data each of us is generating. Inside your active you are free to organize your data in any way you please.
Each user should have a directory inside the active directory on Saga, named after their username (/cluster/projects/nn9305k/active/<USERNAME>). This is where you should do your work, and it lets the administrators see how much data each of us is generating. Inside your own directory (under <USERNAME>) you are free to organize your data in any way you please.

All raw sequencing data (fastq files) should live in the datasets directory. To use the data in analysis, there are two different options. You can either use the absolute path to the data itself, or you can softlink the data into a directory inside your active directory.
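The softlink option can be sketched as a small shell helper. This is only an illustration: the function name `link_dataset` and the names `my_dataset` and `my_project` are hypothetical and not part of the guide, while the two base paths are the ones given in this document.

```shell
# link_dataset DATASET_DIR WORK_DIR
# Create a softlink to a dataset inside a working directory,
# so the raw data itself is never copied or moved.
link_dataset() {
    mkdir -p "$2"                          # make sure the working directory exists
    ln -s "$1" "$2/$(basename "$1")"       # link name = dataset directory name
}

# Hypothetical usage (my_dataset and my_project are placeholder names):
# link_dataset /cluster/projects/nn9305k/datasets/my_dataset \
#              /cluster/projects/nn9305k/active/<USERNAME>/my_project
```

Because the link only points at the dataset, removing the link later leaves the data in the datasets directory untouched.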

Once you copy a dataset into the datasets directory, please update the datasets registration file. Include your username under the list of users. This will let us (and you) to see who is working on that particular dataset.
Once you copy sequencing data into the datasets directory (/cluster/projects/nn9305k/datasets), please update the datasets registration file and include your username in the list of users. This lets us (and you) see who is working on that particular dataset.

Please note: all data in the datasets directory should be a _copy_ of what is in the NIRD datasets directory. In case we run out of space in a critical situation, we might end up having to delete some of the raw datasets. Hence the importance of the Saga data being only a copy.

@@ -59,5 +62,9 @@ Please note: all data in the datasets directory should be a _copy_ of what is in

NIRD is for longer-term storage, while Saga should be reserved for data that we are currently working on. This means that we will have to transfer data between Saga and NIRD. When you have finished working on a project, please pack it up (use gzip and tar) and transfer it to your directory on NIRD.
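A minimal sketch of the packing step, assuming GNU tar with gzip support is available on Saga. The function name `pack_project` and the project name `my_project` are hypothetical, and the NIRD host and destination directory are deliberately left as placeholders because this document does not specify them.

```shell
# pack_project PROJECT_DIR ARCHIVE
# Bundle a finished project directory into one gzip-compressed tar archive.
pack_project() {
    # -C switches to the parent directory first, so the archive stores
    # paths relative to the project directory name, not absolute paths.
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Hypothetical usage (names, host and destination are placeholders):
# pack_project /cluster/projects/nn9305k/active/<USERNAME>/my_project my_project.tar.gz
# tar -tzf my_project.tar.gz > /dev/null    # confirm the archive can be read
# rsync -av my_project.tar.gz <USERNAME>@<NIRD_HOST>:<YOUR_NIRD_DIRECTORY>/
```

Listing the archive with `tar -tzf` before transferring is a cheap way to catch a truncated or corrupt file while the original data is still on Saga.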

<!--
Where should the analysed data go on NIRD?
-->