| title | author | date | license |
|---|---|---|---|
| How to run a Pulsar in SURF Research Cloud | hexylena | 2024-11-29 | GPL-2.1 |
This repository provides the Ansible playbook for a Pulsar component on SURF Research Cloud (SRC), and serves as the primary documentation for using this Catalog Item.
- Ensure that you have an SSH key set in your SRAM profile (see the sketch below if you still need to create one).
- Make a note of your username from that page. It is probably of the format `ABBBBBB###`, where `A` is your first initial, `BBBBBB` is your last name, and there is potentially a number at the end.

> **Note:** SSH access is required to reconfigure Galaxy. Please make sure you set an SSH key.
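If you still need an SSH key, here is a minimal sketch of creating one and printing the public part to paste into SRAM; the `ed25519` key type and default path are common choices, not requirements:

```bash
# Generate a new keypair; accept the default path or choose your own.
ssh-keygen -t ed25519 -C "sram-src-access"

# Print the public key and paste it into your SRAM profile.
cat ~/.ssh/id_ed25519.pub
```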
- Log in to SURF Research Cloud.
- In SRC, you should be in a collaborative organisation with a wallet. If you're not, I'm not sure how to fix that; I'm mostly writing this documentation for my colleagues in my CO. You can mostly ignore the top half of the screen; only the bottom half is useful or relevant for us.
- In the Workspaces tab on the bottom half of the screen, you'll find a plus button at the right to add a new workspace.
- Clicking that will let you choose any of the Catalog Items from SRC. They've got a wide selection, but we're only interested in the two Pulsar Catalog Items.

> **Warning:** The GPU nodes are expensive. In fact, that was the motivating reason for building this catalog item: to enable you to launch a node, run some computations, and shut it down, saving you money.
- Creating a "workspace" (VM) from a catalog item (a template) is easy: most of the options are fixed for you, and you just need to choose the size of the item. Pick an appropriate size for whatever computations you need to do.
- Pick a name; it can be anything and does not matter. Check the expiration date to ensure it is just enough time for your computation and no more. Click submit when you are happy.

> **Note:** By default an "Expiration date" of around 3 days later is chosen. This is an incredibly useful feature, as it saves you from forgetting to destroy a VM. Especially for GPU nodes, it helps ensure that they disappear after your computation is complete.

- Once done, the workspace will be created for you. You'll usually need to wait ~5 minutes. Go for a beverage ☕️
- Once the workspace is up, you'll see an Access link.
- Clicking that will show you a Pulsar information page. This page runs on your Pulsar node itself, and is restricted to ensure only authorised members can access its contents. It includes some configuration you will need to copy to your Galaxy node in order to make use of the Pulsar node.
- Collect the requirements for accessing the Galaxy machine. You will need:
  - your username from the first step
  - the SSH key that is associated with your SRAM account
- SSH into your Galaxy machine (not Pulsar!):

  ```bash
  ssh -i path/to/your/sram-ssh-key <your-username>@<your-galaxy-hostname>
  ```
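If you'd rather not pass `-i` every time, a small `~/.ssh/config` entry helps; the alias, username, hostname, and key path below are placeholders to replace with your own values:

```bash
# Append a host alias for the Galaxy machine (values are examples only).
cat >> ~/.ssh/config <<'EOF'
Host src-galaxy
    HostName <your-galaxy-hostname>
    User <your-sram-username>
    IdentityFile ~/.ssh/sram-ssh-key
EOF

# Then connecting is simply:
ssh src-galaxy
```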
- You will need to `sudo su` to do anything useful. Do that now.
- Galaxy configuration is in `/srv/galaxy/` by default.
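Once you are root, it can help to confirm where the job configuration actually lives before editing anything; the `job_conf.yml` filename below is a common default for Ansible-deployed Galaxy servers and is an assumption, so adjust it to whatever your instance uses:

```bash
# Look around the Galaxy configuration directory.
cd /srv/galaxy/config
ls

# The job configuration (runners, environments, tools) typically lives here:
grep -nE "runners:|execution:|tools:" job_conf.yml
```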
The configuration is discussed fully on the Pulsar information page, but it is briefly covered here as well. Generally there are a few steps that must be followed:
- A runner must be registered
- A destination/environment must be added with the Pulsar details
- Some tools should be redirected to this Pulsar

Here is an example of what those changes look like on your Galaxy node (FAQ: how to read a diff). In this example our Pulsar node was called `p20`, but that will be different for you.
```diff
 runners:
   local:
     load: galaxy.jobs.runners.local:LocalJobRunner
     workers: 4
   condor:
     load: galaxy.jobs.runners.condor:CondorJobRunner
+  pulsar:
+    load: galaxy.jobs.runners.pulsar:PulsarRESTJobRunner
 execution:
   default: docker_dispatch
   environments:
     local_destination:
       runner: local
     # ... probably some more environments here.
+    remote_p20:
+      runner: pulsar
+      url: https://p20.src-sensitive-i.src.surf-hosted.nl
+      private_token: ySgfM1rnGIsiVN8XlfkFhTB5kgp7AZm3jDnd
+      dependency_resolution: remote
+      manager: _default_
+      # Uncomment the following to enable interactive tools:
+      docker_enabled: true
+      docker_set_user: null
+      docker_memory: "8G"
+      singularity_enabled: false
+      tmp_dir: true
+      outputs_to_working_directory: false
+      container_resolvers:
+        - type: explicit
+      require_container: True
+      container_monitor_command: /mnt/pulsar/venv/bin/galaxy-container-monitor
+      container_monitor_result: callback
+      container_monitor_get_ip_method: command:echo p20.src-sensitive-i.src.surf-hosted.nl
 tools:
 - class: local # these special tools that aren't parameterized for remote execution - expression tools, upload, etc
   environment: local_env
 - id: Cut1
   environment: condor_1x1
+- id: interactive_tool_jupyter_notebook
+  environment: remote_p20
+- id: interactive_tool_rstudio
+  environment: remote_p20
```
While you will simply copy-paste the runner and environment, you will need to identify for yourself which tools should go to this Pulsar node. If you have already run the tool that needs to go to the GPU node, you can find its ID on the job information page. Otherwise, it can be found in the URL of the tool page, or in the dropdown to the left of "Execute" at the top of the tool form.
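If you prefer the command line over the UI, the Galaxy tool search API can also surface tool IDs; the hostname and search term below are examples, and `jq` is optional, so treat this as a hedged sketch rather than the canonical route:

```bash
# Ask Galaxy for tools matching a search term; the response should include matching tool IDs.
curl -s "https://<your-galaxy-hostname>/api/tools?q=jupyter" | jq .
```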
> **Important:** If you are only running jobs for a limited period of time, you might consider making this Pulsar node the default destination. Remember to use the `remote_...` name of your own Pulsar node, based on what you copied, not `remote_p20`.
```diff
 execution:
-  default: docker_dispatch
+  default: remote_p20
   environments:
     local_destination:
       runner: local
```
With that, you're done, and for the length of time your node is running, your chosen tools (or everything) will be executed on that Pulsar node with more memory and CPU than the Galaxy host, and maybe a GPU as well!
You can edit `/srv/galaxy/config/emc-tool-conf.xml` to add new tool XMLs (e.g. for Teo's Thrombo tool case). Remember to restart the Galaxy processes afterwards: `systemctl restart galaxy-*`
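A quick restart-and-check sequence might look like the following; the exact unit names (for example `galaxy-gunicorn@*`) depend on how Galaxy was deployed here, so treat them as assumptions:

```bash
# Restart all Galaxy-related services (systemctl accepts glob patterns).
systemctl restart 'galaxy-*'

# Confirm they came back up, and watch the logs for errors.
systemctl status 'galaxy-*' --no-pager
journalctl -u 'galaxy-gunicorn@*' -f
```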
License: GPL-2