The NeSI High Performance Computers #52

Merged 19 commits on Dec 5, 2023
.markdownlint.json (2 changes: 1 addition & 1 deletion)
@@ -2,6 +2,6 @@
"MD013": false,
"MD033": false,
"MD038": false,
"MD046": false
"MD046": false,
"MD041": false
}
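
For reference, MD041 is markdownlint's "first line in a file should be a top-level heading" rule; disabling it is consistent with these pages opening with YAML front matter rather than an H1. A hedged sketch of running the linter locally against this config (assuming the markdownlint-cli npm package is available) might look like:

``` sh
# Lint all Markdown files against the repository's .markdownlint.json
# (markdownlint-cli is an assumption; any markdownlint front end that
# accepts --config should behave the same way)
npx markdownlint-cli "**/*.md" --config .markdownlint.json
```
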
@@ -2,7 +2,8 @@
created_at: '2022-06-13T04:54:38Z'
hidden: false
position: 6
tags: []
tags:
- gpu
title: Available GPUs on NeSI
vote_count: 2
vote_sum: 2
@@ -11,36 +12,22 @@ zendesk_section_id: 360000034335
---


NeSI has a range of Graphical Processing Units (GPUs) to accelerate compute-intensive research and support more analysis at scale.
Depending on the type of GPU, you can access them in different ways, such as via batch scheduler (Slurm), interactively (using [Jupyter on
NeSI](../Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md)),
or Virtual Machines (VMs).

[//]: <> (REMOVE ME IF PAGE VALIDATED)
[//]: <> (vvvvvvvvvvvvvvvvvvvv)
!!! warning
This page has been automatically migrated and may contain formatting errors.
[//]: <> (^^^^^^^^^^^^^^^^^^^^)
[//]: <> (REMOVE ME IF PAGE VALIDATED)
The table below outlines the different types of GPUs,
who can access them and how, and whether they are currently available or on the future roadmap.

NeSI has a range of Graphical Processing Units (GPUs) to accelerate
compute-intensive research and support more analysis at scale. Depending
on the type of GPU, you can access them in different ways, such as via
batch scheduler (Slurm), interactively (using [Jupyter on
NeSI](../../Scientific_Computing/Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md)),
or Virtual Machines (VMs). 

The table below outlines the different types of GPUs, who can access
them and how, and whether they are currently available or on the future
roadmap.

If you have any questions about GPUs on NeSI or the status of anything
listed in the table, [contact
Support](https://support.nesi.org.nz/hc/en-gb/requests/new).


If you have any questions about GPUs on NeSI or the status of anything listed in the table, [contact
Support](mailto:[email protected]).

| GPGPU | Purpose | Location | Access mode | Who can access | Status |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------|----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------|
| 9 NVIDIA Tesla P100 PCIe 12GB cards (1 node with 1 GPU, 4 nodes with 2 GPUs) |   | [Mahuika](../../Scientific_Computing/The_NeSI_High_Performance_Computers/Mahuika.md) | Slurm and [Jupyter](../../Scientific_Computing/Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md) | NeSI users | Currently available |
| 7 NVIDIA A100 PCIe 40GB cards (4 nodes with 1 GPU, 2 nodes with 2 GPUs) | Machine Learning (ML) applications | [Mahuika](../../Scientific_Computing/The_NeSI_High_Performance_Computers/Mahuika.md) | Slurm | NeSI users | Currently available |
| 7 A100-1g.5gb instances (1 NVIDIA A100 PCIe 40GB card divided into [7 MIG GPU slices](https://www.nvidia.com/en-us/technologies/multi-instance-gpu/) with 5GB memory each) | Development and debugging | [Mahuika](../../Scientific_Computing/The_NeSI_High_Performance_Computers/Mahuika.md) | Slurm and [Jupyter](../../Scientific_Computing/Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md) | NeSI users | Currently available |
| 5 NVIDIA Tesla P100 PCIe 12GB (5 nodes with 1 GPU) | Post-processing | [Māui Ancil](https://support.nesi.org.nz/hc/en-gb/articles/360000203776-M%C4%81ui-Ancillary-Nodes) | Slurm | NeSI users | Currently available |
| 4 NVIDIA HGX A100 (4 GPUs per board with 80GB memory each, 16 A100 GPUs in total) | Large-scale Machine Learning (ML) applications | [Mahuika](../../Scientific_Computing/The_NeSI_High_Performance_Computers/Mahuika.md) | Slurm | NeSI users | Available as part of the [Milan Compute Nodes](https://support.nesi.org.nz/knowledge/articles/6367209795471) |
| 4 NVIDIA A40 with 48GB memory each (2 nodes with 2 GPUs, but capacity for 6 additional GPUs already in place) | Teaching / training | Flexible HPC | [Jupyter](../../Scientific_Computing/Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md), VM, or bare metal tenancy possible (flexible) | Open to conversations with groups who could benefit from these | In development. |
| 9 NVIDIA Tesla P100 PCIe 12GB cards (1 node with 1 GPU, 4 nodes with 2 GPUs) |   | [Mahuika](../The_NeSI_High_Performance_Computers/Mahuika.md) | Slurm and [Jupyter](../Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md) | NeSI users | Currently available |
| 7 NVIDIA A100 PCIe 40GB cards (4 nodes with 1 GPU, 2 nodes with 2 GPUs) | Machine Learning (ML) applications | [Mahuika](../The_NeSI_High_Performance_Computers/Mahuika.md) | Slurm | NeSI users | Currently available |
| 7 A100-1g.5gb instances (1 NVIDIA A100 PCIe 40GB card divided into [7 MIG GPU slices](https://www.nvidia.com/en-us/technologies/multi-instance-gpu/) with 5GB memory each) | Development and debugging | [Mahuika](Mahuika.md) | Slurm and [Jupyter](../Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md) | NeSI users | Currently available |
| 5 NVIDIA Tesla P100 PCIe 12GB (5 nodes with 1 GPU) | Post-processing | [Māui Ancil](Maui_Ancillary.md) | Slurm | NeSI users | Currently available |
| 4 NVIDIA HGX A100 (4 GPUs per board with 80GB memory each, 16 A100 GPUs in total) | Large-scale Machine Learning (ML) applications | [Mahuika](Mahuika.md) | Slurm | NeSI users | Available as part of the [Milan Compute Nodes](https://support.nesi.org.nz/knowledge/articles/6367209795471) |
| 4 NVIDIA A40 with 48GB memory each (2 nodes with 2 GPUs, but capacity for 6 additional GPUs already in place) | Teaching / training | Flexible HPC | [Jupyter](../Interactive_computing_using_Jupyter/Jupyter_on_NeSI.md), VM, or bare metal tenancy possible (flexible) | Open to conversations with groups who could benefit from these | In development. |
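
For the rows above where the access mode is Slurm, the batch-script sketch below shows how a single GPU is typically requested. It is a minimal sketch only: the `--gpus-per-node` type string (`P100:1`) and the `CUDA` module name are assumptions that may not match the exact names configured on NeSI.

``` sh
#!/bin/bash -e
#SBATCH --job-name=gpu-check
#SBATCH --time=00:10:00
#SBATCH --mem=4G
# The GPU type string below is an assumption; adjust to the cluster's naming
#SBATCH --gpus-per-node=P100:1

# Load a CUDA toolkit module (module name is illustrative)
module load CUDA

# Report the GPU(s) allocated to this job
nvidia-smi
```
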
@@ -4,7 +4,6 @@ hidden: false
position: 3
tags:
- hpc
- info
- mahuika
- cs400
title: Mahuika
@@ -14,15 +13,6 @@ zendesk_article_id: 360000163575
zendesk_section_id: 360000034335
---



[//]: <> (REMOVE ME IF PAGE VALIDATED)
[//]: <> (vvvvvvvvvvvvvvvvvvvv)
!!! warning
This page has been automatically migrated and may contain formatting errors.
[//]: <> (^^^^^^^^^^^^^^^^^^^^)
[//]: <> (REMOVE ME IF PAGE VALIDATED)

Mahuika is a Cray CS400 cluster featuring Intel Xeon Broadwell nodes,
FDR InfiniBand interconnect, and NVIDIA GPGPUs.

@@ -40,15 +30,15 @@ ssh to these nodes after logging onto the NeSI lander node.

## Notes

1. The Cray Programming Environment on Mahuika differs from that on
    Māui.
2. The `/home, /nesi/project`, and `/nesi/nobackup`
    [filesystems](../../Storage/File_Systems_and_Quotas/NeSI_File_Systems_and_Quotas.md)
    are mounted on Mahuika (a quick check is sketched below).
3. Read about how to compile and link code on Mahuika in the section
    entitled [Compiling software on
    Mahuika](../../Scientific_Computing/HPC_Software_Environment/Compiling_software_on_Mahuika.md).
4. An extension to Mahuika with additional, upgraded resources is also
    available. See [Milan Compute
    Nodes](../../Scientific_Computing/Running_Jobs_on_Maui_and_Mahuika/Milan_Compute_Nodes.md)
    for details on access.
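
Item 2 above names the shared filesystems; the short sketch below is one way to confirm they are mounted and to browse the software stack from a Mahuika login node. The commands are generic (`df`, `module avail`); treat the exact module environment as an assumption rather than NeSI-specific guidance.

``` sh
# Confirm the shared filesystems from note 2 are mounted
df -h /home /nesi/project /nesi/nobackup

# Browse the installed software environment (an Lmod/Environment Modules
# setup is assumed here; the listing is long, so page through it)
module avail 2>&1 | less
```
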
@@ -149,8 +139,6 @@ Rocky 8.5 on Milan</span></p></td>
</tbody>
</table>



## Storage (IBM ESS)

| | |
@@ -162,7 +150,3 @@ Rocky 8.5 on Milan</span></p></td>
Scratch and persistent storage are accessible from Mahuika, as well as
from Māui and the ancillary nodes. Offline storage will in due course be
accessible indirectly, via a dedicated service.




@@ -16,14 +16,6 @@ zendesk_section_id: 360000034335
---



[//]: <> (REMOVE ME IF PAGE VALIDATED)
[//]: <> (vvvvvvvvvvvvvvvvvvvv)
!!! warning
This page has been automatically migrated and may contain formatting errors.
[//]: <> (^^^^^^^^^^^^^^^^^^^^)
[//]: <> (REMOVE ME IF PAGE VALIDATED)

Māui is a Cray XC50 supercomputer featuring Skylake Xeon nodes, Aries
interconnect and IBM ESS Spectrum Scale Storage. NeSI has access to 316
compute nodes on Māui.
@@ -35,13 +27,14 @@ analysis. To support workflows that are primarily single core jobs, for
example pre- and post-processing work, and to provide virtual lab
services, we offer a small number of [Māui ancillary
nodes](https://support.nesi.org.nz/hc/articles/360000203776).
!!! prerequisite Tips

!!! tips
The computing capacity of the Māui ancillary nodes is limited. If you
think you will need large amounts of computing power for small jobs in
addition to large jobs that can run on Māui, please [contact
us](https://support.nesi.org.nz/hc/requests/new) about getting an
allocation on
[Mahuika](../../Scientific_Computing/The_NeSI_High_Performance_Computers/Mahuika.md),
[Mahuika](Mahuika.md),
our high-throughput computing cluster.

The login or build nodes maui01 and maui02 provide access to the full
@@ -14,41 +14,33 @@ zendesk_section_id: 360000034335
---



[//]: <> (REMOVE ME IF PAGE VALIDATED)
[//]: <> (vvvvvvvvvvvvvvvvvvvv)
!!! warning
This page has been automatically migrated and may contain formatting errors.
[//]: <> (^^^^^^^^^^^^^^^^^^^^)
[//]: <> (REMOVE ME IF PAGE VALIDATED)

The Māui Ancillary Nodes provide access to a Virtualised environment
that supports:

1. Pre- and post-processing of data for jobs running on the
[Māui](https://support.nesi.org.nz/hc/articles/360000163695)
1. Pre- and post-processing of data for jobs running on the
[Māui](Maui.md)
Supercomputer or
[Mahuika](https://support.nesi.org.nz/hc/articles/360000163575) HPC
[Mahuika](Mahuika.md) HPC
Cluster. Typically, as serial processes on a Slurm partition running
on a set of Ancillary node VMs or baremetal servers.
2. Virtual laboratories that provide interactive access to data stored
    on the Māui (and Mahuika) storage together with domain analysis
    toolsets (e.g. Seismic, Genomics, Climate, etc.). To access the
    Virtual Laboratory nodes, users first log on to the NeSI Lander
    node, then ssh to the relevant Virtual Laboratory (see the sketch
    after this list). Users may submit jobs to Slurm partitions from
    Virtual Laboratory nodes.
3. Remote visualisation of data resident on the filesystems.
4. GPGPU computing.
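
As referenced in item 2, access to a Virtual Laboratory is a two-hop login. A hedged sketch is below; the lander hostname and the Virtual Laboratory name are placeholders, since the exact hostnames are not given in this diff.

``` sh
# First hop: the NeSI lander node (hostname shown is illustrative)
ssh <username>@lander.nesi.org.nz

# Second hop, from the lander node: the relevant Virtual Laboratory
ssh <virtual-lab-hostname>
```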

Scientific Workflows may access resources across the Māui Supercomputer
and any (multi-cluster) Slurm partitions on the Māui or Mahuika systems.
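
Because Slurm runs in multi-cluster mode here, jobs can be directed at a specific cluster with the standard `--clusters` (`-M`) option. The cluster name and script name below are placeholders, not actual NeSI values.

``` sh
# Submit a job script to a named cluster from an ancillary node
sbatch --clusters=<cluster-name> my_job.sl

# Show your queued and running jobs across every visible cluster
squeue --clusters=all -u $USER
```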

## Notes:
## Notes

1. The `/home, /nesi/project`, and `/nesi/nobackup`
    [filesystems](https://support.nesi.org.nz/hc/articles/360000177256)
    are mounted on the Māui Ancillary Nodes.
2. The Māui Ancillary nodes have Skylake processors, while the Mahuika
    nodes use Broadwell processors.

## Ancillary Node Specifications
@@ -66,14 +58,12 @@ and any (multi-cluster) Slurm partitions on the Māui or Mahuika systems.
| **Workload Manager** | Slurm (Multi-Cluster) |
| **OpenStack** | The Cray CS500 Ancillary nodes will normally be presented to users as Virtual Machines, provisioned from the physical hardware as required. |



The Māui_Ancil nodes have a different working environment than the Māui
(login) nodes. Therefore a CS500 login node is provided to create and
submit your jobs on this architecture. To use it, log in from the Māui
login nodes to:

``` sl
``` sh
w-mauivlab01.maui.nesi.org.nz
```

@@ -82,7 +72,7 @@ could add the following section to `~/.ssh/config` (extending the
[recommended terminal
setup](https://support.nesi.org.nz/hc/en-gb/articles/360000625535-Recommended-Terminal-Setup))

``` sl
``` sh
Host w-mauivlab01
User <username>
Hostname w-mauivlab01.maui.nesi.org.nz
@@ -91,4 +81,4 @@ Host w-mauivlab01
ForwardX11Trusted yes
ServerAliveInterval 300
ServerAliveCountMax 2
```
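
With that stanza in `~/.ssh/config`, the alias can be used directly from a machine that already reaches the Māui login environment. A brief sketch follows (the collapsed middle of the Host block above may add a ProxyCommand hop, so treat this as indicative only):

``` sh
# Connect using the Host alias defined above
ssh w-mauivlab01

# Copy a job script to your home directory on the CS500 login node
scp my_job.sl w-mauivlab01:
```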