This document details how to use the ECEN Olympus cluster to remotely access software used in academic Linux labs and for research.
What is the Cluster?
The Olympus cluster consists of a login node (olympus.ece.tamu.edu), eight non-GPU compute nodes, and five GPU compute nodes. Scheduling software (Slurm) ensures users receive the resources needed for their labs by distributing jobs across the compute nodes based on their course requirements. Only limited software is installed on the Olympus head node, so work should be run on the compute nodes.
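Access is over SSH. As a minimal sketch (assuming your NetID is your cluster username and that the default Slurm settings allow a plain interactive request), a first session might look like this:

    # Log in to the head node (replace NETID with your own NetID)
    ssh NETID@olympus.ece.tamu.edu

    # From the head node, request an interactive shell on a compute node;
    # the head node itself has only limited software installed
    srun --pty bash

If the scheduler rejects the request, you may need to name a partition and QOS explicitly, as described in the sections below.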
Nodes 1-5: PowerEdge R730xd - dual Xeon E5-2650 v3 - 20 cores (40 with HT) per node, 256GB RAM
100 cores total
Nodes 6-8: PowerEdge R6525 - dual AMD EPYC 7443 - 48 cores (96 with HT) per node, 256GB RAM
144 cores total
Nodes 9-11: PowerEdge C4140 - dual Xeon Gold 6130 - 32 cores (64 with HT) per node, 196GB RAM, 4 Tesla V100 GPUs per node
96 cores and 12 V100s total
Nodes 12-13: PowerEdge R750xa - dual Xeon Gold 6326 - 32 cores (64 with HT) per node, 256GB RAM, 4 NVIDIA A100 GPUs per node
64 cores and 8 A100s total
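The per-node figures above can be confirmed with standard Slurm query commands. This is a sketch; substitute a real node name from the first command's output into the second:

    # List every node with its CPU count, memory (in MB), and GPUs (GRES)
    sinfo -N -o "%N %c %m %G"

    # Show full details for one node
    scontrol show node <nodename>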
Cluster Configuration and Usage Limitations
To ensure resources are available to all students, the following limitations are enforced.
Nodes are grouped into partitions. The following partitions are configured.
CPU: nodes 1-8. Nodes 1-5 have academic priority (jobs will run on these nodes first).
CPU-RESEARCH: nodes 6-8. Research jobs will run on these nodes; requires PI approval.
GPU: nodes 9-13, for coursework and research; requires PI approval. An example of selecting a partition at submission time follows this list.
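Jobs are directed to a partition when they are submitted. The following is a sketch only; the lowercase partition names (cpu, gpu) and the job script name are assumptions, so check sinfo for the names actually configured on Olympus:

    # List the configured partitions and their states
    sinfo -s

    # Submit a batch script to the CPU partition (partition name assumed to be "cpu")
    sbatch --partition=cpu myjob.sh

    # Request an interactive shell on a GPU node with a single GPU
    srun --partition=gpu --gres=gpu:1 --pty bash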
Resource allocation is set using Quality of Service (QOS) levels in Slurm, summarized in the table below. An example batch script using a QOS follows the table.
QOS name | Hardware Limits | Default Time Limit | Hard Time Limit | Partition |
Ugrad (academic) | 4 CPU cores | 12 hours | 12 hours | CPU |
Grad (academic) | 6 CPU cores | 12 hours | 12 hours | CPU |
Research | 12 CPU cores | 48 hours | 48 hours | CPU-RESEARCH |
Ecen-ugrad-gpu | 8 CPU cores, 1 GPU | 36 hours | 36 hours | GPU |
Olympus-research-gpu | 32 CPU cores, 4 GPUs | 4 days | 4 days | GPU |
Olympus-research-gpu2 | 160 CPU cores, 20 GPUs | 7 days | 21 days | GPU |
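As a worked example tying the table together, here is a minimal batch script sketch for a GPU lab job under the Ecen-ugrad-gpu QOS. The partition and QOS spellings are assumptions based on the table above; Slurm treats these names literally, so verify them with sinfo and sacctmgr show qos before submitting.

    #!/bin/bash
    #SBATCH --job-name=lab-example
    #SBATCH --partition=gpu           # partition name assumed; confirm with sinfo
    #SBATCH --qos=ecen-ugrad-gpu      # QOS name assumed from the table; confirm with sacctmgr show qos
    #SBATCH --cpus-per-task=8         # at the 8-CPU limit for this QOS
    #SBATCH --gres=gpu:1              # one GPU, the limit for this QOS
    #SBATCH --time=12:00:00           # well under the 36-hour hard limit
    #SBATCH --output=%x-%j.out        # log file named from the job name and job ID

    # Commands below run on the allocated GPU compute node
    nvidia-smi                        # confirm which GPU was assigned
    python3 my_lab_script.py          # hypothetical workload; replace with your lab's commands

Submit the script with sbatch and monitor it with squeue -u $USER. Research jobs follow the same pattern with the research partition and QOS names and their correspondingly larger limits.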