Neurox Workload cluster

The Neurox Workload cluster is where GPU workloads run on GPU nodes. When deployed standalone, it requires neither ingress nor persistent disk. Typically, the Neurox Workload components are installed together with the Neurox Control plane components in a single combined Kubernetes cluster.

This page outlines the requirements for deploying standalone Neurox Workload components into additional Kubernetes GPU clusters. Neurox Workload can autodetect many Cloud Service Provider (CSP) environments, automatically surfacing metadata such as region or availability zone and identifying the models of the attached GPUs.

Multi-Cluster setup

One of the best features of Neurox is monitoring multiple Neurox Workload clusters from a single Neurox Control plane. Common use cases include joining GPU clusters from various cloud providers or even on-prem clusters.

Please see our pricing plans to determine how many Neurox Workload clusters may be joined into a Neurox Control cluster.

Cluster requirements

  • Kubernetes and kubectl CLI 1.29+

  • Helm CLI 3.8+

  • 4 CPUs

  • 8 GB of RAM

  • At least 1 GPU node
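The version minimums above can be checked before installing anything. The following is a minimal preflight sketch, not part of the Neurox tooling: the helper names `version_ge` and `check_min` are hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions do). It only inspects the local `kubectl` and `helm` clients; the cluster's server version, CPU, RAM, and GPU node count still need to be confirmed separately.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare local CLI versions against the documented minimums.

# version_ge A B: succeeds when version A >= version B (uses `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME FOUND REQUIRED: prints a one-line verdict, returns 1 if too old.
check_min() {
  if version_ge "$2" "$3"; then
    echo "ok: $1 $2 >= $3"
  else
    echo "too old: $1 $2 < $3"
    return 1
  fi
}

# Probe only the tools that are installed; `|| true` keeps going so that
# every check is reported even when one of them fails.
if command -v kubectl >/dev/null 2>&1; then
  kubectl_version="$(kubectl version --client -o json \
    | sed -n 's/.*"gitVersion": *"v\([^"]*\)".*/\1/p')"
  check_min kubectl "$kubectl_version" 1.29 || true
fi

if command -v helm >/dev/null 2>&1; then
  helm_version="$(helm version --template '{{.Version}}' | sed 's/^v//')"
  check_min helm "$helm_version" 3.8 || true
fi
```

The helpers can also be called directly with known values, for example `check_min helm 3.14.0 3.8`.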

You will need both NVIDIA GPU Operator and Kube Prometheus Stack to run the Neurox workload chart.

NVIDIA GPU Operator

Required to run GPU workloads. Install with:
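The commands here are a minimal sketch of a Helm-based install that follows NVIDIA's getting-started procedure. The `gpu-operator` namespace and the auto-generated release name come from that guide, not from Neurox, so treat them as assumptions and adjust them to your cluster's conventions.

```shell
# Add NVIDIA's Helm repository and refresh the local chart index.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

# Install the operator into its own namespace (assumed name: gpu-operator)
# and wait until the release reports ready.
helm install --wait --generate-name \
  --namespace gpu-operator --create-namespace \
  nvidia/gpu-operator
```

Clusters that already ship NVIDIA drivers or the container toolkit on their nodes may need extra chart values; the NVIDIA guide covers those cases.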

For more information on how to configure NVIDIA GPU operator: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html#procedure

Kube Prometheus Stack

Required to gather metrics. Install with:
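The commands here are a minimal sketch of a Helm-based install from the prometheus-community chart repository. The release name `kube-prometheus-stack` and the `monitoring` namespace are assumptions chosen for illustration; Neurox may expect different names, so check the generated install script or your existing setup before reusing them.

```shell
# Add the prometheus-community Helm repository and refresh the local chart index.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install the stack into a dedicated namespace (assumed name: monitoring).
helm install kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```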

For more information on how to configure kube-prometheus-stack: https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack

Credentials

  • Your Neurox subdomain

  • Your Neurox Workload auth secret (provided by Neurox Control)

  • Your Neurox registry username and password

Install

To join a Neurox Workload cluster to an existing Neurox Control cluster, open your Neurox Control portal and go to Clusters > New Cluster. The portal generates a complete install script, including the auth secret, that you can copy and paste.

The example below was based on the output of the generated install script:
