Install NVIDIA public drivers
If the AWS Marketplace AMIs described in Use AMIs that include NVIDIA drivers don't fit your use case, you can install the public drivers and bring your own license. Installation options include the following:
- Option 1: Driver-only install
- Option 2: Install with the CUDA toolkit (recommended for Linux distributions)
P6-B200 instance type considerations
The P6-B200 platform differs from other instance types in that it exposes Mellanox ConnectX-7 (CX7) network interface cards (NICs) to the instance as PCIe devices. These CX7 NICs do not act as typical network interfaces. Instead, they function as NVSwitch bridges that provide a control path to initialize and configure the NVFabric, which is the NVLink topology of the GPU interconnect.
To fully initialize the system, the NVIDIA Fabric Manager must configure NVFabric and establish the NVSwitch topology. This enables InfiniBand kernel modules to communicate with the CX7 devices.
NVIDIA Fabric Manager is included in the CUDA toolkit. We recommend Option 2: Install with the CUDA toolkit for this instance type.
Option 1: Driver-only install
To install a specific driver, log on to your instance and download the 64-bit NVIDIA public driver for the instance type from http://www.nvidia.com/Download/Find.aspx. Then follow the Local Repository Installation instructions in the NVIDIA Driver Installation Guide.
Note
P6-B200 instance types require installation and configuration of additional packages that come bundled with the NVIDIA CUDA Toolkit. For more information, see instructions for your Linux distribution in Option 2: Install with the CUDA toolkit.
Instance | Product type | Product series | Product | Minimum driver version |
---|---|---|---|---|
G3 | Tesla | M-Class | M60 | -- |
G4dn | Tesla | T-Series | T4 | -- |
G5 | Tesla | A-Series | A10 | 470.00 or later |
G5g ¹ | Tesla | T-Series | NVIDIA T4G | 470.82.01 or later |
G6 | Tesla | L-Series | L4 | 525.0 or later |
G6e | Tesla | L-Series | L40S | 535.0 or later |
Gr6 | Tesla | L-Series | L4 | 525.0 or later |
P2 | Tesla | K-Series | K80 | -- |
P3 | Tesla | V-Series | V100 | -- |
P4d | Tesla | A-Series | A100 | -- |
P4de | Tesla | A-Series | A100 | -- |
P5 | Tesla | H-Series | H100 | 530 or later |
P5e | Tesla | H-Series | H200 | 550 or later |
P5en | Tesla | H-Series | H200 | 550 or later |
P6-B200 ² | Tesla | HGX-Series | B200 | 570 or later |
P6e-GB200 | Tesla | HGX-Series | B200 | 570 or later |
¹ The operating system for G5g instances is Linux aarch64.
² For P6-B200 instance types, there are additional installation requirements to configure NVIDIA Fabric Manager.
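The Minimum driver version column above can be checked against an installed driver with a version-aware comparison. The following is a minimal sketch, not part of the official procedure: `version_ge` and `check_driver` are hypothetical helper names, the version `570.133.20` is only an example value, and the comparison assumes GNU `sort -V` is available.

```shell
#!/bin/sh
# Return success (0) when version $1 is greater than or equal to version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether an installed driver version
# satisfies a minimum taken from the table.
check_driver() {
    installed="$1"
    minimum="$2"
    if version_ge "$installed" "$minimum"; then
        echo "ok"
    else
        echo "too old"
    fi
}

# Example value only. On an instance with the driver installed, the version
# can be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
check_driver "570.133.20" "570"
```

Rows that show `--` in the table list no minimum, so no comparison is needed for those instance types.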
Option 2: Install with the CUDA toolkit
Install instructions vary slightly by operating system. To install public drivers on your instance with the NVIDIA CUDA toolkit, follow the instructions for your instance operating system. For instance operating systems that aren't shown here, follow the instructions for your operating system and instance type architecture on the NVIDIA Developer website. For more information, see CUDA Toolkit Downloads.
For instance type architecture or other specifications, see the Accelerated computing specifications in the Amazon EC2 Instance Types reference.
This section covers an NVIDIA CUDA toolkit install on an Amazon Linux 2023 instance. The command examples in this section are based on an x86_64 architecture. For arm64-sbsa commands, see CUDA Toolkit Downloads.
Prerequisite
Before installing the toolkit and drivers, run the following command to ensure that you have the correct version of the kernel headers and development packages.
[ec2-user ~]$ sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r) -y
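After the prerequisite packages are installed, it can be useful to confirm that a build directory exists for the running kernel before the driver modules are compiled. The sketch below is an assumption-laden check rather than an official step: `have_kernel_build_dir` is a hypothetical helper, and it assumes the distribution exposes the kernel build tree at `/lib/modules/<release>/build`, which is the usual layout but is not stated in this guide.

```shell
#!/bin/sh
# Return success when the build directory for a kernel release exists.
# $1: kernel release (defaults to the running kernel)
# $2: modules root (defaults to /lib/modules; overridable for testing)
# Assumption: the kernel development package provides <root>/<release>/build.
have_kernel_build_dir() {
    release="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    # -d follows symlinks, so a dangling "build" link counts as missing.
    [ -d "$root/$release/build" ]
}

if have_kernel_build_dir; then
    echo "kernel build directory found for $(uname -r)"
else
    echo "kernel build directory missing for $(uname -r)"
fi
```

If the directory is reported missing, the installed development packages probably don't match the running kernel, and rerunning the prerequisite command after a reboot into the current kernel is a reasonable next step.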
Download the toolkit and drivers
Choose the type of installation to use for your instance, and follow the associated steps.
Remaining steps are the same for both local and network installation.
- Complete the CUDA toolkit install
  [ec2-user ~]$ sudo dnf clean all
  [ec2-user ~]$ sudo dnf install cuda-toolkit -y
- Install the open kernel module variant of the driver
  [ec2-user ~]$ sudo dnf module install nvidia-driver:open-dkms -y
- Install GPUDirect Storage and Fabric Manager
  [ec2-user ~]$ sudo dnf install nvidia-gds -y
  [ec2-user ~]$ sudo dnf install nvidia-fabric-manager -y
- Enable Fabric Manager and driver persistence
  [ec2-user ~]$ sudo systemctl enable nvidia-fabricmanager
  [ec2-user ~]$ sudo systemctl enable nvidia-persistenced
- Additional configuration for P6-B200 instance types:
  P6-B200 instance types require installation and configuration of additional packages that come bundled with the NVIDIA CUDA Toolkit.
  - Install NVIDIA Link Subnet Manager and ibstat.
    [ec2-user ~]$ sudo dnf install nvlink5
  - Enable automatic loading of the InfiniBand module on startup.
    [ec2-user ~]$ echo "ib_umad" | sudo tee -a /etc/modules-load.d/modules.conf
- Reboot the instance
  [ec2-user ~]$ sudo reboot
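The P6-B200 step above appends `ib_umad` to the module list with `tee -a`, so running it a second time adds a duplicate line. If the procedure is scripted or repeated, an append that first checks for the line avoids that. The following is a minimal sketch: `append_once` is a hypothetical helper, and the demonstration writes to a scratch file rather than to `/etc/modules-load.d/modules.conf`.

```shell
#!/bin/sh
# Append a line to a file only if an identical line is not already present.
# $1: line to add, $2: target file
append_once() {
    line="$1"
    file="$2"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demonstration against a scratch file, not the real configuration.
demo_conf="$(mktemp)"
append_once "ib_umad" "$demo_conf"
append_once "ib_umad" "$demo_conf"   # second call changes nothing
cat "$demo_conf"
rm -f "$demo_conf"
```

To apply the same idea to the real configuration file, the function would need to run with root privileges, because the redirection into `/etc/modules-load.d/` is performed by the calling shell.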
This section covers an NVIDIA CUDA toolkit install on an Ubuntu 24.04 instance. The command examples in this section are based on an x86_64 architecture. For arm64-sbsa commands, see CUDA Toolkit Downloads.
Prerequisite
Before installing the toolkit and drivers, run the following command to ensure that you have the correct version of the kernel headers and development packages.
$ sudo apt install linux-headers-$(uname -r)
Download the toolkit and drivers
Choose the type of installation to use for your instance, and follow the associated steps.
Remaining steps are the same for both local and network installation.
- Complete the CUDA toolkit install
  $ sudo apt update
  $ sudo apt install cuda-toolkit -y
- Install the open kernel module variant of the driver
  $ sudo apt install nvidia-open -y
- Install GPUDirect Storage and Fabric Manager
  $ sudo apt install nvidia-gds -y
  $ sudo apt install nvidia-fabricmanager -y
- Enable Fabric Manager and driver persistence
  $ sudo systemctl enable nvidia-fabricmanager
  $ sudo systemctl enable nvidia-persistenced
- Additional configuration for P6-B200 instance types:
  P6-B200 instance types require installation and configuration of additional packages that come bundled with the NVIDIA CUDA Toolkit.
  - Install the latest InfiniBand-specific device driver (mlx5_ib) and diagnostic utilities.
    $ sudo apt install linux-modules-extra-$(uname -r) -y
    $ sudo apt install infiniband-diags -y
  - Install NVIDIA Link Subnet Manager.
    $ sudo apt install nvlsm -y
- Reboot the instance
  $ sudo reboot
- Update your path and add the following environment variable.
  $ export PATH=${PATH}:/usr/local/cuda-13.0/bin
  $ export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-13.0/lib64
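The `export` commands in the last step affect only the current shell session, and repeating them adds duplicate entries. When `LD_LIBRARY_PATH` starts out empty, the result also begins with a colon, which the dynamic linker treats as an empty entry. The sketch below shows one way to guard against both, for example in a shell startup file. `path_append` and `cuda_root` are hypothetical names, and the `cuda-13.0` directory is taken from the step above, so adjust it to the installed toolkit version.

```shell
#!/bin/sh
# Print a colon-separated path list with $2 appended, unless already present.
# $1: current list (may be empty), $2: directory to add
path_append() {
    case ":$1:" in
        *":$2:"*)
            # Already present: return the list unchanged.
            printf '%s\n' "$1"
            ;;
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$1:$2"
            else
                # Empty list: avoid producing a leading colon.
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

# Toolkit location from the step above; adjust to the installed version.
cuda_root=/usr/local/cuda-13.0
PATH="$(path_append "$PATH" "$cuda_root/bin")"
LD_LIBRARY_PATH="$(path_append "${LD_LIBRARY_PATH:-}" "$cuda_root/lib64")"
export PATH LD_LIBRARY_PATH
```

Because the function leaves a list unchanged when the directory is already in it, sourcing the same file more than once does not grow either variable.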
To install the NVIDIA driver on Windows, follow these steps:
- Open the folder where you downloaded the driver and launch the installation file. Follow the instructions to install the driver and reboot your instance as required.
- Disable the display adapter named Microsoft Basic Display Adapter that is marked with a warning icon using Device Manager. Install these Windows features: Media Foundation and Quality Windows Audio Video Experience.
  Important
  Don't disable the display adapter named Microsoft Remote Display Adapter. If Microsoft Remote Display Adapter is disabled your connection might be interrupted and attempts to connect to the instance after it has rebooted might fail.
- Check Device Manager to verify that the GPU is working correctly.
- To achieve the best performance from your GPU, complete the optimization steps in Optimize GPU settings on Amazon EC2 instances.