How to Install and Configure AMD ROCm on Ubuntu 26.04 for AI and Deep Learning


A step‑by‑step guide to installing the AMD ROCm platform on Ubuntu 26.04. Learn how to configure the environment for PyTorch and TensorFlow to fully exploit the potential of Radeon GPUs in artificial intelligence computations.

A workstation set up for AI and deep learning workloads with AMD Radeon graphics cards. Configuring a modern workstation with AMD GPU acceleration is the key to efficient AI computing directly on a local Ubuntu system.

Introduction: Why is AMD ROCm becoming a key player in the AI world?

For many years the hardware and software market for deep learning and artificial intelligence was dominated by a single solution: NVIDIA’s CUDA platform. That dominance is gradually giving way to alternatives. Faced with rising accelerator prices and the need to build local, independent compute environments, AI engineers and researchers are increasingly turning to AMD’s ROCm platform (the name was originally short for Radeon Open Compute).

AMD ROCm is an open-source software platform designed for high-performance computing (HPC) and machine learning. It lets developers use the compute power of AMD Radeon and Instinct GPUs directly. Thanks to rapid development, ROCm is now supported by the most popular frameworks, such as PyTorch and TensorFlow, which makes it a viable and cost-effective alternative to CUDA-based environments. A stable long-term-support release such as Ubuntu 26.04 LTS is a solid foundation for this kind of workstation or server.

This comprehensive guide will walk you through the entire process: from hardware verification, through kernel driver and ROCm package installation on Ubuntu 26.04, to configuring Python virtual environments and performance testing in PyTorch and TensorFlow.

1. System and hardware requirements

Before starting the installation, ensure that your hardware and system configuration meet ROCm’s requirements. Although AMD continuously expands the list of supported chips, the platform imposes specific architectural constraints.

Supported graphics cards (GPU)

Official AMD ROCm support focuses on professional cards and selected consumer models based on RDNA and CDNA architectures:

AMD Instinct series: MI300, MI200, MI100 (dedicated data‑center accelerators).
Radeon Pro professional cards: W7000, W6800 series and newer.
Consumer Radeon RX cards: Official and stable support includes RDNA 3-based models (e.g., Radeon RX 7900 XTX, RX 7900 XT). RDNA 2-based models (e.g., RX 6900 XT) also work very well, although they may require an environment variable at runtime that overrides the reported architecture version.

Hardware and system requirements

Operating system: A clean installation of Ubuntu 26.04 LTS (64‑bit).
CPU: x86_64 processor with PCIe Gen 3 or Gen 4 support (Gen 4 is recommended for optimal GPU memory bandwidth).
RAM: Minimum 16 GB (32 GB or more is recommended, especially when working with large language models).
Power and cooling: Stable power supply sized for the GPU’s TDP and adequate case ventilation. AI workloads can drive the GPU at 100 % for many hours.

Important note regarding kernel version: Ubuntu 26.04 ships with a new Linux kernel. Before installing, make sure you use the latest ROCm release that targets this Ubuntu version; otherwise the DKMS driver module may fail to build against a kernel it does not yet support.
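
The requirements above can be checked before any package is installed. The following is a minimal sketch, not an official AMD tool: the function names are made up for illustration, and the 15 GiB default is an assumption that accounts for a 16 GB machine reporting slightly less than 16 GiB in /proc/meminfo.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


def total_ram_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo text, converted from kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 ** 2)
    raise ValueError("MemTotal not found")


def preflight(os_release_text, meminfo_text, machine, min_ram_gib=15):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    release = parse_os_release(os_release_text)
    if release.get("ID") != "ubuntu":
        problems.append("not Ubuntu: " + release.get("ID", "unknown"))
    if machine != "x86_64":
        problems.append("CPU architecture is " + machine)
    if total_ram_gib(meminfo_text) < min_ram_gib:
        problems.append("less than %d GiB RAM" % min_ram_gib)
    return problems
```

On a real system you would pass in the contents of /etc/os-release and /proc/meminfo together with platform.machine(). The check does not look at the GPU itself; that is covered once rocminfo is available.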

2. Preparing Ubuntu 26.04 for installation

The first step is comprehensive system preparation. We need to ensure the system is fully updated and any potential conflicts with older drivers are eliminated.

Step 2.1: Updating system packages

Open a terminal and run the following commands to refresh the package list and install the latest security patches:

sudo apt update && sudo apt upgrade -y

After the update finishes, a reboot is recommended, especially if the kernel was updated:

sudo reboot

Step 2.2: Installing required dependencies

For proper DKMS kernel module compilation and repository key retrieval we need basic development tools. Install them with:

sudo apt install -y wget gpg cmake build-essential dkms linux-headers-$(uname -r) python3-pip python3-venv
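
To confirm that the tools from this step are actually on the PATH, a short check such as the one below may help. It is only a sketch: the list of executable names is an assumption about what these packages provide (gcc and make from build-essential, pip3 from python3-pip).

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Executable names assumed to come from the packages installed above.
REQUIRED = ["wget", "gpg", "cmake", "gcc", "make", "dkms", "python3", "pip3"]

if __name__ == "__main__":
    absent = missing_tools(REQUIRED)
    print("All tools found" if not absent else "Missing: " + ", ".join(absent))
```

The which parameter exists only so the function can be exercised without touching the real PATH.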

If you plan to deploy this in production on many machines, consider automating the process. More about creating installation scripts can be found in the article on Linux server automation with Bash scripts.

3. Installing AMDGPU drivers and the ROCm platform

ROCm installation relies on the official AMD repository. The process is divided into adding cryptographic keys, configuring the package source, and installing the appropriate meta‑packages.

Step 3.1: Adding AMD repository GPG key

To make Ubuntu trust packages from AMD’s servers, we must import their official GPG key. Create a directory for the keys and fetch it with the commands below:

sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null

Step 3.2: Adding the official ROCm repository

Depending on the exact ROCm version you want to install (we recommend the latest stable release, e.g., ROCm 6.x), create the appropriate source list file. Run the following command in the terminal:

echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.0 noble main" | sudo tee /etc/apt/sources.list.d/rocm.list

Note: noble is the codename of Ubuntu 24.04, and 6.0 is only an example version. AMD publishes a separate repository path for each ROCm release and lists the Ubuntu codenames each one supports, so once a build for Ubuntu 26.04 is available, substitute that release’s version number and codename in the line above.
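
Because the version and codename are the two parts that change between releases, it can be convenient to generate the source line rather than edit it by hand. This sketch assumes the standard one-line apt format (URI, then suite, then component) with the Ubuntu codename as the suite; the function name is hypothetical.

```python
def rocm_source_line(version, codename, keyring="/etc/apt/keyrings/rocm.gpg"):
    """Build the one-line apt source entry for a given ROCm release."""
    if not version or not codename:
        raise ValueError("version and codename are required")
    return (
        "deb [arch=amd64 signed-by=%s] "
        "https://repo.radeon.com/rocm/apt/%s %s main"
        % (keyring, version, codename)
    )
```

The function only formats text. Whether a given version and codename combination exists on AMD’s server still has to be checked against the repository itself.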

Step 3.3: Configuring package priorities (Apt Pinning)

To prevent the system from pulling incompatible graphics drivers from Ubuntu’s default repositories instead of AMD’s packages, set a priority for the ROCm repository. Create the configuration file:

sudo tee /etc/apt/preferences.d/rocm-pin-600 <<EOF
Package: *
Pin: release o=repo.radeon.com
Pin-Priority: 600
EOF

Step 3.4: Updating the package database and installing

After adding the repository, refresh the package database:

sudo apt update

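Now we can proceed with installing the kernel drivers and the core ROCm stack. The safest method is to install the meta-package rocm, which pulls in the required libraries as dependencies. If apt cannot locate amdgpu-dkms, note that AMD’s installation instructions serve the kernel driver from a separate amdgpu repository on repo.radeon.com, which has to be added in the same way as the ROCm one.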

sudo apt install -y amdgpu-dkms rocm

This process may take anywhere from a few minutes to around a quarter of an hour, because the system compiles DKMS kernel modules for the currently running kernel. Check the output for compilation errors before continuing.
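
One way to confirm that the module was built for the running kernel is to inspect the output of dkms status. The sketch below assumes the two line layouts that common DKMS versions print ("amdgpu/VERSION, KERNEL, ARCH: installed" and the older "amdgpu, VERSION, KERNEL, ARCH: installed"); other layouts would need an adjusted parser.

```python
def amdgpu_built_for(status_text, kernel, module="amdgpu"):
    """Return True if `dkms status` text lists `module` as installed for `kernel`."""
    for line in status_text.splitlines():
        head, sep, state = line.rpartition(":")
        if not sep:
            continue  # not a "fields: state" line
        # Normalise "module/version" to "module, version" before splitting.
        fields = [f.strip() for f in head.replace("/", ",").split(",")]
        if fields and fields[0] == module and kernel in fields:
            if state.strip().startswith("installed"):
                return True
    return False
```

In practice the text would come from running dkms status and the kernel string from platform.release(), which matches uname -r.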

4. Configuring permissions and environment variables

Installing the packages alone is not enough for a regular user to run GPU compute tasks without root privileges. We need to configure user groups and system paths.

Step 4.1: Adding the user to the render and video groups

On Linux, direct access to 3D hardware acceleration and GPGPU compute is controlled by the system groups render and video. Add your current user to these groups:

sudo usermod -aG render $USER
sudo usermod -aG video $USER

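Important: Group changes apply only to new login sessions, so log out and back in (or reboot) for them to take full effect. As a stopgap for the current terminal only, newgrp starts a nested shell with the given group active: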

newgrp render
newgrp video
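
Whether the running session has really picked up the new groups can be checked from Python using only the standard library. This is a small sketch with hypothetical function names; it inspects the groups of the current process, which is what matters for GPU access.

```python
import grp
import os


def current_group_names():
    """Names of the groups the running process actually belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            pass  # numeric gid without a name entry
    return names


def missing_groups(required=("render", "video"), current=None):
    """Return the required groups the process does not belong to."""
    if current is None:
        current = current_group_names()
    return sorted(set(required) - set(current))
```

An empty result from missing_groups() means the session is ready. A non-empty result right after usermod usually just means the session predates the change.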

Step 4.2: Configuring environment variables (PATH)

ROCm executables (compilers, profilers, diagnostic tools) are installed by default in the directory /opt/rocm/bin. To access them conveniently from any terminal location, add this path to your shell configuration file (e.g., .bashrc or .zshrc):

echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/opencl/bin' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/opt/rocm/lib:/opt/rocm/lib64' >> ~/.bashrc
source ~/.bashrc
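
Note that echo with >> appends a new copy of the line every time the command is repeated. If you script this step, an idempotent append avoids the duplicates. The helper below is an illustrative sketch, not part of ROCm.

```python
from pathlib import Path


def append_once(path, line):
    """Append `line` to the file at `path` unless an identical line exists.

    Returns True if the file was modified.
    """
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False
    # Make sure the new entry starts on its own line.
    prefix = "" if not text or text.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(prefix + line + "\n")
    return True
```

Calling append_once(Path.home() / ".bashrc", "export PATH=$PATH:/opt/rocm/bin") twice leaves a single entry in the file.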

5. Verifying the installation

Before launching AI models, confirm that the operating system communicates correctly with the GPU through the ROCm drivers.

Step 5.1: Testing the rocminfo tool

Run the command rocminfo in a terminal. It should return detailed information about the system architecture and detected graphics processors:

rocminfo

If the installation succeeded, you will see sections describing your GPU (e.g., Agent 2, Name: gfx1100 for RDNA 3 cards). No errors here indicate that the hardware layer is functioning correctly.
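
The gfx target name is worth extracting, since later steps depend on it. A minimal sketch, assuming rocminfo prints each agent's target on a line of the form "Name: gfx1100" as described above:

```python
import re

# Matches agent lines such as "  Name:                    gfx1100" but not
# "Marketing Name:" lines or ISA names like "amdgcn-amd-amdhsa--gfx1100".
_GFX_LINE = re.compile(r"^\s*Name:\s+(gfx[0-9a-f]+)\s*$", re.MULTILINE)


def gfx_targets(rocminfo_text):
    """Return the distinct gfx target names in rocminfo output, in order."""
    seen = []
    for name in _GFX_LINE.findall(rocminfo_text):
        if name not in seen:
            seen.append(name)
    return seen
```

The input would be the captured standard output of rocminfo. An empty list means no GPU agent was reported.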

Step 5.2: Monitoring GPU state with rocm-smi

The tool rocm-smi (System Management Interface) monitors temperature, power consumption, core clock, and VRAM usage. Run it by typing:

rocm-smi

You should see a table similar to what NVIDIA users know from the nvidia-smi utility. It’s an excellent tool for tracking resource usage while training models.
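
For logging during long training runs, the machine-readable output is easier to work with than the table. The sketch below assumes the output of rocm-smi --showtemp --json is an object keyed by card, and that temperature fields contain the word "Temperature" in their key; the exact key names vary between ROCm releases, so treat them as an assumption.

```python
import json


def temperatures(smi_json_text):
    """Map each card to its temperature readings from rocm-smi JSON text."""
    readings = {}
    for card, fields in json.loads(smi_json_text).items():
        if not isinstance(fields, dict):
            continue
        temps = {}
        for key, value in fields.items():
            if "temperature" in key.lower():
                try:
                    temps[key] = float(value)
                except (TypeError, ValueError):
                    pass  # e.g. "N/A" for a sensor the card lacks
        if temps:
            readings[card] = temps
    return readings
```

Matching on a substring rather than a fixed key keeps the function usable if a release renames a sensor label.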

6. Configuring the AI compute environment: PyTorch and TensorFlow

With the system layer and drivers working flawlessly, we can move on to installing the machine‑learning frameworks. Best practice is to isolate these libraries in dedicated Python virtual environments.

Step 6.1: Preparing a virtual environment

Create a separate directory for AI projects and initialise a Python virtual environment inside it:

mkdir ~/ai_projects
cd ~/ai_projects
python3 -m venv rocm_env
source rocm_env/bin/activate

Remember to always activate this environment before installing libraries using the command source rocm_env/bin/activate.
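
A script can guard against being run outside the environment. The check below relies on documented behaviour of the standard venv module: inside a virtual environment sys.prefix differs from sys.base_prefix. The function name is illustrative.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix
```

Calling in_virtualenv() with no arguments inspects the running interpreter; the parameters exist only to make the function easy to test.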

Step 6.2: Installing and configuring PyTorch with ROCm support

The PyTorch project publishes pre-built wheels compiled against ROCm. Do not run a plain pip install torch: on Linux the default wheel from PyPI is built for CUDA. Point pip at PyTorch’s ROCm-specific package index instead:

pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

Note: Adjust the ROCm version in the URL (e.g., rocm6.0) to match the version you actually installed.
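
The matching URL can be derived from the installed ROCm version. In this sketch the version string is assumed to look like "6.0.2-115", which is the form typically found in /opt/rocm/.info/version; that file location is an assumption, not something this guide has verified. Only the major and minor numbers are used, following the pattern of the URL above.

```python
import re


def pytorch_rocm_index_url(rocm_version):
    """Turn a ROCm version such as '6.0.2-115' into the matching index URL."""
    match = re.match(r"\s*(\d+)\.(\d+)", rocm_version)
    if match is None:
        raise ValueError("unrecognised ROCm version: %r" % rocm_version)
    return "https://download.pytorch.org/whl/rocm%s.%s" % match.groups()
```

PyTorch does not publish wheels for every ROCm minor release, so a generated URL may not exist. In that case, pick the closest version listed on the PyTorch installation page.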

Verifying PyTorch on an AMD GPU

After installation, run the following one-liner to check whether PyTorch sees your GPU:

python3 -c "import torch; print('ROCm available:', torch.cuda.is_available()); print('Device name:', torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'No GPU')"

Note a technical nuance: PyTorch’s API keeps its CUDA-related naming (e.g., torch.cuda.is_available()), but in ROCm builds these calls are backed by HIP, AMD’s CUDA-like runtime API. If the command prints True and the correct GPU name, your PyTorch environment is ready.
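
Detection alone does not prove that kernels actually run, so a slightly fuller smoke test can be useful. The sketch below performs one small matrix multiplication on the GPU and reports a status string; torch.version.hip is None in builds that were not compiled for ROCm. The function name and the status wording are illustrative.

```python
def gpu_smoke_test(size=1024):
    """Run a small matrix multiplication on the GPU and report the outcome."""
    try:
        import torch
    except ImportError:
        return "skipped: torch is not installed"
    if torch.version.hip is None:
        return "failed: this torch build was not compiled for ROCm"
    if not torch.cuda.is_available():
        return "failed: ROCm build found, but no GPU is visible"
    a = torch.rand(size, size, device="cuda")
    b = torch.rand(size, size, device="cuda")
    product = a @ b
    torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
    return "ok: %s computed a %dx%d product" % (
        torch.cuda.get_device_name(0), product.shape[0], product.shape[1])


if __name__ == "__main__":
    print(gpu_smoke_test())
```

The device string stays "cuda" even on AMD hardware, for the naming reason described above.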

Step 6.3: Installing and configuring TensorFlow with ROCm support

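TensorFlow also offers ROCm support, though the process is more version-dependent: each release of the ROCm-enabled build is compiled against a specific ROCm version, so check compatibility before installing. AMD maintains this dedicated build under the name tensorflow-rocm.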

Within the active virtual environment, install the appropriate package:

pip install tensorflow-rocm

To verify TensorFlow’s installation and hardware acceleration detection, run the following one‑liner:

python3 -c "import tensorflow as tf; print('Available devices:', tf.config.list_physical_devices('GPU'))"

If a GPU object appears in the device list, the configuration succeeded.

7. Advanced tips and performance optimization

Working with local AI models, especially large ones such as LLMs (e.g., Llama 3, Mistral) or image generators (Stable Diffusion), requires careful management of system resources.

Environment variable HSA_OVERRIDE_GFX_VERSION

If you own a consumer GPU (e.g., Radeon RX 6700 XT or RX 7800 XT) that is not on the official enterprise‑supported list, ROCm may report a device initialization error. The workaround is to force the runtime library to treat your card as a fully supported model with a similar architecture.

For RDNA 3 cards (e.g., RX 7800/7700) add to your environment:

export HSA_OVERRIDE_GFX_VERSION=11.0.0

For RDNA 2 cards (e.g., RX 6800/6700) use:

export HSA_OVERRIDE_GFX_VERSION=10.3.0

To make the setting permanent, append the line to ~/.bashrc so that you do not have to type it each time you open a terminal.
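
The same choice can be made inside a Python program, as long as the variable is set before the framework is imported. The sketch below encodes only the two values given above; the mapping from gfx target prefixes (gfx103x for RDNA 2, gfx110x for RDNA 3) is an assumption drawn from those values, and unknown targets are deliberately left alone.

```python
import os

# RDNA 2 targets (gfx103x) borrow gfx1030; RDNA 3 targets (gfx110x) borrow
# gfx1100. Anything else is left untouched rather than guessed.
_OVERRIDES = {"gfx103": ("gfx1030", "10.3.0"), "gfx110": ("gfx1100", "11.0.0")}


def suggest_override(gfx_name):
    """Return the HSA_OVERRIDE_GFX_VERSION value for a target, or None."""
    family = _OVERRIDES.get(gfx_name[:6])
    if family is None:
        return None
    native, version = family
    # The reference target itself needs no override.
    return None if gfx_name == native else version


def apply_override(gfx_name, environ=os.environ):
    """Set the variable if needed; call this before importing torch."""
    version = suggest_override(gfx_name)
    if version is not None:
        environ.setdefault("HSA_OVERRIDE_GFX_VERSION", version)
    return version
```

Because setdefault is used, a value already exported in the shell takes precedence over the suggestion.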

Managing system resources

Intensive neural‑network training can push the system to critical RAM and CPU usage, sometimes causing the graphical environment to freeze. To prevent this, familiarize yourself with techniques for limiting resources per process. Useful tips are available in our article on how to limit CPU and RAM usage by processes on Linux.
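
As one example of such a technique, a training script can restrict itself at startup. This sketch covers only the CPU side, using Linux-specific standard library calls; memory limits are not handled here, and the function name is hypothetical.

```python
import os


def restrict_process(max_cores, niceness=10):
    """Pin this process to at most `max_cores` CPUs and lower its priority."""
    allowed = sorted(os.sched_getaffinity(0))
    chosen = set(allowed[:max(1, max_cores)])
    os.sched_setaffinity(0, chosen)  # 0 means the calling process
    os.nice(niceness)  # a non-negative increment never needs root
    return chosen
```

Leaving one or two cores free for the desktop is usually enough to keep the graphical session responsive while data loading saturates the rest.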

8. Troubleshooting common issues

Despite software advances, installing advanced graphics drivers on Linux can still be challenging. Below are the most frequent problems you may encounter, together with ready‑made solutions:

Error: “Permission denied” when trying to access the GPU

Symptoms: rocminfo returns a permission‑denied error or PyTorch reports CUDA is unavailable despite correct driver installation.

Solution: Ensure your user is correctly added to groups video and render. Run groups in a terminal to verify group membership. If the groups are missing, repeat step 4.1 and reboot the machine.
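
It also helps to test the device nodes directly. ROCm uses /dev/kfd for compute and the render nodes under /dev/dri. The sketch below lists the nodes that the current user cannot open for reading and writing; the function name is illustrative.

```python
import glob
import os


def inaccessible_devices(paths=None, can_access=os.access):
    """Return the GPU device nodes the current user cannot read and write."""
    if paths is None:
        paths = ["/dev/kfd"] + sorted(glob.glob("/dev/dri/renderD*"))
    return [p for p in paths if not can_access(p, os.R_OK | os.W_OK)]
```

A node that is missing altogether is reported as inaccessible too, which points at the driver rather than at group membership.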

Error: DKMS compilation failed during amdgpu-dkms installation

Symptoms: The installer reports a kernel module compilation error and aborts.

Solution: The most common cause is missing kernel headers matching the running kernel or a kernel version that is too new for the current DKMS package. Install the package linux-headers-$(uname -r). If the issue persists, consider booting with an older, stable kernel (selectable from the GRUB menu).
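
Missing headers can be detected before retrying the installation. On Ubuntu, /lib/modules/KERNEL/build is a link to the header tree, and it dangles when the headers package is absent. A minimal sketch, with a hypothetical function name:

```python
import os
import platform


def headers_installed(kernel=None, modules_root="/lib/modules"):
    """True if the build directory for `kernel` exists under modules_root."""
    kernel = platform.release() if kernel is None else kernel
    # isdir follows symlinks, so a dangling "build" link counts as missing.
    return os.path.isdir(os.path.join(modules_root, kernel, "build"))
```

Called without arguments, the function checks the running kernel, the same one that $(uname -r) expands to.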

Error: GPU not detected in PyTorch despite correct rocminfo

Symptoms: System tools work, but torch.cuda.is_available() returns False in Python.

Solution: Most likely you installed the standard PyTorch wheel from PyPI instead of the ROCm build. Uninstall the packages with pip uninstall torch torchvision torchaudio, then reinstall following step 6.2, using the ROCm index URL given there.
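
The installed build can often be identified from its version string alone, because wheels from download.pytorch.org carry a local version label such as +rocm6.0, +cu121 or +cpu, while the default PyPI wheel has none. The sketch below classifies that label; the category names are illustrative.

```python
def build_flavor(version):
    """Classify a torch version string by its local version label."""
    _, _, label = version.partition("+")
    if label.startswith("rocm"):
        return "rocm"
    if label.startswith("cu"):
        return "cuda"
    if label == "cpu":
        return "cpu"
    return "unlabelled"
```

The version string can be read without importing torch, via importlib.metadata.version("torch"). Anything other than "rocm" suggests the reinstall described above.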

Conclusion: The future of AI in AMD colors

Installing the AMD ROCm platform on Ubuntu 26.04 opens entirely new possibilities for AI enthusiasts and professionals. This configuration grants access to the massive compute power of modern Radeon cards while bypassing the licensing constraints and high costs of competing ecosystems. Although ROCm setup demands a bit more attention than CUDA, the stability and economic benefits fully compensate for the effort.

With a ready-to-go PyTorch and TensorFlow environment you can take on advanced projects, from running local language models to training your own deep neural networks.

Frequently Asked Questions (FAQ)

Can I use AMD ROCm on laptops with Radeon cards?

Officially, AMD ROCm is designed and supported for desktop and server GPUs. While it is technically possible to run ROCm on some mobile APUs and dedicated mobile GPUs (using the HSA_OVERRIDE_GFX_VERSION variable described above), stability is limited and performance may not be sufficient for serious development work.

Does ROCm support older graphics cards, e.g., from the Polaris series (RX 580)?

Support for older architectures such as Polaris (RX 400/500) or Vega has been removed from newer ROCm releases (5.x and 6.x). To run compute on such legacy hardware you would need very old ROCm versions and older Ubuntu releases, which is not recommended for security or compatibility reasons.

Can I have AMD and NVIDIA CUDA drivers installed simultaneously on one computer?

Linux can technically host both proprietary NVIDIA drivers and open‑source AMD drivers. However, configuring an AI environment where both platforms coexist is extremely complex and prone to library conflicts. For AI workloads it is strongly advised to dedicate a workstation to a single hardware architecture.

How do I update ROCm to a newer version in the future?

To upgrade ROCm, first remove the old packages, change the repository URL in /etc/apt/sources.list.d/rocm.list to the new version, then run sudo apt update and reinstall the meta-package rocm. After the upgrade you will also need to reinstall PyTorch and TensorFlow in your Python virtual environments so that they match the new ROCm version.
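
The URL change itself is a one-segment substitution, which can be scripted. This sketch rewrites the version segment of a repo.radeon.com/rocm/apt URL and leaves the rest of the line untouched; the function name is hypothetical, and writing the result back to the file (which needs root) is left out.

```python
import re


def retarget_rocm_source(line, new_version):
    """Replace the version segment of a repo.radeon.com/rocm/apt URL."""
    pattern = r"(https://repo\.radeon\.com/rocm/apt/)[^/\s]+"
    updated, count = re.subn(pattern, lambda m: m.group(1) + new_version, line)
    if count == 0:
        raise ValueError("no ROCm repository URL found")
    return updated
```

Raising an error when nothing matches keeps a typo in the file from passing silently as a successful upgrade.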