TL;DR
Linux users with Nvidia GPUs can now use their GPU’s VRAM as swap space through a new user-space daemon built on CUDA. The daemon provides a high-priority, high-speed extension of system memory, which is especially useful on hybrid graphics laptops, and it requires no kernel modifications.
The daemon allocates VRAM via the CUDA driver API and exposes it as a block device using the Network Block Device (NBD) protocol over a Unix socket. The kernel’s built-in NBD driver then presents this device to the swap subsystem, so the system can use GPU VRAM as additional memory. The approach requires no kernel module changes and no access to Nvidia kernel symbols, which should make it compatible with most Linux distributions and driver versions. On a hybrid graphics laptop (AMD/ATI plus Nvidia RTX 3070) running Pop!_OS with kernel 6.17, the setup allocated up to 7 GB of the GPU’s 8 GB of VRAM as swap, on top of the machine’s 16 GB of RAM.
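As a rough illustration of the serving step, the sketch below handles NBD transmission-phase requests in Python against a plain bytearray. It is not the tool's code: `FakeVram` and `handle_request` are hypothetical names, the NBD handshake and the Unix socket loop are omitted, and a real daemon would presumably replace the bytearray slices with CUDA driver copies (for example `cuMemcpyDtoH` and `cuMemcpyHtoD` on memory obtained from `cuMemAlloc`).

```python
import struct

# NBD transmission-phase constants, as defined by the NBD protocol.
NBD_REQUEST_MAGIC = 0x25609513
NBD_REPLY_MAGIC = 0x67446698
NBD_CMD_READ, NBD_CMD_WRITE, NBD_CMD_DISC = 0, 1, 2
EINVAL = 22

REQUEST = struct.Struct(">IHHQQI")  # magic, flags, type, handle, offset, length
REPLY = struct.Struct(">IIQ")       # magic, error, handle


class FakeVram:
    """Hypothetical stand-in for a CUDA allocation (assumption, not the tool's API).

    A real daemon would copy between host and device memory here.
    """

    def __init__(self, size):
        self.buf = bytearray(size)

    def read(self, offset, length):
        return bytes(self.buf[offset:offset + length])

    def write(self, offset, data):
        self.buf[offset:offset + len(data)] = data


def handle_request(store, header, payload=b""):
    """Process one NBD request; return the reply bytes, or None on disconnect."""
    magic, _flags, cmd, handle, offset, length = REQUEST.unpack(header)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("bad NBD request magic")
    if cmd == NBD_CMD_DISC:
        return None
    # Simplification: every out-of-range or unsupported request gets EINVAL.
    if offset + length > len(store.buf):
        return REPLY.pack(NBD_REPLY_MAGIC, EINVAL, handle)
    if cmd == NBD_CMD_READ:
        return REPLY.pack(NBD_REPLY_MAGIC, 0, handle) + store.read(offset, length)
    if cmd == NBD_CMD_WRITE:
        store.write(offset, payload[:length])
        return REPLY.pack(NBD_REPLY_MAGIC, 0, handle)
    return REPLY.pack(NBD_REPLY_MAGIC, EINVAL, handle)
```

Every swapped page makes this round trip through the socket and a memory copy, which is the overhead the benchmarks attribute the lower sequential throughput to.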
Serving VRAM over NBD circumvents the limitations of direct peer-to-peer (P2P) APIs, which often return errors or are gated at the hardware resource management level, especially on consumer GPUs. Benchmarks indicate that VRAM-based swap offers lower latency than traditional SSD swap, but lower throughput for large sequential transfers because of the overhead of socket communication and CUDA memory copying. The approach is designed to survive driver and kernel updates, since there is no kernel module to recompile.
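To see how a device can win on latency yet lose on large transfers, a simple model helps: transfer time is a fixed per-request latency plus size divided by bandwidth. The sketch below uses made-up numbers chosen only to illustrate the shape of the trade-off. They are not measurements from the tool or from any benchmark, and the function names are hypothetical.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_mb_s):
    """Estimated time in microseconds: fixed latency plus size / bandwidth.

    With bandwidth in MB/s (10**6 bytes per second), bytes / (MB/s) is microseconds.
    """
    return latency_us + size_bytes / bandwidth_mb_s


def crossover_bytes(lat_a_us, bw_a_mb_s, lat_b_us, bw_b_mb_s):
    """Transfer size at which device A and device B take the same time."""
    return (lat_b_us - lat_a_us) / (1 / bw_a_mb_s - 1 / bw_b_mb_s)


# Illustrative assumptions only: A has low latency but less bandwidth,
# B has higher latency but more bandwidth.
LOW_LATENCY = (20.0, 1000.0)      # (latency in us, bandwidth in MB/s)
HIGH_BANDWIDTH = (80.0, 3000.0)

page = 4096            # one 4 KiB page
burst = 1024 * 1024    # a 1 MiB sequential transfer

small_a = transfer_time_us(page, *LOW_LATENCY)
small_b = transfer_time_us(page, *HIGH_BANDWIDTH)
large_a = transfer_time_us(burst, *LOW_LATENCY)
large_b = transfer_time_us(burst, *HIGH_BANDWIDTH)
```

Under these assumed figures the low-latency device is faster for a single page and slower for the 1 MiB burst, which matches the qualitative pattern the benchmarks describe.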
Why It Matters
The tool offers a new way to extend system memory on laptops and desktops with Nvidia GPUs, especially hybrid systems where the integrated GPU handles display and the discrete GPU sits underutilized. It can improve performance for memory-intensive workloads and may reduce reliance on slower storage-based swap. Because it needs neither kernel modifications nor privileged driver access, it is accessible to a broad user base.
However, it is not suitable for all workloads, as benchmarks show that for sustained large transfers, traditional NVMe swap remains faster. The method is particularly advantageous for sporadic, high-priority memory needs where low latency is critical.

Background
Traditional Linux swap relies on disk or SSD devices, which can introduce latency and performance bottlenecks. Some projects have attempted to use Nvidia’s peer-to-peer (P2P) APIs to pin VRAM for direct access, but these efforts have largely failed on consumer GPUs due to driver restrictions and hardware limitations. The new approach uses the CUDA API to allocate VRAM and serves it as a block device over NBD, sidestepping these restrictions. It builds on a recent trend of using GPU memory for general-purpose computing and memory extension, especially on laptops with soldered RAM and limited upgrade options.
The software, available on GitHub, has been tested on a specific hardware configuration (RTX 3070 Laptop, 16 GB RAM, 8 GB VRAM) with driver 580.159.03 and Linux kernel 6.17. Prior attempts to directly map VRAM via P2P APIs faced errors such as EINVAL, prompting this alternative approach.
“This method lets Linux systems treat GPU VRAM as swap without kernel modifications, providing a significant boost for hybrid laptops.”
— c0dejedi, developer of the tool
“While VRAM swap offers lower latency for sporadic access, its throughput for large sequential data is limited by the overhead of socket communication and CUDA memory copying.”
— an independent Linux performance analyst
What Remains Unclear
It is not yet clear how well this approach scales across different GPU models, driver versions, or Linux distributions. Behavior with newer driver releases remains untested, and because the daemon depends on CUDA, it does not cover non-Nvidia GPUs. The long-term stability and security implications of exposing VRAM as swap are also unknown, since the method relies on user-space communication rather than kernel modifications.

What’s Next
Further testing across various hardware configurations and workloads is expected. Developers may refine the tool to improve throughput and compatibility. Future updates could integrate more advanced management features, such as dynamic VRAM allocation or better power management. Monitoring the community’s adoption and feedback will determine its viability as a mainstream solution for extending system memory.

Key Questions
Can I use this method with any Nvidia GPU?
It requires a CUDA-capable Nvidia GPU. It has been tested on an RTX 3070 laptop with driver 580.159.03; compatibility with other models and driver versions is not guaranteed.
Will this affect system stability or GPU performance?
As a user-space implementation, it is designed to be safe and to survive driver updates. However, using VRAM as swap may introduce performance overhead and is not suitable for all workloads.
Is this method suitable for production systems?
Currently, it is experimental and primarily intended for testing and development. Caution is advised before deploying on critical systems.
How does this compare to traditional SSD swap?
VRAM swap offers lower latency for sporadic access but slower throughput for large sequential transfers compared to NVMe SSDs. It is best suited for high-priority, low-latency memory needs.
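When VRAM swap and SSD swap are both active, Linux fills the swap area with the higher priority first, so the priority value decides which one absorbs sporadic paging. The sketch below parses the contents of `/proc/swaps` to show that ordering. The sample text and its device names are hypothetical, and `parse_swaps` is an illustrative helper rather than part of the tool.

```python
# Hypothetical /proc/swaps contents; sizes are in KiB, as the kernel reports them.
SAMPLE = """\
Filename                                Type            Size            Used            Priority
/dev/nvme0n1p3                          partition       4194300         0               -2
/dev/nbd0                               partition       7340028         0               100
"""


def parse_swaps(text):
    """Parse /proc/swaps text into dicts, highest priority (used first) at the front."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        name, kind, size_kib, used_kib, priority = line.split()
        entries.append({
            "name": name,
            "type": kind,
            "size_kib": int(size_kib),
            "used_kib": int(used_kib),
            "priority": int(priority),
        })
    return sorted(entries, key=lambda e: e["priority"], reverse=True)


swaps = parse_swaps(SAMPLE)
preferred = swaps[0]["name"]  # the swap area the kernel fills first
```

On a live system the same function can be fed the text read from `/proc/swaps` to confirm that the NBD-backed device outranks the disk-backed one.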
Source: Hacker News