Linux Driver mmap Explained: Zero-Copy User–Kernel IO


In Linux device drivers, data is typically exchanged using copy_to_user() and copy_from_user(). While safe and simple, these APIs become a major performance bottleneck when transferring large buffers such as video frames, DMA data, or high-speed sensor streams.

The mmap mechanism solves this problem by allowing the same physical memory to be mapped into both kernel space and user space—eliminating redundant copies entirely.

🧠 Core Idea: Double Mapping and Zero Copy

User space and kernel space operate in separate virtual address spaces, but they can still reference the same physical pages.

With mmap:

  • The driver allocates a physical buffer
  • The kernel maps it into kernel virtual memory
  • The same physical pages are mapped into user virtual memory

Both sides now access identical memory, enabling zero-copy IO.
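
The double-mapping idea can be tried without a driver. The sketch below is a user-space analogy only: it maps the same page of a temporary file at two different virtual addresses and checks that a write through one view is visible through the other. The function name `double_mapping_demo` and the scratch path under `/tmp` are assumptions made for illustration, not part of any driver API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * User-space analogy only: map the same page of a temporary file twice and
 * check that a write through one mapping is visible through the other.
 * Returns 0 if both views see the same data, -1 on any error.
 */
int double_mapping_demo(void)
{
    char path[] = "/tmp/double_map_XXXXXX";   /* hypothetical scratch file */
    long page = sysconf(_SC_PAGESIZE);
    int fd, ok;
    char *a, *b;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* name is gone; data lives while mapped */

    if (page <= 0 || ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }

    /* Two independent virtual mappings of the same underlying page */
    a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                    /* mappings remain valid after close */

    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED)
            munmap(a, (size_t)page);
        if (b != MAP_FAILED)
            munmap(b, (size_t)page);
        return -1;
    }

    strcpy(a, "frame-0");                        /* write through view A */
    ok = (a != b) && strcmp(b, "frame-0") == 0;  /* read through view B  */

    munmap(a, (size_t)page);
    munmap(b, (size_t)page);
    return ok ? 0 : -1;
}
```

Calling `double_mapping_demo()` from a test program should return 0 on a typical Linux system. A driver mapping works on the same principle, except that one of the two views is the kernel's own mapping of the buffer.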

🧪 User-Space mmap Usage

From the application’s perspective, using mmap is straightforward once the device file is opened.

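#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    char *buf;
    int fd = open("/dev/hello", O_RDWR);

    if (fd < 0) {
        perror("open failed");
        return 1;
    }

    /* Map 8 KB of device memory */
    buf = mmap(NULL, 8 * 1024, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap failed");
        close(fd);
        return 1;
    }

    /* buf now directly accesses driver memory */

    /* Release the mapping and the device when done */
    munmap(buf, 8 * 1024);
    close(fd);
    return 0;
}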

Key Parameters Explained

  • addr (NULL) Lets the kernel choose the virtual address.

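  • length Size of the mapping in bytes. It does not have to be a multiple of the page size, but the kernel maps whole pages, so the driver always sees a page-aligned range. The offset argument (0 here) must be a multiple of the page size.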

  • prot Access permissions (PROT_READ, PROT_WRITE).

  • flags MAP_SHARED is required so changes are visible to the driver.

Once mapped, reads and writes to buf are ordinary memory accesses: no system call and no copy between user and kernel buffers is involved.
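
Because mappings are made in whole pages, it can help to compute how many bytes a request actually covers. The helper below is a small sketch of that arithmetic. `round_up_to_page` is a hypothetical name, and the page size is queried at run time with `sysconf(_SC_PAGESIZE)` rather than assumed to be 4 KB.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round a byte count up to the next multiple of the system page size. */
size_t round_up_to_page(size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page;

    assert(ps > 0);               /* sysconf should not fail for this key */
    page = (size_t)ps;
    return (len + page - 1) / page * page;
}
```

For example, on a system with 4 KB pages a 5000-byte request covers 8192 bytes of address space, which is the range the driver's mmap handler would see.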

🧩 Driver-Side mmap Implementation

On the driver side, mmap is implemented via the .mmap callback in file_operations. The critical kernel helper is remap_pfn_range().

remap_pfn_range() Prototype

int remap_pfn_range(struct vm_area_struct *vma,
                    unsigned long addr,
                    unsigned long pfn,
                    unsigned long size,
                    pgprot_t prot);

| Parameter | Meaning |
| --- | --- |
| vma | User virtual memory area |
| addr | User virtual start address |
| pfn | Page Frame Number (physical address >> PAGE_SHIFT) |
| size | Mapping length |
| prot | Page protection and cache attributes |
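
The pfn argument is plain shift arithmetic, shown here as a user-space sketch. It assumes 4 KiB pages (a shift of 12). In the kernel the real value comes from PAGE_SHIFT, and the names `EXAMPLE_PAGE_SHIFT`, `phys_to_pfn`, `pfn_to_phys`, and `page_offset` are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12u                       /* assumption: 4 KiB pages */
#define EXAMPLE_PAGE_SIZE  (1ull << EXAMPLE_PAGE_SHIFT)

/* Physical address -> page frame number (drops the in-page offset). */
uint64_t phys_to_pfn(uint64_t phys)
{
    return phys >> EXAMPLE_PAGE_SHIFT;
}

/* Page frame number -> physical address of the start of that page. */
uint64_t pfn_to_phys(uint64_t pfn)
{
    return pfn << EXAMPLE_PAGE_SHIFT;
}

/* Offset of a physical address within its page. */
uint64_t page_offset(uint64_t phys)
{
    return phys & (EXAMPLE_PAGE_SIZE - 1);
}
```

With these definitions, the physical address 0x12345678 lies in page frame 0x12345 at offset 0x678. A buffer handed to remap_pfn_range() should have an in-page offset of 0, since the function maps whole pages.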

🧷 Driver Example

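#include <linux/fs.h>
#include <linux/io.h>
#include <linux/mm.h>
#include <linux/module.h>

/*
 * Assumed to be defined elsewhere in the driver: kernel_buf points to a
 * page-aligned, physically contiguous buffer of KERNEL_BUF_SIZE bytes
 * (KERNEL_BUF_SIZE is an illustrative name, e.g. 8 KB).
 */
static int hello_drv_mmap(struct file *file,
                          struct vm_area_struct *vma)
{
    unsigned long size = vma->vm_end - vma->vm_start;
    unsigned long phys;

    /* Reject requests that would expose memory beyond the buffer */
    if (vma->vm_pgoff != 0 || size > KERNEL_BUF_SIZE)
        return -EINVAL;

    /* Convert kernel virtual address to physical */
    phys = virt_to_phys(kernel_buf);

    /*
     * Optional: adjust caching behavior. Mixing cache attributes for
     * ordinary RAM is architecture-dependent, so only do this if the
     * buffer needs it.
     */
    vma->vm_page_prot =
        pgprot_writecombine(vma->vm_page_prot);

    /* Map physical pages into user space */
    if (remap_pfn_range(vma,
                        vma->vm_start,
                        phys >> PAGE_SHIFT,
                        size,
                        vma->vm_page_prot)) {
        pr_err("mmap remap_pfn_range failed\n");
        return -EAGAIN;
    }

    return 0;
}

static const struct file_operations hello_fops = {
    .owner = THIS_MODULE,
    .mmap  = hello_drv_mmap,
};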

⚙️ Key Design Requirements

When implementing driver mmap, the following rules are critical:

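  1. Physically Contiguous Memory remap_pfn_range() maps one contiguous range of page frames, so allocate the buffer with kmalloc(), kzalloc(), or __get_free_pages(). Memory from vmalloc() is not physically contiguous and needs a different approach, such as vmalloc_user() with remap_vmalloc_range().

  2. PFN Conversion The physical address must be shifted right by PAGE_SHIFT to obtain the page frame number.

  3. Page Alignment The buffer must start on a page boundary and cover a whole number of pages, because mappings are made in page units.

  4. Cache Attributes The user mapping's cache attributes should match how the kernel and the device access the buffer. Some DMA or device buffers need write-combining or non-cached mappings; buffers from dma_alloc_coherent() are normally mapped with dma_mmap_coherent() rather than remap_pfn_range().

  5. Security Only map memory intended for user access, never arbitrary kernel memory. Check the requested length and offset against the buffer size before mapping.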
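
The alignment and security rules come down to a bounds check on the requested range. The sketch below models that check in plain user-space C so it can be tested in isolation. It is not kernel code: `mmap_request_ok` is a hypothetical helper, its parameters only mirror the vm_start, vm_end, and vm_pgoff fields of struct vm_area_struct, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_PAGE_SHIFT 12u                      /* assumption: 4 KiB pages */
#define EXAMPLE_PAGE_SIZE  (1ul << EXAMPLE_PAGE_SHIFT)

/*
 * Model of the validation a driver's mmap handler could apply:
 * the request must be page-aligned and must fit inside the buffer,
 * counting from the page offset the caller asked for.
 */
bool mmap_request_ok(unsigned long vm_start, unsigned long vm_end,
                     unsigned long vm_pgoff, size_t buf_size)
{
    unsigned long len, req_pages, buf_pages;

    if (vm_end <= vm_start)
        return false;                       /* empty or inverted range */

    len = vm_end - vm_start;
    if ((vm_start | len) & (EXAMPLE_PAGE_SIZE - 1))
        return false;                       /* not page-aligned */

    req_pages = len >> EXAMPLE_PAGE_SHIFT;
    buf_pages = buf_size >> EXAMPLE_PAGE_SHIFT;   /* whole pages only */

    if (vm_pgoff > buf_pages)
        return false;                       /* offset starts past the buffer */

    /* Written as a subtraction to avoid overflow in vm_pgoff + req_pages */
    return req_pages <= buf_pages - vm_pgoff;
}
```

For an 8 KB buffer, a two-page request at offset 0 passes, while a three-page request, or a two-page request starting at page offset 1, is rejected. A real handler would typically return -EINVAL in the rejected cases before calling remap_pfn_range().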

🧾 mmap vs copy_to_user()

| Method | Copies Data | Performance | Use Case |
| --- | --- | --- | --- |
| copy_to_user() | Yes | Lower | Small or infrequent transfers |
| mmap() | No | High | Video, DMA, shared buffers |

🧠 Summary

The mmap operation is one of the most powerful techniques in Linux driver development. By mapping physical memory directly into user space, drivers can achieve:

  • Zero-copy data transfer
  • Lower CPU utilization
  • Dramatically improved throughput

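For high-bandwidth or low-latency IO paths, mmap is often less an optimization than a requirement, because per-transfer copies would consume the CPU time and memory bandwidth the data path needs.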
