I/O devices doesn't use virtual address, neither physical address, they use a third kind of address: a "bus address".

Data transfer can be triggered in two ways:

  • Either the software asks for data (via a function such as read())
  • or the hardware asynchronously pushes data to the system.

一个比较具体的有 DMA (third-party DMA) 流程是这样的:

  • Userspace 有了 read() 系统调用;
  • Kernel driver setup DMAC channel 相关的寄存器(Address, Length, DMA Direction^)。
  • DMAC 和磁盘进行协作,将磁盘数据写入内存区域,CPU 全程不参与此过程(此时 CPU 是不是可以调度到其他线程做其他事情?毕竟这个 block 住了,答案是可以的);
  • DMAC 向 CPU 发出数据传输完成的信号,阻塞的进程返回,由 CPU 负责将数据从内核缓冲区拷贝到用户缓冲区
  • 用户进程由内核态切换回用户态,解除阻塞状态。

A simple PCI DMA example

The actual form of DMA operations on the PCI bus is very dependent on the device being driven. Thus, this example does not apply to any real device; instead, it is part of a hypothetical driver called dad (DMA Acquisition Device). A driver for this device might define a transfer function like this,下面是 driver 里的 code:

int dad_transfer(struct dad_dev *dev, bool write, void *buffer,  size_t count)
{
    dma_addr_t bus_addr;
    unsigned long flags;

    // Map the buffer for DMA
    bus_addr = pci_map_single(dev->pci_dev, buffer, count, dev->dma_dir);
    //...

    // Set up the device
    // 设置 device 的 command 以及 device 上一些 register 的值比如 addr 和 len
    writeb(dev->registers.command, DAD_CMD_DISABLEDMA);
    writeb(dev->registers.command, write ? DAD_CMD_WR : DAD_CMD_RD);
    writel(dev->registers.addr, cpu_to_le32(bus_addr));
    writel(dev->registers.len, cpu_to_le32(count));

    // 通过写 device 的 register 来告诉 device 开始 DMA
    writeb(dev->registers.command, DAD_CMD_ENABLEDMA);
    return 0;
}

This function maps the buffer to be transferred and starts the device operation. The other half of the job must be done in the interrupt service routine, which would look something like this。这是在 interrupt handler 里的 code:

void dad_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
    struct dad_dev *dev = (struct dad_dev *) dev_id;

    /* Make sure it's really our device interrupting */

    /* Unmap the DMA buffer */
    pci_unmap_single(dev->pci_dev, dev->dma_addr, dev->dma_size, dev->dma_dir);

    /* Only now is it safe to access the buffer, copy to user, etc. */
    ...
}

Direct Memory Access and Bus Mastering - Linux Device Drivers, Second Edition [Book]

request DMA channel in driver:

request_dma

The channel argument is a number between 0 and 7 or, more precisely, a positive number less than MAX_DMA_CHANNELS. On the PC, MAX_DMA_CHANNELS is defined as 8, to match the hardware. The name argument is a string identifying the device. The specified name appears in the file /proc/dma, which can be read by user programs.

Your sound card and your analog I/O interface can share the DMA channel as long as they are not used at the same time.

Another example driver code using DMA to perform MEM-TO-MEM transfer

#include <linux/module.h>
#include <linux/init.h>
#include <linux/completion.h>
#include <linux/slab.h>
#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>

/* Meta Information */
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Johannes 4 GNU/Linux");
MODULE_DESCRIPTION("A simple DMA example for copying data from RAM to RAM");

void my_dma_transfer_completed(void *param) 
{
	struct completion *cmp = (struct completion *) param;
	complete(cmp);
}

/**
 * @brief This function is called, when the module is loaded into the kernel
 */
static int __init my_init(void) {
	dma_cap_mask_t mask;
	struct dma_chan *chan;
	struct dma_async_tx_descriptor *chan_desc;
	dma_cookie_t cookie;
	dma_addr_t src_addr, dst_addr;
	u8 *src_buf, *dst_buf;
	struct completion cmp;
	int status;

	printk("my_dma_memcpy - Init\n");

	dma_cap_zero(mask);
	dma_cap_set(DMA_SLAVE | DMA_PRIVATE, mask);
	chan = dma_request_channel(mask, NULL, NULL);
	if(!chan) {
		printk("my_dma_memcpy - Error requesting dma channel\n");
		return -ENODEV;
	}

	src_buf = dma_alloc_coherent(chan->device->dev, 1024, &src_addr, GFP_KERNEL);
	dst_buf = dma_alloc_coherent(chan->device->dev, 1024, &dst_addr, GFP_KERNEL);

	memset(src_buf, 0x12, 1024);
	memset(dst_buf, 0x0, 1024);

	printk("my_dma_memcpy - Before DMA Transfer: src_buf[0] = %x\n", src_buf[0]);
	printk("my_dma_memcpy - Before DMA Transfer: dst_buf[0] = %x\n", dst_buf[0]);

	chan_desc = dmaengine_prep_dma_memcpy(chan, dst_addr, src_addr, 1024, DMA_MEM_TO_MEM);
	if(!chan_desc) {
		printk("my_dma_memcpy - Error requesting dma channel\n");
		status = -1;
		goto free;
	}

	init_completion(&cmp);

	chan_desc->callback = my_dma_transfer_completed;
	chan_desc->callback_param = &cmp;

	cookie = dmaengine_submit(chan_desc);

	/* Fire the DMA transfer */
	dma_async_issue_pending(chan);

	if(wait_for_completion_timeout(&cmp, msecs_to_jiffies(3000)) <= 0) {
		printk("my_dma_memcpy - Timeout!\n");
		status = -1;
	}

	status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
	if(status == DMA_COMPLETE) {
		printk("my_dma_memcpy - DMA transfer has completed!\n");
		status = 0;
		printk("my_dma_memcpy - After DMA Transfer: src_buf[0] = %x\n", src_buf[0]);
		printk("my_dma_memcpy - After DMA Transfer: dst_buf[0] = %x\n", dst_buf[0]);
	} else {
		printk("my_dma_memcpy - Error on DMA transfer\n");
	}

	dmaengine_terminate_all(chan);
free:
	dma_free_coherent(chan->device->dev, 1024, src_buf, src_addr);
	dma_free_coherent(chan->device->dev, 1024, dst_buf, dst_addr);

	dma_release_channel(chan);
	return 0;
}

/**
 * @brief This function is called, when the module is removed from the kernel
 */
static void __exit my_exit(void) {
	printk("Goodbye, Kernel\n");
}

module_init(my_init);
module_exit(my_exit);

Linux_Driver_Tutorial/30_dma_memcpy at main · Johannes4Linux/Linux_Driver_Tutorial

Let's code a Linux Driver - 30 DMA (Direct Memory Access) Memcopy - YouTube

Asynchronously DMA

This happens, for example, with data acquisition devices that go on pushing data even if nobody is reading them. In this case, the driver should maintain a buffer so that a subsequent read call will return all the accumulated data to user space. The steps involved in this kind of transfer are slightly different:

  • The hardware raises an interrupt to announce that new data has arrived.
  • The interrupt handler allocates a buffer and tells the hardware where to transfer its data.
  • The peripheral device writes the data to the buffer and raises another interrupt when it’s done.
  • The handler dispatches the new data, wakes any relevant process, and takes care of housekeeping.

A variant of the asynchronous approach is often seen with network cards. These cards often expect to see a circular buffer (often called a DMA ring buffer) established in memory shared with the processor; each incoming packet is placed in the next available buffer in the ring, and an interrupt is signaled. The driver then passes the network packets to the rest of the kernel, and places a new DMA buffer in the ring.

Asynchronously DMA 是不是就是 first-party DMA 的 use case

Direct Memory Access and Bus Mastering - Linux Device Drivers, Second Edition [Book]

在等待 DMA 完成的过程中,CPU 可以运行吗?

答案是可以运行的,这里有关于这个问题的讨论:microprocessor - Does a CPU completely freeze when using a DMA? - Electrical Engineering Stack Exchange

  • Operations like this do not delay the processor, but can be rescheduled to handle other tasks. DMA(Direct Memory Access) Wiki - FPGAkey,reschedule 的原因是 read 一般来说是同步的,用户希望 read 完之后再执行下一行代码,所以当前进程没法运行了,需要调度其他进程来运行。

注意在数据传输过程中,CPU 是可以进行其他操作的:在DMA控制传输的同时cpu真的还可以运行其他程序吗? - 其他 - 恩智浦技术社区 我觉得是因为 DMA 是通过周期窃取的方式来获得总线的控制权的。所以在 DMA 的过程中 CPU 也会有机会 master the bus 然后做自己的事。

这里也提到了,会把 process 置为 sleep:Direct Memory Access and Bus Mastering - Linux Device Drivers, Second Edition [Book]

DMA 是 CPU 发起的,还是 Device 发起的?

有两种说法,一种说 DMA 一般都是 Device 发起的:DMA(直接内存访问)实现数据传输的流程详解-百度开发者中心

但是维基百科说是 CPU 先发起的:With DMA, the CPU first initiates the transfer, then it does other operations while the transfer is in progress。

到底是谁发起的呢?

The CPU initializes the DMA controller with:

  • A count of the number of words to transfer, and
  • The memory address to use.

The CPU then commands the peripheral device to initiate a data transfer. The DMA controller then provides addresses and read/write control lines to the system memory. Each time a byte of data is ready to be transferred between the peripheral device and memory, the DMAC increments its internal address register until the full block of data is transferred. 这样 DMAC 就知道可以通知 CPU 了。

所以从上面来看,其实是 CPU 先发起的,只不过是 CPU(driver)让 device 开始写了。什么叫 device 开始写呢?这个数据的传输动作是 device 主动触发的还是 DMA 主动触发的?

Driver/DMA 相关的书可以看这个:,ch15.13676

但是要注意的是,上面应该对应的是 third-party DMA 的情况,device 自己也可以主动发起 DMA 的,这叫做 first-party DMA^,也叫做 bus mastering,PCI DMA 就是这种情况的设计。

DMA 安全吗,如果 Malicious Device 往不该写的内存写怎么办?/ DMA Attack

This is a security concern known as a DMA attack.

IOMMU 可以 mitigate 这种 attack。

Bus Mastering

Bus mastering 是 DMA 的一种,bus mastering 其实是 first-party DMA^。

PCI 的 DMA 方式其实就是 bus mastering,也就是 first-party DMA:

  • Bus Mastering on PCI: The PCI bus architecture itself supports bus mastering capabilities. This means devices connected to the PCI bus can be granted permission (Bus Master Enable or BME) to directly access the system bus and memory for data transfers.
  • Device Initiated Transfers: In PCI DMA, the device itself initiates the DMA transfer process. It sends a request to the DMA controller, specifying details like source and destination addresses, transfer size, and utilizes the PCI bus for data movement.
  • DMA Controller Involvement: While the device takes initiative, the DMA controller is still involved. The controller manages arbitration on the bus (ensuring no conflicts), performs error checking, and handles some aspects of the transfer process.

Most modern bus architectures, such as PCI, allow multiple devices to bus master because it significantly improves performance for general-purpose operating systems.

While bus mastering theoretically allows one peripheral device to directly communicate with another, in practice almost all peripherals master the bus exclusively to perform DMA to main memory.

也就是说之前和 CPU 抢总线这件事是 DMAC 来做的,现在我们不需要 DMAC 了,Device 自己就可以抢了总线然后往内存里写数据了。为什么之前 DMAC 抢总线呢?他抢了总线怎么保证 Device 在抢到总线的这段时间正好在收发数据呢? 答案是 DMAC 其实和 Device 有通信的接口协议的。US6701405B1 - DMA handshake protocol - Google Patents 他会和 Device 进行通信来让 Device 发送或者接收数据。有了 bus mastering 之后,Device 可以自己把总线占了然后自己往内存里写数据,不需要 DMAC 代替其和 CPU 抢总线了。

我觉得这个和异步 DMA 是正交的概念。异步 DMA 不是通过 CPU 来发起的(比如 read()),是先数据到了才通知 ISR 然后设置 DMA region 来传输的。但是后面 DMA 传输的过程应该可以是 first-party 也可以是 third-party 形式的。

Some real-time operating systems prohibit peripherals from becoming bus masters, because the scheduler can no longer arbitrate for the bus and hence cannot provide deterministic latency.

Direct Memory Access and Bus Mastering - Linux Device Drivers, Second Edition [Book]

First-party DMA & Third-party DMA & PCI DMA

Third-party DMA 才是标准 DMA。使用 DMA Controller。

First-party DMA 也就是 bus mastering 也就是 PCI DMA 所采取的方式。

Standard DMA, also called third-party DMA, uses a DMA controller. (first-party 可能也有 controller,但是只负责仲裁)

Bus mastering is the capability of devices on the PCI bus (other than the system chipset, of course) to take control of the bus and perform transfers directly. PCI supports full device bus mastering, and provides bus arbitration facilities through the system chipset.

First-party DMA, the device drives its own DMA bus cycles using a channel from the system's DMA engine. The ddi_dmae_1stparty(9F) function is used to configure this channel in a cascade mode so that the DMA engine will not interfere with the transfer. Modern IDE/ATA hard disks use first-party DMA transfers. The term "first party" means that the peripheral device itself does the work of transferring data to and from memory, with no external DMAC involved. This is also called bus mastering, because when such transfers are occurring the device becomes the "master of the bus".

Direct Memory Access (DMA) Modes and Bus Mastering DMA

Third-party DMA utilizes a system DMA engine resident on the main system board, which has several DMA channels available for use by devices. The device relies on the system's DMA engine to perform the data transfers between the device and memory. The driver uses DMA engine routines (see ddi_dmae(9F)) to initialize and program the DMA engine. For each DMA data transfer, the driver programs the DMAC and then gives the device a command to initiate the transfer in cooperation with that DMAC.

更深的理解请看 Bus mastering^。

What is DMA Engine

好像并没有 DMA Engine 这个称呼。我觉得应该看语境:

  • 如果指的是 hardware,那么应该就是 Hardware DMA Controller;
  • 如果指的是 software,那么应该就是 Software DMA Engine in Kernel,kernel 里有一个头文件就是 drivers/dma/dmaengine.h

linux - What is the difference between DMA-Engine and DMA-Controller? - Stack Overflow

What is a DMA channel

DMA channel 是 DMA controller 里面的一部分。

DMA channels are virtual pathways within a DMA controller that manage Direct Memory Access (DMA) transfers between devices and system memory.

A single system typically has multiple devices capable of performing DMA transfers. DMA channels provide a way to manage these concurrent requests and prevent conflicts.

Each DMA channel on a DMAC typically has its own set of registers for configuration and control.

Bus address

For example, if a PCI device has a BAR, the kernel reads the bus address (A) from the BAR and converts it to a CPU physical address (B). The address B is stored in a struct resource and usually exposed via /proc/iomem. When a driver claims a device, it typically uses ioremap() to map physical address B at a virtual address (C). It can then use, e.g., ioread32(C), to access the device registers at bus address A.

As a matter of fact, the situation is slightly more complicated than that. DMA-based hardware uses bus, rather than physical, addresses. Although ISA and PCI addresses are simply physical addresses on the PC, this is not true for every platform.

Aka. DMA address.

Most of the 64bit platforms have special hardware that translates bus addresses (DMA addresses) …

Kernel source: Documentation/DMA-mapping.txt

DMA Mapping

DMA mapping is a conversion from virtual addressed memory to a memory which is DMA-able on physical addresses (actually bus addresses).

Where is the DMA hardware?

What's the difference between "driver" and "the device"

For an example, GPU has its own computing unit and it may also want to access

https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt

片内 DMA

DMA Controller

也是一种外设。

可以将其视为一种能够通过一组专用总线将内部和外部存储器与每个具有 DMA 能力的外设连接起来的控制器。

什么是仲裁器

什么是 DMA 通道

每个通道对应不同的外设的 DMA 请求。

虽然每个通道可以接收多个外设的请求,但是同一时间只能有一个有效,不能同时接收多个。

DMA Direction

/**
 * enum dma_transfer_direction - dma transfer mode and direction indicator
 * @DMA_MEM_TO_MEM: Async/Memcpy mode
 * @DMA_MEM_TO_DEV: Slave mode & From Memory to Device
 * @DMA_DEV_TO_MEM: Slave mode & From Device to Memory
 * @DMA_DEV_TO_DEV: Slave mode & From Device to Device
 */
enum dma_transfer_direction {
	DMA_MEM_TO_MEM,
	DMA_MEM_TO_DEV,
	DMA_DEV_TO_MEM,
	DMA_DEV_TO_DEV,
	DMA_TRANS_NONE,
};