Misc

QEMU release planning

Planning - QEMU

QEMU thread model / type

QEMU has several different types of threads:

  • vCPU threads that execute guest code and perform device emulation synchronously with respect to the vCPU.
  • The main loop that runs the event loops (yes, there is more than one!) used by many QEMU components.
  • IOThreads that run event loops for device emulation concurrently with vCPUs and "out-of-band" QMP monitor commands.

Note that the main loop thread and IOThreads are different things; see the article below for an explanation.

Stefan Hajnoczi: QEMU Internals: Event loops

QEMU's implementation of common data structures

QEMU has its own implementations of:

  • singly-linked list;
  • list;
  • simple queue;
  • tail queue.

They are defined in include/qemu/queue.h.
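
QEMU's queue macros derive from the BSD queue macros, so a minimal sketch of the usage pattern can be written against the system header sys/queue.h. The Job type and job_queue_demo() are hypothetical names used only for illustration. In QEMU the corresponding tail queue macros carry a Q prefix (QTAILQ_HEAD, QTAILQ_ENTRY, QTAILQ_INSERT_TAIL, QTAILQ_FOREACH, and so on).

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type: the link fields are embedded in the struct. */
typedef struct Job {
    int id;
    TAILQ_ENTRY(Job) next;
} Job;

/* Declares struct JobQueue, the head of a tail queue of Job elements. */
TAILQ_HEAD(JobQueue, Job);

/* Append three jobs, remove the middle one, and sum the remaining ids. */
static int job_queue_demo(void)
{
    struct JobQueue queue;
    Job jobs[3] = { { .id = 1 }, { .id = 2 }, { .id = 3 } };
    Job *job;
    int sum = 0;

    TAILQ_INIT(&queue);
    for (size_t i = 0; i < 3; i++) {
        TAILQ_INSERT_TAIL(&queue, &jobs[i], next);
    }
    TAILQ_REMOVE(&queue, &jobs[1], next);

    TAILQ_FOREACH(job, &queue, next) {
        sum += job->id;
    }
    return sum;
}
```

Because the link fields live inside the element, no separate node allocation is needed, which is the main reason this style of list is common in C codebases.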

Function name with or without prefix qemu_

Wrapped versions of standard library or GLib functions use a qemu_ prefix to alert readers that they are seeing a wrapped version, for example qemu_strtol or qemu_mutex_lock. Other utility functions that are widely called from across the codebase should not have any prefix, for example pstrcpy or bit manipulation functions such as find_first_bit.
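
To make the idea of a "wrapped version" concrete, here is a minimal sketch of a wrapper around strtol() that reports errors through its return value. The name toy_strtol and its signature are assumptions for illustration; this is not the actual qemu_strtol implementation or signature.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper in the style of a qemu_-prefixed helper: errors are
 * reported through the return value, so the caller does not have to inspect
 * errno and the end pointer by hand.
 * Returns 0 on success, -EINVAL for bad input, -ERANGE on overflow.
 */
static int toy_strtol(const char *nptr, int base, long *result)
{
    char *end;

    if (!nptr || !result) {
        return -EINVAL;
    }
    errno = 0;
    *result = strtol(nptr, &end, base);
    if (end == nptr || *end != '\0') {
        return -EINVAL;      /* no digits at all, or trailing garbage */
    }
    if (errno == ERANGE) {
        return -ERANGE;      /* value does not fit in a long */
    }
    return 0;
}
```

A caller can then write if (toy_strtol(str, 10, &value) < 0) and handle every failure mode in one place.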

Struct naming convention

QemuObject or QEMUObject or QObject.

hw/ Folder?

Contains all the supported hardware devices.

Relationship with qdev

qdev was rebuilt on top of QOM (2011), so presumably qdev came first, then QOM was introduced, and qdev was then rebuilt on top of QOM.

Their difference can be found in

Functions

cpu_x86_cpuid

Input:

  • env (the processed information derived from the CPU model. The user can specify a CPU model on the command line; the value is then expanded and filtered into the env->features structure)
  • index (CPUID's leaf, EAX)
  • count (CPUID's subleaf, ECX);

Output: CPUID word value exposed to the guest.
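
The input/output contract above can be sketched with a toy function. ToyCPUState and toy_cpuid() are hypothetical and heavily reduced; the real cpu_x86_cpuid handles many more leaves and reads from CPUX86State. The vendor constants are the ASCII encoding of "GenuineIntel".

```c
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for CPUX86State. */
typedef struct ToyCPUState {
    uint32_t max_leaf;        /* highest basic leaf exposed to the guest */
    uint32_t feat_7_0_ebx;    /* filtered feature word for leaf 7, subleaf 0 */
} ToyCPUState;

/* Map (leaf, subleaf) to the four register values the guest would see. */
static void toy_cpuid(const ToyCPUState *env, uint32_t index, uint32_t count,
                      uint32_t *eax, uint32_t *ebx,
                      uint32_t *ecx, uint32_t *edx)
{
    *eax = *ebx = *ecx = *edx = 0;

    switch (index) {
    case 0:
        *eax = env->max_leaf;
        *ebx = 0x756e6547;    /* "Genu" */
        *edx = 0x49656e69;    /* "ineI" */
        *ecx = 0x6c65746e;    /* "ntel" */
        break;
    case 7:
        if (count == 0) {
            *ebx = env->feat_7_0_ebx;
        }
        break;
    default:
        break;                /* unknown leaves read as zero in this toy */
    }
}
```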

Data Structures

include/qemu/typedefs.h contains forward typedefs for almost all the data structures in QEMU, so it is a good place to start when looking up a type name.

QEMUOption

typedef struct QEMUOption {
    const char *name; // valid option name without the dash, such as "machine"
    int flags; // flags, including whether this option takes a value: "-cpu" needs one (a CPU model), but "-enable-kvm" does not
    int index; // position in the qemu_options array; can also be seen in build/qemu-options.def
    uint32_t arch_mask;
} QEMUOption;

QDict


CPUX86State

In target/i386/cpu.h.

VMState

Most device data can be described using the VMSTATE macros (mostly defined in include/migration/vmstate.h).

X86CPUDefinition (aka. CPU Model)

target/i386/cpu.c

A global array of this struct, builtin_x86_defs[], defines the CPU models, such as SPR (Sapphire Rapids).

Globals

qemu_options

A global array of type QEMUOption[], populated with all the available options, such as:

  • h
  • help
  • machine
  • m
  • cpu
  • device
  • drive
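
A table like this is typically consumed by a linear lookup over the option names. The following is a minimal sketch under stated assumptions: ToyOption, toy_options and toy_lookup_opt() are hypothetical, and the has_arg field stands in for the flags field of QEMUOption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced option record, modelled on name/flags/index. */
typedef struct ToyOption {
    const char *name;   /* option name without the leading dash */
    int has_arg;        /* non-zero if the option takes a value */
    int index;          /* position in the table */
} ToyOption;

static const ToyOption toy_options[] = {
    { "h",       0, 0 },
    { "help",    0, 1 },
    { "machine", 1, 2 },
    { "m",       1, 3 },
    { "cpu",     1, 4 },
    { NULL,      0, 0 },   /* sentinel */
};

/* Return the matching entry for "-name" or "--name", or NULL if unknown. */
static const ToyOption *toy_lookup_opt(const char *arg)
{
    if (!arg || arg[0] != '-') {
        return NULL;
    }
    arg++;
    if (arg[0] == '-') {
        arg++;              /* treat --foo the same as -foo */
    }
    for (const ToyOption *opt = toy_options; opt->name; opt++) {
        if (strcmp(arg, opt->name) == 0) {
            return opt;
        }
    }
    return NULL;
}
```

The caller would then check has_arg to decide whether the next command-line word belongs to this option.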

Processes

How does QEMU create a KVM vcpu?

x86_cpu_realizefn
	qemu_init_vcpu
		create_vcpu_thread (kvm_start_vcpu_thread)
			kvm_vcpu_thread_fn
				kvm_init_vcpu
					kvm_arch_init_vcpu

Event Loop

Why use an event loop: the application can appear to do multiple things at once without multithreading, because it switches between handling different event sources.

The most important event sources in QEMU are:

  • File descriptors such as sockets and character devices.
  • Event notifiers (implemented as eventfds on Linux).
  • Timers for delayed function execution.
  • Bottom-halves (BHs) for invoking a function in another thread or deferring a function call to avoid reentrancy.
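
To show how file descriptors and bottom halves fit into one loop iteration, here is a minimal sketch built on poll(). Everything prefixed Toy or toy_ or demo_ is hypothetical; this is not how AioContext is implemented, only the general shape of "wait, dispatch fd handlers, then run deferred functions".

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define TOY_MAX 8

typedef void ToyHandler(void *opaque);

typedef struct ToyLoop {
    struct pollfd fds[TOY_MAX];   /* file descriptor sources */
    ToyHandler *fd_cb[TOY_MAX];
    void *fd_opaque[TOY_MAX];
    int nfds;
    ToyHandler *bh_cb[TOY_MAX];   /* pending bottom halves */
    void *bh_opaque[TOY_MAX];
    int nbhs;
} ToyLoop;

static int toy_add_fd(ToyLoop *l, int fd, ToyHandler *cb, void *opaque)
{
    if (l->nfds == TOY_MAX) {
        return -1;
    }
    l->fds[l->nfds] = (struct pollfd){ .fd = fd, .events = POLLIN };
    l->fd_cb[l->nfds] = cb;
    l->fd_opaque[l->nfds] = opaque;
    l->nfds++;
    return 0;
}

static int toy_schedule_bh(ToyLoop *l, ToyHandler *cb, void *opaque)
{
    if (l->nbhs == TOY_MAX) {
        return -1;
    }
    l->bh_cb[l->nbhs] = cb;
    l->bh_opaque[l->nbhs] = opaque;
    l->nbhs++;
    return 0;
}

/* One iteration: wait for fds, dispatch handlers, then run pending BHs. */
static void toy_loop_iterate(ToyLoop *l, int timeout_ms)
{
    ToyHandler *cbs[TOY_MAX];
    void *opaques[TOY_MAX];
    int n;

    /* Do not sleep if a bottom half is already waiting to run. */
    if (poll(l->fds, l->nfds, l->nbhs ? 0 : timeout_ms) > 0) {
        for (int i = 0; i < l->nfds; i++) {
            if (l->fds[i].revents & POLLIN) {
                l->fd_cb[i](l->fd_opaque[i]);
            }
        }
    }
    /* Snapshot the list so a BH scheduled by a BH runs next iteration. */
    n = l->nbhs;
    memcpy(cbs, l->bh_cb, sizeof(cbs));
    memcpy(opaques, l->bh_opaque, sizeof(opaques));
    l->nbhs = 0;
    for (int i = 0; i < n; i++) {
        cbs[i](opaques[i]);
    }
}

typedef struct ToyDemo { ToyLoop loop; int rd; int bytes; int bh_runs; } ToyDemo;

static void demo_bh(void *opaque) { ((ToyDemo *)opaque)->bh_runs++; }

static void demo_fd_ready(void *opaque)
{
    ToyDemo *d = opaque;
    char c;
    if (read(d->rd, &c, 1) == 1) {
        d->bytes++;
    }
    toy_schedule_bh(&d->loop, demo_bh, d);   /* defer the follow-up work */
}

/* Returns bytes * 10 + bh_runs after a single iteration, or -1 on error. */
static int toy_loop_demo(void)
{
    ToyDemo d;
    int fds[2];

    memset(&d, 0, sizeof(d));
    if (pipe(fds) != 0) {
        return -1;
    }
    d.rd = fds[0];
    toy_add_fd(&d.loop, fds[0], demo_fd_ready, &d);
    if (write(fds[1], "x", 1) == 1) {
        toy_loop_iterate(&d.loop, 1000);
    }
    close(fds[0]);
    close(fds[1]);
    return d.bytes * 10 + d.bh_runs;
}
```

In the demo, the fd handler and the bottom half it schedules both run within the same iteration, because bottom halves are dispatched after the fd handlers.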

QEMU uses two event loop implementations:

  • its native AioContext event loop and,
  • glib's GMainContext.

Thread type   glib's event loop num   AioContext event loop num
Main loop     1                       2
IOThreads     1                       1

How to combine AioContext and GMainContext?

QEMU components can use any of these event loop APIs and the main loop combines them all into a single event loop function os_host_main_loop_wait() that calls qemu_poll_ns() to wait for event sources. This makes it possible to combine glib-based code with code using the native QEMU AioContext APIs.

Can we use both an event loop and coroutines in a single QEMU thread?

If an event callback in the event loop contains a blocking operation like this one, for example:

/* 3-step process using coroutines */
void coroutine_fn say_hello(void)
{
    const char *name;
    // Wait for the user to enter a name...
    co_send("Hi, what's your name? ");
    name = co_read_line();
    co_send("Hello, %s\n", name);
}

then the whole event loop is blocked.

Coroutines make it possible to write sequential code that is actually executed across multiple iterations of the event loop. This is useful for code that needs to perform blocking I/O and would quickly become messy if split into a chain of callback functions.

If the coroutine needs to wait for an event such as I/O completion or user input, it calls qemu_coroutine_yield().

If an event's callback is a coroutine, how do we make sure the coroutine runs again after it yields?

ram_load_precopy
    if ((i & 32767) == 0 && qemu_in_coroutine()) {
        aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
        qemu_coroutine_yield();
    }
    i++;

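As the snippet shows, the code does not just yield through qemu_coroutine_yield(). Before yielding it also calls aio_co_schedule(), which schedules the coroutine as a BH so that it is entered again in the next event loop iteration. This way we do not have to re-enter it manually.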
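
The schedule-then-yield pattern can be sketched outside QEMU with POSIX ucontext. This is a toy under stated assumptions: toy_co_schedule_self() and toy_co_yield() are hypothetical stand-ins for aio_co_schedule() and qemu_coroutine_yield(), a single boolean stands in for the pending BH, and the platform is assumed to provide makecontext()/swapcontext() (as glibc does).

```c
#include <stdbool.h>
#include <ucontext.h>

static ucontext_t loop_ctx;          /* context of the toy event loop */
static ucontext_t co_ctx;            /* context of the toy coroutine */
static char co_stack[64 * 1024];
static bool co_scheduled;            /* stands in for a pending BH */
static bool co_done;
static int chunks_loaded;

/* Stand-in for aio_co_schedule(): ask the loop to re-enter us later. */
static void toy_co_schedule_self(void)
{
    co_scheduled = true;
}

/* Stand-in for qemu_coroutine_yield(): switch back to the loop. */
static void toy_co_yield(void)
{
    swapcontext(&co_ctx, &loop_ctx);
}

/* Long-running work that gives the loop a turn after every chunk. */
static void toy_load(void)
{
    for (int i = 0; i < 3; i++) {
        chunks_loaded++;
        toy_co_schedule_self();
        toy_co_yield();
    }
    co_done = true;
}

/* Returns the number of loop iterations needed to finish the coroutine. */
static int toy_co_demo(void)
{
    int iterations = 0;

    chunks_loaded = 0;
    co_done = false;

    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &loop_ctx;      /* resume the loop when toy_load() ends */
    makecontext(&co_ctx, toy_load, 0);

    co_scheduled = true;             /* initial entry */
    while (co_scheduled) {
        co_scheduled = false;
        iterations++;
        swapcontext(&loop_ctx, &co_ctx);   /* enter or resume the coroutine */
    }
    return co_done ? iterations : -1;
}
```

If toy_load() yielded without scheduling itself first, the loop would see no pending work and the coroutine would never be resumed, which is exactly the situation the aio_co_schedule() call avoids.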

Stefan Hajnoczi: Coroutines in QEMU: The basics

In what ways does AioContext supersede GMainContext?

  • AioContext event sources can have a polling function that detects events without syscalls. This allows the event loop to avoid blocking syscalls that might lead the kernel scheduler to deschedule the thread.
  • O(1) time complexity with respect to the number of monitored file descriptors, i.e., it uses an epoll-like mechanism instead of poll.
  • Nanosecond timers. glib's event loop only has millisecond timers, which is not sufficient for emulating hardware timers.
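
The timer granularity point can be illustrated with two small conversions. This is a sketch with hypothetical helper names: one produces a struct timespec, which is what a nanosecond-resolution call such as ppoll() accepts, and the other rounds up to whole milliseconds, which is all a poll()-style timeout can express.

```c
#include <stdint.h>
#include <time.h>

#define TOY_NS_PER_SEC 1000000000LL
#define TOY_NS_PER_MS  1000000LL

/* Full-resolution timeout; the caller must pass a non-negative value. */
static struct timespec toy_ns_to_timespec(int64_t ns)
{
    struct timespec ts;

    ts.tv_sec = ns / TOY_NS_PER_SEC;
    ts.tv_nsec = ns % TOY_NS_PER_SEC;
    return ts;
}

/* Millisecond timeout; rounds up so the loop never wakes before the deadline. */
static int64_t toy_ns_to_ms(int64_t ns)
{
    if (ns < 0) {
        return -1;           /* negative means "block indefinitely" */
    }
    return (ns + TOY_NS_PER_MS - 1) / TOY_NS_PER_MS;
}
```

With millisecond granularity, a timer due in 250 microseconds (250000 ns) can only be expressed as a 1 ms wait, four times longer than requested.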

Stefan Hajnoczi: QEMU Internals: Event loops

main_loop_wait() QEMU

QEMU's main event loop is main_loop_wait(). Each call executes only one iteration, which is why this function is usually called from inside a while loop.

main
    qemu_main
        qemu_default_main
            qemu_main_loop
                while (!main_loop_should_exit(&status))
                    main_loop_wait(false);
                        os_host_main_loop_wait

os_host_main_loop_wait() QEMU

static int os_host_main_loop_wait(int64_t timeout)
{
    GMainContext *context = g_main_context_default();
    int ret;

    // Acquire ownership of the default context so that events can be
    // collected from the event sources attached to it.
    g_main_context_acquire(context);
    glib_pollfds_fill(&timeout);
    //...
    ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);
    //...
    glib_pollfds_poll();
    // Release the context so that other threads can acquire it.
    g_main_context_release(context);
    return ret;
}