Misc ideas

Conclusion. The abbreviation for “number” can be both No or Nr.

If you want to fast setup a QEMU virtual machine and test your code, you can install Ubuntu Server, remember to disable the LVM when install, because it will lead to space shortage.

Fundamentally, the General Public License is a license which says that you have these freedoms and that you cannot take these freedoms away from anyone else.

IPI stands for interprocessor interrupt.

MCU is just a shortcut for Microcontroller.

Kernel doesn't support unused functions, it will cause error when doing make.

Interesting posts

Why is there no CR1 – and why are control registers such a mess anyway? – pagetable.com

X86_FEATURE Field, feature bit

We can see a typical definition, like bus lock:

#define X86_FEATURE_BUS_LOCK_DETECT (16*32+24) /* Bus Lock detect */

Bus lock need EAX as 7, ECX as 0. will return in the ECX 24 bit, so what's the meaning of (16*32)?

16 represents the 16th word (1 word = 4 bytes), the length of a return register.

But wait, what does the "16" mean?

It means nothing, because we just need to make the bits identical, so just define them in the order they are added.

If we want to get the bit (24 in the above), just need to move it to the right $2^5$.

If we want to get the CPUID leaf (16 in the above), just need to divide it by 32. the result indentifies a specific CPUID leaf. This is also the way code been written in the KVM:

static __always_inline u32 __feature_leaf(int x86_feature)
{
	return __feature_translate(x86_feature) / 32;
}

kvm_cpu_cap_get && kvm_cpu_cap_has

The former will return a u32, the latter will return a bool.

kvm_cpu_cap_has = !!kvm_cpu_cap_get 

为什么需要内核缓冲区?

为什么 read 和 write 需要内核缓冲区,直接读到用户缓冲区不行吗?

应用内核缓冲区的主要思想就是一次读入大量的数据放在缓冲区,需要的时候从缓冲区取得数据:

  • 提高了磁盘的 I/O 效率;
  • 优化了磁盘的写操作;
  • 需要及时的将缓冲数据写到磁盘。

mmu->permissions

The introduced patch is [PATCH v2 5/9] KVM: MMU: Optimize pte permission checks

Optimize this away by precalculating all variants and storing them in a bitmap. The bitmap is recalculated when rarely-changing variables change (cr0, cr4) and is indexed by the often-changing variables (page fault error code, pte access permissions).

IMO, This attribute serves as a cache for many possible combinations of PFEC [4:1]. From the declaration in arch/x86/include/asm/kvm_host.h:

/*
 * Bitmap; bit set = permission fault
 * Byte index: page fault error code [4:1]
 * Bit index: pte permissions in ACC_* format
 */
u8 permissions[16];

it has 16 entries, and PFEC [4:1] also have 16 possible values (I/D flag, RSVD flag, U/S, W/R), which can serve as the index to this attribute.

The type of each item is u8, what is that?

it serves as a bitmap, you can see the following code:

unsigned pfec = byte << 1;

/* Faults from writes to non-writable pages */
u8 wf = (pfec & PFERR_WRITE_MASK) ? (u8)~w : 0;
/* Faults from user mode accesses to supervisor pages */
u8 uf = (pfec & PFERR_USER_MASK) ? (u8)~u : 0;
/* Faults from fetches of non-executable pages*/
u8 ff = (pfec & PFERR_FETCH_MASK) ? (u8)~x : 0;
/* Faults from kernel mode fetches of user pages */
u8 smepf = 0;
/* Faults from kernel mode accesses of user pages */
u8 smapf = 0;

// ......

mmu->permissions[byte] = ff | uf | wf | smepf | smapf;

Why the index and the value are the same? both uses pfec?

They are similar, but not identical.

Shada

shada 是 nvim 对于 viminfo 的一个重新设计,全称为 shared data。

The advantage over viminfo is that it supports shared data among multiple editor instances.

Viminfo

Vim 使用 viminfo 选项,来定义如何保存会话(session)信息,也就是保存 Vim 的操作记录和状态信息,以用于重启 Vim 后能恢复之前的操作状态。

Saving at $HOME/.viminfo.

The viminfo file is used to store:

  • The command line history.
  • The search string history.
  • The input-line history.
  • Contents of non-empty registers.
  • Marks for several files.
  • File marks, pointing to locations in files.
  • Last search/substitute pattern (for 'n' and '&').
  • The buffer list.
  • Global variables.

You could also use a Session file. The difference is that the viminfo file does not depend on what you are working on. There normally is only one viminfo file. Session files are used to save the state of a specific editing session. You could have several Session files, one for each project you areworking on. Viminfo and Session files together can be used to effectively enter Vim and directly start working in your desired setup.

KVM git workflow

'next': contains patches destined for the next merge window; for example, if v3.4 has just been released, then 'next' will be based on the v3.5-rc1 tag (or thereabouts), and will be accepting patches targeted at the 3.6 merge window (when 3.5 is released).

'queue': holding area for patches which have not been tested yet; may be rebased; usually merged into 'next'

Since 'next' is based on an -rc1 release, it may lack upstream non-kvm fixes (or features) needed for development. To work around that, an 'auto-next' branch will be provided that is a merge of current upstream and 'next' (this is more or less the equivalent of the old 'master' branch). That branch of course will be rebased often.

Kvm-Git-Workflow - KVM

What is a git patch?

A simple patch looks like:

diff --git a/foo.c b/foo.c
index 30cfd169..8de130c2 100644
--- a/foo.c
+++ b/foo.c
@@ -1,5 +1,5 @@
 #include <string.h>

 int check (char *string) {
-    return !strcmp(string, "ok");
+    return (string != NULL) && !strcmp(string, "ok");
 }

You can see the line index 30cfd169..8de130c2 100644 indicates the base commit is 30cfd169. So patch is not just diff, it has some meta information.

A patch can often be applied to a somewhat earlier or later version of the first file than the one from which it was derived, as long as the applying program can still locate the context of the change.

Bare repository

A non-bare or default git repository has a *.git* folder, which is the backbone of the repository where all the important files for tracking the changes in the folders are stored.

A bare repository is the same as default, but no commits can be made in a bare repository.

Bare repository is essentially a .git folder with a specific folder where all the project files reside.

bare 仓库类似版本管理服务器,只存储版本管理的相关文件,不存储工作文件,因此里面没有工作目录。

What does REPL mean in debugging?

Read, Eval, Print, Loop.

also termed an interactive toplevel or language shell, like IPython.

PDF Linearization

PDF Linearization is equals to Fast Web View.

Optimizing PDFs so they can be streamed into a client application in similar fashion to Youtube videos. This helps remote, online documents open almost instantly, without having to wait minutes or hours for a large document to completely download.

gitsigns.nvim

gitsigns shows the sign column by diffing working space and index.

From man git-rev-parse;

:[<n>:]<path>, e.g. :0:README, :README

A colon, optionally followed by a stage number (0 to 3) and a colon, followed by a path, names a blob object in the index at the given path. A missing stage number (and the colon that follows it) names a stage 0 entry. During a merge, stage 1 is the common ancestor, stage 2 is the target branch’s version (typically the current branch), and stage 3 is the version from the branch which is being merged.

When resolving conflicts, gitsigns diffs the buffer against the common ancestor (:1)

What is the difference between a desktop environment and a window manager?

The window manager manages your windows. It puts the window decoration around the contents including the buttons to minimize or close. It allows resizing and moving the windows around, decides which window is on top. Some famous window managers include:

  • bspwm
  • i3wm (tiling)

A desktop environment gives you an overall user experience. It has the panels, the system menus, the starters, the status applets. It needs a window manager, of course, to manage the windows. Some famous DEs include:

  • KDE
  • GNOME
  • XFCE

Can checkout this for more information: Windowing system - Wikipedia

Instruction lifetime

What is a commit in CPU?

Commit is typically the last stage an instruction passes through in a pipelined processor.

Execute out of order, commit in order.

Why commit in order?

In-order commit is necessary for precise exceptions that can roll-back to exactly the instruction that faulted, without any instructions after that having already retired.

A paper: Deconstructing Commit

What is manifest?

Part of PWAs.

A web application manifest, provides information about a web application in a JSON text file, necessary for the web app to be downloaded and be presented to the user similarly to a native app.

PWA manifests include its name, author, icon(s), version, description, and list of all the necessary resources (among other things).

Web app manifests | MDN

Favicon generator

Favicon Generator - Text to Favicon - favicon.io

Aur and pacman?

'pacman' is the package manager program. It's used to access the official repos.

The AUR (Arch User Repository) is a collection of user written scripts that will download and install other programs for you. Usually these programs aren't in the official repos, or they're the latest git branch or whatever. These scripts aren't vetted by the official devs, only the community. They can't be installed with pacman, and you either need to install them yourself (download all the files on the AUR page, then makepkg) or use a helper program (clyde, packer, etc) which will do it all for you.

Usually the helper programs integrate the pacman functions as well, so you can use a single program to do it all

Beamer appearance cheat sheet

Beamer-appearance-cheat-sheet.pdf

How does git know a file is renamed from another file?

Briefly, given a file in revision N, a file of the same name in revision N−1 is its default ancestor. However, when there is no like-named file in revision N−1, Git searches for a file that existed only in revision N−1 and is very similar to the new file.

How does Git know that file was renamed? - Stack Overflow

How does git know it is very similar? Here is the algorithm.

Git Source Code Review

EFER

Extended Feature Enable Register (EFER) is an MSR, which is called IA32_EFER in Intel.

PCH (Platform Controller Hub)

The PCH architecture supersedes Intel's previous Hub Architecture, with its design addressing the eventual problematic performance bottleneck between the processor and the motherboard (which used two chips - a northbridge and southbridge instead).

Some northbridge functions, the memory controller and PCI-e lanes, were integrated into the CPU while the PCH took over the remaining functions in addition to the traditional roles of the southbridge.

FLAGS register

contains the current state of a CPU.

It usually reflects the result of arithmetic operations as well as information about restrictions placed on the CPU operation at the current time.

  • carry, parity, adjust, zero and sign flags;

Note: PF in FLAGS is not Page Fault, it is Parity Flag.

EFLAGS is NOT a general purpose register.

FLAGS: 16bit, EFLAGS: 32bit, RFLAGS: 64bit, just like AX, EAX and RAX.

The wider registers retain compatibility with their smaller predecessors. (e.g., EFLAGS uses some new bits not in FLAGS register (bit 16-31), and RFLAGS uses some new bits not in EFLAGS register (bit 32-63)).

Do not mix up this (FLAGS) register with EFER (Extended Feature Enable Register) register dispite they have the similar name.

What is PCI enumeration?

During the Boot-up of the system, recognizing all the PCI devices in the system is done and is known as the PCI Bus enumeration.

Overview of PCI in Linux

BTB (Branch Target Buffers)

Use current PC to index a new PC (branch)

一图胜千言:

BTB Branch Target Buffer - Georgia Tech - HPCA: Part 1 - YouTube

Special characters in jekyll

" in code blocks will be converted to quot;

" in title will be converted to quot;

" in headings will be converted to quot;

x86-64 ABI

https://gitlab.com/x86-psABIs/x86-64-ABI/

APIC

What's the difference between local APIC(LAPIC) and I/O APIC?

  • The external I/O APIC is part of Intel’s system chip set. Its primary function is to receive external interrupt events from the system and its associated I/O devices and relay them to the local APIC as interrupt messages.
  • The LAPIC receives interrupts from the processor’s interrupt pins, from internal sources and from an external I/O APIC (or other external interrupt controller). It sends these to the processor core for handling. In multiple processor (MP) systems, it sends and receives interprocessor interrupt (IPI) messages to and from other logical processors on the system bus.

LVT (Local vector table):

Ioctl is a syscall

Note that an "ioctl" is just a special kind of system call, really; so there's not a lot of need to distinguish them (at least conceptually).

Valgrind

ioctl (an abbreviation of input/output control) is a system call for device-specific input/output operations and other operations which cannot be expressed by regular system calls.

ioctl - Wikipedia

How to implement MSR passthrough?

if guest execute rdmsr, will vmx can be configured to passthrough the msr and not trigger vmexit?

MSR bitmap.

va_start, va_arg, va_end #c

处理 C 语言可变参数( in function definition)的宏。

Why some projects doesn't need compile_commands.json to autojump?

Like QEMU.

slugify/santinize String in python to url/filename friendly

https://stackoverflow.com/a/295466/18644471

Disable outlook junk email

我发现 thunderbird 的 junk 和 outlook 的 junk 是不相关的。

thunderbird 的 junk 会在 outlook junk 完之后再分类,这就会导致有一些邮件会漏掉,如果你在 thunderbird 里设置了 junk folder,那么你会发现 outlook app 里会显示有两个 junk folder,所以应该 disable outlook junk email function。

Position Independent Executable (PIE)

aka Position-independent code (PIC).

PIE are loadable at arbitrary locations in the address space of a process, and can even be dynamically loaded and used as a module by another executable program instance. The virtual addresses in a PIE object files are therefore likely to be different from the runtime addresses. On the other hand, with ordinary executables, the virtual addresses in the object file generally correspond to the runtime addresses.

PIE is commonly used for shared libraries, so that the same library code can be loaded in a location in each program address space where it does not overlap with other memory in use (for example, other shared libraries).

Setups For Debugging QEMU with GDB and DDD - Newbie wang - 博客园

Git show relationship of two commits

git merge-base finds best common ancestor(s) between two commits to use in a three-way merge.

  • A is the ancestor of B: it will show A;
  • A is the child of B: it will show B;
  • A and B share a parent: will show the parent;
  • A and B don't share a parent: I think this is impossible.

Qemu contributing process/work model #qemu

qemu doesn't have queue branch like Linux.

Git apply patch from email

thunderbird 装 ImportExportTool 插件,选择所有要导出的 patches,右键,导出为 mbox (new)。

git am export.mbox

如果附件给了 .patch,那么可以:

git am ~/notify/*.patch

Sanity test

Sanity Testing is a subset of regression testing.

It is a part of Regression Testing, where the Regression Testing focuses on a wide variety of area of the application, there Sanity Testing focuses only on certain functionalities.

Difference between Sanity Testing and Regression Testing - GeeksforGeeks

All fish shell special input functions

how to know key name in fish shell for bind

fish_key_reader command.

you may find this helpful when you want to remap keys in fish shell:

bind - handle fish key bindings — fish-shell 3.5.0 documentation

Windows text keyboard shortcuts

Text keyboard shortcuts (Windows)

How to prevent edge to auto convert to markdown style when copying an URL?

In settings, search url 复制和粘贴格式,选择 plain text

How to install emoji on RIME/Weasel?

打开小狼毫输入法设定;

点击获取更多输入方案;

输入 emoji 并回车,如果报了

  • Curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 443
  • curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to

参考 2021-02-01 nvm 解决 0curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to raw.githubu... - 简书

安装完成后,在输入法设定中重新部署一下,如果不行就勾选上明月拼音部署一下。

(forced update) In git

A "forced update" means the remote-tracking branch was recent. This happens if you fetch (or pull) after someone does a force push to the repository.

git push origin <your_branch_name> --force

if you want to force pull, just:

git reset --hard origin/master

Chipset (motherboard chipset)

A PC's chipset controls the communication between the CPU, RAM(opens in new tab), storage and other peripherals.

For an example, you can specify the chipset in qemu by the -machine option, and you can use the q35 chipset.

q35:

What is a bank in computing?

A memory bank or bank is the smallest amount of memory that can be addressed by the processor at one time.

Patch writing disciplines #patch

  • 尽量避免 this patch,要用主动的语气。

Vmlinux, vmlinuz, zImage

vm stands for "virtual memory", not "virtual machine".

z stands for compressed.

vmlinux is a format of ELF, but it still has some differences from the regular ELF files. vmlinux 和普通 elf 文件的差别 linux kernel 加载简述_os 从业人员的博客 - CSDN 博客

vmlinux is ELF, which means it cannot boot directly because it is not a binary, I suppose that's why it is named "virtual memory". We can use vmlinux to debug in a virtual machine.

vmlinuz is a compressed vmlinux, usually using zlib.

Image: the generic Linux kernel binary image file.

zImage: a compressed version of the Linux kernel image that is self-extracting.

zImage is a binary, so it is used when booting because it can be executed directly in the CPU.

linux kernel - Image vs zImage vs uImage - Stack Overflow

This can also be an reference:

Linux startup process - Wikipedia

ISA names

i386 IA-32 IA-32e AMD64 IA-64 Intel 64
== 386 == x86 is a mode, not an arch == x86-64 安腾架构 == x86-64

Note:

  • Although IA-32 equals to x86, IA-64 is not equals to x86-64, instead Intel 64 equals to x86-64.

Remote: fatal: pack exceeds maximum allowed size

# Adjust the following variables as necessary
REMOTE=origin
BRANCH=$(git rev-parse --abbrev-ref HEAD)
BATCH_SIZE=5000

# check if the branch exists on the remote
if git show-ref --quiet --verify refs/remotes/$REMOTE/$BRANCH; then
    # if so, only push the commits that are not on the remote already
    range=$REMOTE/$BRANCH..HEAD
else
    # else push all the commits
    range=HEAD
fi
# count the number of commits to push
n=$(git log --first-parent --format=format:x $range | wc -l)

# push each batch
for i in $(seq $n -$BATCH_SIZE 1); do
    # get the hash of the commit to push
    h=$(git log --first-parent --reverse --format=format:%H --skip $i -n1)
    echo "Pushing $h..."
    git push $REMOTE ${h}:refs/heads/$BRANCH
done
# push the final partial batch
git push $REMOTE HEAD:refs/heads/$BRANCH

still need to input the username and password each time, so can adjust BATCH_SIZE to a bit larger.

git - Github remote push pack size exceeded - Stack Overflow

Numa node Socket Package Die Cluster SoC SMT MC

Numa node

一个 NUMA 可以有多个 socket,NUMA 和 node 是同一个概念。一个 node 是一个分组,可以有多个 CPU,共享一些内存 I/O 资源,node 之间通过 QPI 通信。

正常情况 NUMA Node 与 CPU Socket 数量是一致的。

Socket (architecture)

A socket is a physical socket where the physical CPU capsules are placed. A normal PC only has 1 socket.

Another name: CPU slot.

A CPU socket is made of plastic, and often comes with a lever or latch, and with metal contacts for each of the pins or lands on the CPU.

一个 NUMA 可以有多个 socket,NUMA 和 node 是同一个概念。一个 node 是一个分组,可以有多个 CPU,共享一些内存 I/O 资源,node 之间通过 QPI 通信。

Tag: NUMA | 猿大白

Package

Processor package is what you get when you buy a single processor. It contains one or more dies.

Socket to Package is 1:1.

Die

Chiplet 架构导致一个 package 里的 die 可能会变多。

The wafer is cut (diced) into many pieces, each containing one copy of the circuit. Each of these pieces is called a die.

Cluster

Didn't see it in x86, its may be an ARM concept. 在内核里面,所有 share L2 cache 的 core 属于一个 cluster。

MC

Kernel 里的叫法,可能当时起这个名字的原因是觉得超线程不只会有两个 core,可能会有多个线程,所以叫做 multi-core。没想到后面 SRF 竟然可能把这个取消掉了。

在内核里面,所有 share L3 cache 的 core 属于一个 MC。

SMT

就是超线程状态下的一个线程。

SoC

integrates all or most components of a computer or other electronic system.

SoCs are in contrast to the common traditional motherboard-based PC architecture, which separates components based on function and connects them through a central interfacing circuit board.

Whereas a motherboard houses and connects detachable or replaceable components, SoCs integrate all of these components into a single integrated circuit.

ARM-based:

System on a chip - Wikipedia

How does CPUID work?

CPUID implicitly uses the EAX (32bit) register to determine the main category (CPUID leaf) of information returned.

Some leaves also have sub-leaves, which are selected via the ECX register before calling CPUID. So, EAX is leaf, ECX is sub-leaf.

CPUID is unpriveleged, which means it is useable by userspace and doesn’t trap to supervisor mode.

In summary, the input is 2 registers (EAX, ECX), and the output is 4 registers (EAX, EBX, ECX, EDX).

CPUID Leaf EAX = 0: Highest Function Parameter and Manufacturer ID

When EAX = 0, it will return a 12-character ASCII string stored in EBX, EDX, ECX, you can count in "GenuineIntel" how many characters there are:)

CPUID Leaf EAX=7, ECX=0: Extended Features

Bus lock detection is here.

You can see more in CPUID - Wikipedia

Gdb

GDB operates on executable files which are binary files produced by compilation process.

gdb --args executablename arg1 arg2 arg3

breakpoint

  • b <function_name>;
  • b <linenr>: add breakpoint at current file;
  • b <file>:<linenr> add breakpoint at the specified position

next: process oneline, bypass function call.

step: process oneline, jump into function call.

print: print a variable.

examine/x: print a memory address value (pointer).

info:

  • info locals: show all local variables.
  • info breakpoints: show all breakpoints.

continue: gdb will run until your program ends, your program crashes, or gdb encounters a breakpoint.

What is a .gdbinit?

.gdbinit files can be used to always execute commands when you run GDB in a particular directory.

some interesting projects based on .gdbinit:

cyrus-and/gdb-dashboard: Modular visual interface for GDB in Python

Gdb launch a shell script

gdb --args bash <script>

但是这样会掩盖所有 thread 的内容,我们只能看到 bash 这一个 thread。建议还是先执行脚本启动那一个 binary,然后直接 attach 到一个 process 上:

gdb --pid <pid>

List all threads stack using gdb

thread apply all bt.

Check information of a thread: info thread.