Memory in Linux
0voice/kernel_memory_management: a curated collection of material on Linux kernel memory management (papers, articles, videos), plus application memory leaks and memory pools
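Programs on Linux usually request memory by calling malloc(). If memory is short, malloc() could simply return failure, so why does the kernel kill running processes instead? Because Linux allows processes to request more memory than the machine physically has (overcommit). malloc() only reserves virtual addresses: the kernel hands the program an address range, and since nothing has been written yet, the program has not received any real physical memory. Physical memory is assigned only when the program actually writes to those addresses, so a shortage shows up at page-fault time, after malloc() has already returned successfully.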
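The lazy allocation described above can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names (`rss_kb`, `reserve_then_touch`) are made up for this note, and an anonymous `mmap` stands in for a large malloc().

```python
import mmap


def rss_kb(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def current_rss_kb():
    """Read this process's resident set size from procfs (Linux only)."""
    with open("/proc/self/status") as f:
        return rss_kb(f.read())


def reserve_then_touch(size=64 * 1024 * 1024):
    """Map `size` bytes, then write one byte per page; return RSS at each step."""
    before = current_rss_kb()
    # Anonymous private mapping: the kernel only hands out an address range.
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    reserved = current_rss_kb()
    for offset in range(0, size, mmap.PAGESIZE):
        buf[offset] = 1  # the first write to a page faults in physical memory
    touched = current_rss_kb()
    buf.close()
    return before, reserved, touched
```

Calling `reserve_then_touch()` on a Linux machine should show RSS barely moving between the first two readings and growing by roughly the mapping size only after the pages have been written.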
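Where is memory that the kernel allocates for its own use accounted?
Does it count toward a user-space process's memory usage? Probably not. A VMCS, for example, is counted as kernel memory rather than against any single process.
How to measure memory used by the kernel: "Memory leak? Tracing from user space into the kernel" - Tencent Cloud Developer Community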
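Linux memory reclaim
When allocations keep arriving while the system is short of memory, the kernel may start reclaiming memory.
That is, it frees memory that can be reclaimed, such as the page cache and buffers. If memory is still insufficient after all of that has been freed, some containers (processes) have to be killed.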
kernel.org/doc/Documentation/cgroup-v1/memory.txt
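To get a feel for how much memory falls into the reclaimable category mentioned above, /proc/meminfo can be parsed. This is a rough sketch: the formula in `rough_reclaimable_kb` is a simplification chosen for this note (the kernel's own MemAvailable field is the more careful estimate), and the sample numbers are invented.

```python
def parse_meminfo(text):
    """Turn the text of /proc/meminfo into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def rough_reclaimable_kb(info):
    """Rough estimate: buffers + page cache (minus shmem) + reclaimable slab."""
    # Shmem is counted inside Cached but cannot be dropped without swap.
    return info["Buffers"] + info["Cached"] - info["Shmem"] + info["SReclaimable"]


# Invented sample in the /proc/meminfo format, for illustration only.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:         1000000 kB
MemAvailable:    9000000 kB
Buffers:          200000 kB
Cached:          8000000 kB
Shmem:            500000 kB
SReclaimable:     300000 kB
"""

sample_info = parse_meminfo(SAMPLE)
```

On a live system the same functions can be fed `open("/proc/meminfo").read()`.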
kswapd
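A kernel thread. Each NUMA memory node has its own kswapd thread (kswapd0, kswapd1, ...). To gauge memory usage, kswapd uses three thresholds (watermarks): the minimum watermark (pages_min), the low watermark (pages_low), and the high watermark (pages_high).
- Free memory below pages_min: the memory available to processes is exhausted, and only the kernel can still allocate.
- Free memory between pages_min and pages_low: memory pressure is high and little memory is left. kswapd reclaims memory until free memory rises above pages_high.
- Free memory between pages_low and pages_high: there is some memory pressure, but new allocation requests can still be satisfied.
- Free memory above pages_high: plenty of memory is free and there is no memory pressure.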
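The four cases above amount to a simple comparison against the three watermarks. A small sketch, assuming the caller already has the free page count and the watermarks (for example from /proc/zoneinfo); the function name and the returned labels are made up for this note, and how exact boundary values are treated is an assumption.

```python
def memory_pressure(free, pages_min, pages_low, pages_high):
    """Classify free memory against the min/low/high watermarks (all in pages)."""
    if not pages_min <= pages_low <= pages_high:
        raise ValueError("expected pages_min <= pages_low <= pages_high")
    if free < pages_min:
        return "exhausted"  # only the kernel may still allocate
    if free < pages_low:
        return "high"       # kswapd reclaims until free > pages_high
    if free < pages_high:
        return "moderate"   # new requests can still be satisfied
    return "none"
```

For example, with watermarks of 100/125/150 pages, 110 free pages is classified as "high".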
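The /proc/sys/vm/swappiness option adjusts how aggressively swap is used. Its value ranges from 0 to 100 (kernels from 5.8 on accept up to 200). The larger the value, the more actively swap is used, i.e. reclaim leans toward anonymous pages; the smaller the value, the more reluctantly swap is used, i.e. reclaim leans toward file pages.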
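Although kswapd is created at boot, it sleeps most of the time. It is woken only when a process's allocation cannot be satisfied because memory is short (the allocation enters the slow path), and it then reclaims memory for processes to use.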
__alloc_pages_slowpath
    wake_all_kswapds                 // path 1: entering the slow allocation path first wakes kswapd
        wakeup_kswapd
            wake_up_interruptible(&pgdat->kswapd_wait)
    __alloc_pages_direct_reclaim     // if path 1 fails, perform direct reclaim
        __perform_reclaim
            try_to_free_pages
                throttle_direct_reclaim
                    allow_direct_reclaim
                        wake_up_interruptible(&pgdat->kswapd_wait)
How the kswapd process works (1): initialization and triggering - CSDN blog
It is process agnostic, it is only interested in what pages are accessed and when (it is more complex than this of course but to keep things simple we may as well view it this way).
So the real question is "what processes have the greatest burden on memory that are causing kswapd to need to page all the time".
/proc/meminfo
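Types of memory reclaim
When memory is requested:
- If the request would exceed the memory limit of the cgroup and reclaim inside that cgroup does not free enough, one or more processes in the cgroup are selected and killed;
- If the system as a whole is short of available memory (usually because memory is oversold), direct reclaim is triggered; if that is still not enough, the OOM killer selects a process to kill (not necessarily the requesting one).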
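The main distinguishing points are when reclaim is triggered and which kinds of pages it handles.
- Fast memory reclaim:
- Direct memory reclaim (synchronous, blocks the process): system CPU utilization rises and system load increases, so direct reclaim should be avoided as much as possible.
- kswapd memory reclaim (asynchronous, in the background, does not block the process).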
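What they have in common is that they all use data structures such as the LRU lists to hold reclaim candidates, and their reclaim flows are similar.
- Anonymous pages (heap memory dynamically allocated by applications) can be moved to the swap partition;
- For file pages, dirty pages can be written back, and clean pages are simply freed.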
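Whether a system is relying on background (kswapd) or direct reclaim can be estimated from the page-scan counters in /proc/vmstat. A hedged sketch: it assumes the `pgscan_kswapd` and `pgscan_direct` counters exposed by recent kernels, and the function names are made up for this note.

```python
def parse_vmstat(text):
    """Parse 'name value' lines from /proc/vmstat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def direct_reclaim_share(before, after):
    """Fraction of pages scanned by direct reclaim between two snapshots."""
    direct = after["pgscan_direct"] - before["pgscan_direct"]
    kswapd = after["pgscan_kswapd"] - before["pgscan_kswapd"]
    total = direct + kswapd
    return direct / total if total else 0.0


# Invented snapshots, for illustration only.
t0 = parse_vmstat("pgscan_kswapd 1000\npgscan_direct 100\n")
t1 = parse_vmstat("pgscan_kswapd 1300\npgscan_direct 200\n")
share = direct_reclaim_share(t0, t1)
```

A share that stays well above zero suggests processes are often blocked in direct reclaim, which is the situation the list above recommends avoiding.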
Direct memory reclaim
OOM Killer
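The allocation first goes through direct reclaim; if memory is still insufficient after reclaim, an OOM kill happens.
When a Linux system runs out of memory, the OOM killer kills a running process to free some memory.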
Linux memory accounting
Cache, shmem
VSS/RSS/PSS/USS
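As a reminder of what the four terms usually mean: VSS is the total virtual address space mapped, RSS the pages currently resident in RAM, PSS the resident pages with shared ones divided among their sharers, and USS the pages private to the process. The sketch below derives RSS, PSS, and USS from text in the format of /proc/<pid>/smaps_rollup; computing USS as Private_Clean + Private_Dirty is the common convention rather than a kernel-reported field, and the sample numbers are invented.

```python
def parse_kb_fields(text):
    """Collect 'Key:   N kB' lines (as found in smaps_rollup) into a dict."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0].endswith(":") and parts[2] == "kB":
            fields[parts[0][:-1]] = int(parts[1])
    return fields


def rss_pss_uss(fields):
    """Return (RSS, PSS, USS) in kB from parsed smaps_rollup fields."""
    uss = fields["Private_Clean"] + fields["Private_Dirty"]
    return fields["Rss"], fields["Pss"], uss


# Invented sample, for illustration only.
SAMPLE = """\
Rss:                5000 kB
Pss:                3500 kB
Shared_Clean:       2000 kB
Shared_Dirty:          0 kB
Private_Clean:      1000 kB
Private_Dirty:      2000 kB
"""

rss, pss, uss = rss_pss_uss(parse_kb_fields(SAMPLE))
```

In this sample USS <= PSS <= RSS, which is the ordering one should normally expect.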
Per-process memory usage
Is a process's memory usage split into kernel-mode and user-mode parts?
OOM troubleshooting approach
First check the kernel log (journal / dmesg) for the time of the incident (it may already have been rotated away):
cat /var/log/messages
# check where core dumps are saved
cat /proc/sys/kernel/core_pattern
journalctl -k --since "2024-12-03 23:28:11" --until "2024-12-04 00:30:00"
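Process OOM logs
A very direct way to check:
sudo grep reaped kern-20241208
A complete log entry is shown further below. It consists of the following parts:
- the call trace, which shows the path that led to the OOM;
- the memory state;
- the state of every process in the cgroup, with its oom_score_adj;
- finally, which process was selected and killed: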
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,mems_allowed=0-1,oom_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83,task_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83/54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,task=java,pid=161175,uid=2405
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: oom_reaper: reaped process 161175 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
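The two summary lines above are the quickest way to see who was killed and why. A small sketch for pulling the key fields out of them with regular expressions; the patterns are written against the memcg-constrained lines shown here and are an assumption about the format, not a general parser for every kernel version.

```python
import re

OOM_KILL_RE = re.compile(
    r"oom-kill:constraint=(?P<constraint>\w+),.*?"
    r"oom_memcg=(?P<oom_memcg>[^,]+),.*?"
    r"task=(?P<task>[^,]+),pid=(?P<pid>\d+),uid=(?P<uid>\d+)"
)
REAPED_RE = re.compile(r"oom_reaper: reaped process (?P<pid>\d+) \((?P<task>[^)]+)\)")


def parse_oom_kill(line):
    """Return constraint, memcg, task, pid, uid from an oom-kill line, or None."""
    m = OOM_KILL_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["uid"] = int(fields["uid"])
    return fields


def parse_reaped(line):
    """Return (pid, task) from an oom_reaper line, or None."""
    m = REAPED_RE.search(line)
    return (int(m.group("pid")), m.group("task")) if m else None


# The two lines from the log above (timestamp and host prefix dropped).
KILL_LINE = (
    "kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),"
    "cpuset=54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,"
    "mems_allowed=0-1,"
    "oom_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83,"
    "task_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83/"
    "54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,"
    "task=java,pid=161175,uid=2405"
)
REAPED_LINE = (
    "kernel: oom_reaper: reaped process 161175 (java), "
    "now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB"
)

kill = parse_oom_kill(KILL_LINE)
reaped = parse_reaped(REAPED_LINE)
```

Here constraint=CONSTRAINT_MEMCG indicates that the kill was triggered by a cgroup limit (the pod's memcg) rather than by the whole machine running out of memory.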
The following log shows three processes being killed in a row:
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: argusagent invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-997
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: CPU: 88 PID: 154533 Comm: argusagent Kdump: loaded Tainted: G OE K 5.10.112-005.ali5000.al8.aarch64 #1
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.2.M1.AL.P.157.00 07/29/2023
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: Call trace:
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: dump_backtrace+0x0/0x1e0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: show_stack+0x1c/0x24
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: dump_stack+0xcc/0x120
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: dump_header+0x3c/0x44
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: dump_memcg_header+0x20/0x58
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: oom_kill_process+0x26c/0x274
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: out_of_memory+0x100/0x3d0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: mem_cgroup_out_of_memory+0x128/0x140
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: try_charge+0x544/0x5c0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: mem_cgroup_charge+0x80/0x27c
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: __add_to_page_cache_locked+0x290/0x4c0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: add_to_page_cache_lru+0x58/0xf4
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: pagecache_get_page+0x240/0x3f0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: filemap_fault+0x544/0x724
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: ext4_filemap_fault+0x38/0x980
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: __do_fault+0x40/0x1f4
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: do_read_fault+0x64/0x36c
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: do_fault+0x8c/0x180
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: handle_pte_fault+0x84/0x234
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: __handle_mm_fault+0x1d8/0x390
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: handle_mm_fault+0xa0/0x200
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: do_page_fault+0x16c/0x3c0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: do_translation_fault+0xac/0xc8
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: do_mem_abort+0x44/0xa0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: el0_ia+0x68/0xdc
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: el0_sync_handler+0x90/0xb0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: el0_sync+0x148/0x180
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: memory: usage 8388392kB, limit 8388608kB, failcnt 82406
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: memory+swap: usage 8388392kB, limit 9007199254740988kB, failcnt 0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: Memory cgroup stats for /kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83:
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: anon 8585138176
file 11624448
kernel_stack 0
percpu 0
sock 0
shmem 0
file_mapped 3784704
file_dirty 6758400
file_writeback 0
anon_thp 1730150400
file_thp 0
shmem_thp 0
inactive_anon 8589103104
active_anon 0
inactive_file 0
active_file 0
unevictable 0
slab_reclaimable 0
slab_unreclaimable 0
slab 0
workingset_refault_anon 0
workingset_refault_file 2321768
workingset_activate_anon 0
workingset_activate_file 261457
workingset_restore_anon 0
workingset_restore_file 20474
workingset_nodereclaim 0
pgfault 10609479070
pgmajfault 55256
pgrefill 4491117
pgscan 59669019
pgsteal 52371733
pgactivate 796306058
pgdeactivate 3688371
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 6925
thp_collapse_alloc 1735
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: Tasks state (memory values in pages):
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 134996] 0 134996 201 1 32768 0 -998 pause
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 135440] 0 135440 920 313 36864 0 -997 ops_container_i
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 151151] 0 151151 173921 1797 163840 0 -997 logagent
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 151753] 0 151753 81791 25909 409600 0 -997 logagent-collec
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 151759] 0 151759 21487 1213 118784 0 -997 logagent-collec
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 170234] 0 170234 773586 2704 528384 0 -997 staragentd
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 7252] 1000 7252 1032130 70899 1024000 0 -997 java
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 169078] 0 169078 187481 5126 184320 0 -997 dfget
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 103530] 0 103530 571 88 36864 0 -997 sleep
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 113209] 0 113209 27441 703 65536 0 -997 su
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 113211] 0 113211 26770 643 57344 0 -997 cli.sh
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 113230] 0 113230 26468 89 53248 0 -997 sleep
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 145463] 0 145463 5220 3008 73728 0 -997 systemd
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 146939] 0 146939 4450 918 65536 0 -1000 sshd
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 147305] 0 147305 26993 538 53248 0 -997 crond
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 147509] 0 147509 842691 3611 479232 0 -997 staragentd
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 150780] 0 150780 41654 1390 360448 0 -997 rasp-daemon
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 154033] 0 154033 612567 1852 524288 0 -997 argusagent
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 155097] 0 155097 7419 1203 65536 0 -997 ilogtail
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 155098] 0 155098 86251 10704 315392 0 -997 ilogtail
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 161175] 2405 161175 2129386 1137647 13078528 0 -997 java
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 196134] 0 196134 35092 1638 200704 0 -997 syslog-ng
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 196516] 0 196516 5483 1005 77824 0 -997 systemd-journal
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 211042] 0 211042 287770 4936 438272 0 -997 uniagent
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 113407] 0 113407 27442 688 65536 0 -997 su
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 113415] 0 113415 26770 625 49152 0 -997 cli.sh
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 121483] 0 121483 567231 415853 3461120 0 -997 python2.7
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: [ 121604] 0 121604 439169 410811 3371008 0 -997 rpm
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,mems_allowed=0-1,oom_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83,task_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83/54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,task=java,pid=161175,uid=2405
Dec 4 00:01:23 phyhost-ecs-ali033057255136.na610 kernel: oom_reaper: reaped process 161175 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: staragentd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-997
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: CPU: 79 PID: 170287 Comm: staragentd Kdump: loaded Tainted: G OE K 5.10.112-005.ali5000.al8.aarch64 #1
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.2.M1.AL.P.157.00 07/29/2023
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: Call trace:
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: dump_backtrace+0x0/0x1e0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: show_stack+0x1c/0x24
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: dump_stack+0xcc/0x120
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: dump_header+0x3c/0x44
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: dump_memcg_header+0x20/0x58
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: oom_kill_process+0x26c/0x274
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: out_of_memory+0x100/0x3d0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: mem_cgroup_out_of_memory+0x128/0x140
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: try_charge+0x544/0x5c0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: mem_cgroup_charge+0x80/0x27c
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: __add_to_page_cache_locked+0x290/0x4c0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: add_to_page_cache_lru+0x58/0xf4
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: pagecache_get_page+0x240/0x3f0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: filemap_fault+0x544/0x724
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: ext4_filemap_fault+0x38/0x980
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: __do_fault+0x40/0x1f4
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: do_read_fault+0x64/0x36c
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: do_fault+0x8c/0x180
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: handle_pte_fault+0x84/0x234
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: __handle_mm_fault+0x1d8/0x390
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: handle_mm_fault+0xa0/0x200
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: do_page_fault+0x16c/0x3c0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: do_translation_fault+0xac/0xc8
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: do_mem_abort+0x44/0xa0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: el0_ia+0x68/0xdc
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: el0_sync_handler+0x90/0xb0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: el0_sync+0x148/0x180
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: memory: usage 8388600kB, limit 8388608kB, failcnt 138540
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: memory+swap: usage 8388600kB, limit 9007199254740988kB, failcnt 0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: Memory cgroup stats for /kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83:
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: anon 8588054528
file 10137600
kernel_stack 0
percpu 0
sock 0
shmem 0
file_mapped 3108864
file_dirty 6893568
file_writeback 0
anon_thp 0
file_thp 0
shmem_thp 0
inactive_anon 8591228928
active_anon 0
inactive_file 180224
active_file 0
unevictable 0
slab_reclaimable 0
slab_unreclaimable 0
slab 0
workingset_refault_anon 0
workingset_refault_file 2561744
workingset_activate_anon 0
workingset_activate_file 261919
workingset_restore_anon 0
workingset_restore_file 20474
workingset_nodereclaim 0
pgfault 10610763133
pgmajfault 64595
pgrefill 4619575
pgscan 77639847
pgsteal 52617757
pgactivate 796428620
pgdeactivate 3811580
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 6925
thp_collapse_alloc 1735
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: Tasks state (memory values in pages):
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 134996] 0 134996 201 1 32768 0 -998 pause
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 135440] 0 135440 920 313 36864 0 -997 ops_container_i
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 151151] 0 151151 173921 1797 163840 0 -997 logagent
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 151753] 0 151753 85889 25934 417792 0 -997 logagent-collec
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 151759] 0 151759 21487 1379 118784 0 -997 logagent-collec
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 170234] 0 170234 773586 2678 528384 0 -997 staragentd
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 7252] 1000 7252 1032130 70899 1024000 0 -997 java
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 103530] 0 103530 571 88 36864 0 -997 sleep
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 113209] 0 113209 27441 703 65536 0 -997 su
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 113211] 0 113211 26770 643 57344 0 -997 cli.sh
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 140894] 0 140894 1286 259 45056 0 -997 python2.7
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 145463] 0 145463 5220 3015 73728 0 -997 systemd
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 146939] 0 146939 4450 884 65536 0 -1000 sshd
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 147305] 0 147305 26993 565 53248 0 -997 crond
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 147509] 0 147509 842691 3627 479232 0 -997 staragentd
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 150780] 0 150780 41654 1493 360448 0 -997 rasp-daemon
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 154033] 0 154033 649433 1751 544768 0 -997 argusagent
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 155097] 0 155097 7419 1203 65536 0 -997 ilogtail
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 155098] 0 155098 86251 11666 315392 0 -997 ilogtail
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 196134] 0 196134 35092 1643 200704 0 -997 syslog-ng
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 196516] 0 196516 5483 1031 77824 0 -997 systemd-journal
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 211042] 0 211042 287770 4937 438272 0 -997 uniagent
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 113407] 0 113407 27442 688 65536 0 -997 su
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 113415] 0 113415 26770 625 49152 0 -997 cli.sh
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 121483] 0 121483 1142190 990715 8065024 0 -997 python2.7
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: [ 121604] 0 121604 1007812 979443 7925760 0 -997 rpm
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=3f728405d7e4e3578281464d5de9c9f697c7623bf6084488e43ac18a060798be,mems_allowed=0-1,oom_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83,task_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83/54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,task=python2.7,pid=121483,uid=0
Dec 4 00:02:19 phyhost-ecs-ali033057255136.na610 kernel: oom_reaper: reaped process 121483 (python2.7), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: LogProcess-0 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-997
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: CPU: 118 PID: 211259 Comm: LogProcess-0 Kdump: loaded Tainted: G OE K 5.10.112-005.ali5000.al8.aarch64 #1
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.2.M1.AL.P.157.00 07/29/2023
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: Call trace:
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: dump_backtrace+0x0/0x1e0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: show_stack+0x1c/0x24
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: dump_stack+0xcc/0x120
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: dump_header+0x3c/0x44
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: dump_memcg_header+0x20/0x58
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: oom_kill_process+0x26c/0x274
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: out_of_memory+0x100/0x3d0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: mem_cgroup_out_of_memory+0x128/0x140
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: try_charge+0x544/0x5c0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: mem_cgroup_charge+0x80/0x27c
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: __add_to_page_cache_locked+0x290/0x4c0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: add_to_page_cache_lru+0x58/0xf4
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: pagecache_get_page+0x240/0x3f0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: filemap_fault+0x544/0x724
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: ext4_filemap_fault+0x38/0x980
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: __do_fault+0x40/0x1f4
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: do_read_fault+0x64/0x36c
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: do_fault+0x8c/0x180
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: handle_pte_fault+0x84/0x234
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: __handle_mm_fault+0x1d8/0x390
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: handle_mm_fault+0xa0/0x200
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: do_page_fault+0x16c/0x3c0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: do_translation_fault+0xac/0xc8
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: do_mem_abort+0x44/0xa0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: el0_ia+0x68/0xdc
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: el0_sync_handler+0x90/0xb0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: el0_sync+0x148/0x180
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: memory: usage 8388536kB, limit 8388608kB, failcnt 265013
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: memory+swap: usage 8388536kB, limit 9007199254740988kB, failcnt 0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: Memory cgroup stats for /kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83:
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: anon 8591028224
file 9867264
kernel_stack 0
percpu 0
sock 0
shmem 0
file_mapped 2162688
file_dirty 6217728
file_writeback 0
anon_thp 0
file_thp 0
shmem_thp 0
inactive_anon 8594743296
active_anon 0
inactive_file 0
active_file 0
unevictable 0
slab_reclaimable 0
slab_unreclaimable 0
slab 0
workingset_refault_anon 0
workingset_refault_file 2948207
workingset_activate_anon 0
workingset_activate_file 263569
workingset_restore_anon 0
workingset_restore_file 21728
workingset_nodereclaim 0
pgfault 10611964828
pgmajfault 79775
pgrefill 4850797
pgscan 107324432
pgsteal 53009265
pgactivate 796650413
pgdeactivate 4036136
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 6925
thp_collapse_alloc 1735
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: Tasks state (memory values in pages):
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 134996] 0 134996 201 1 32768 0 -998 pause
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 135440] 0 135440 920 313 36864 0 -997 ops_container_i
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 151151] 0 151151 173921 1797 163840 0 -997 logagent
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 151753] 0 151753 85889 25953 417792 0 -997 logagent-collec
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 151759] 0 151759 21487 1373 118784 0 -997 logagent-collec
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 170234] 0 170234 773586 2685 528384 0 -997 staragentd
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 7252] 1000 7252 1032130 70900 1024000 0 -997 java
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 103530] 0 103530 571 88 36864 0 -997 sleep
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 145463] 0 145463 5220 3015 73728 0 -997 systemd
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 146939] 0 146939 4450 782 65536 0 -1000 sshd
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 147305] 0 147305 26993 565 53248 0 -997 crond
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 147509] 0 147509 842691 3598 479232 0 -997 staragentd
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 150780] 0 150780 41654 1517 360448 0 -997 rasp-daemon
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 154033] 0 154033 649433 1893 544768 0 -997 argusagent
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 155097] 0 155097 7419 1203 65536 0 -997 ilogtail
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 155098] 0 155098 86251 11666 315392 0 -997 ilogtail
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 196134] 0 196134 35092 1557 200704 0 -997 syslog-ng
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 196516] 0 196516 5483 1030 77824 0 -997 systemd-journal
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 211042] 0 211042 287770 4941 438272 0 -997 uniagent
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: [ 121604] 0 121604 1997684 1969310 15863808 0 -997 rpm
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,mems_allowed=0-1,oom_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83,task_memcg=/kubepods/podcabf8137-c743-4923-96bd-1e59c6ac4a83/54966a6c9d0c88bbc8d9eaa6426451653613cece11693c665330ebc5b4d02719,task=rpm,pid=121604,uid=0
Dec 4 00:03:49 phyhost-ecs-ali033057255136.na610 kernel: oom_reaper: reaped process 121604 (rpm), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
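The "Tasks state" tables in the log above are easier to compare once parsed. In the three dumps the rss of rpm (pid 121604) grows from 410811 to 979443 to 1969310 pages, which suggests it is the process actually consuming the pod's memory. The sketch below parses such rows and finds the largest task; the 4 KiB page size is an assumption (it is consistent with the rss totals roughly matching the reported usage), and the function names are made up for this note.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages on this host

TASK_RE = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables_bytes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>.+)"
)


def parse_tasks(log_text):
    """Parse 'Tasks state' rows; memory columns are in pages, as the header says."""
    tasks = []
    for line in log_text.splitlines():
        m = TASK_RE.search(line)
        if m:
            row = {k: (v.strip() if k == "name" else int(v))
                   for k, v in m.groupdict().items()}
            tasks.append(row)
    return tasks


def biggest(tasks):
    """Return the task with the largest resident set."""
    return max(tasks, key=lambda t: t["rss"])


# Three rows copied from the last dump above (timestamp and host prefix dropped).
SAMPLE = """\
kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
kernel: [ 151753] 0 151753 85889 25953 417792 0 -997 logagent-collec
kernel: [ 7252] 1000 7252 1032130 70900 1024000 0 -997 java
kernel: [ 121604] 0 121604 1997684 1969310 15863808 0 -997 rpm
"""

top = biggest(parse_tasks(SAMPLE))
top_rss_kib = top["rss"] * PAGE_KIB
```

Under the 4 KiB assumption, rpm alone holds 7877240 KiB of the pod's 8388608 kB limit at the time of the third kill.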
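Process core dump / vmcore / kdump
kdump is the kernel crash-dump mechanism. Its user-space part is a systemd service that loads a capture kernel, which takes over after a kernel panic and saves the crashed kernel's memory.
A core dump is the memory image of a single user-space process, written when that process crashes; a vmcore is the memory image of the crashed kernel saved by kdump. Both are dump files, but they are not the same thing.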
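Does killing a process with kill -9 produce a core dump? An experiment shows that it does not: the default action of SIGKILL only terminates the process. So when is a core dump produced? If every signal keeps its default action, the signals that produce a core dump when delivered include SIGQUIT, SIGILL, SIGABRT, SIGFPE, SIGSEGV, SIGBUS, SIGSYS, and SIGTRAP.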
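On many Linux systems core dump files are not generated by default. Check with ulimit -c: an output of 0 means no core dump file is generated, and an output of unlimited means the size of the file is not limited.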
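The same limit can be inspected from inside a program. A minimal sketch using Python's standard `resource` module (Unix only); note that `getrlimit` reports bytes, while the shell's ulimit -c prints blocks, and the function names are made up for this note.

```python
import resource


def describe_core_limit(soft):
    """Describe a soft RLIMIT_CORE value the way ulimit -c would be read."""
    if soft == resource.RLIM_INFINITY:
        return "unlimited"
    if soft == 0:
        return "disabled"
    return f"up to {soft} bytes"


def current_core_limit():
    """Describe the core file size limit of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return describe_core_limit(soft)
```

A process may raise its own soft limit up to the hard limit with `resource.setrlimit`, which has the same effect as running ulimit -c in the shell that starts it.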
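Compile with the -g option so that a debugger can map the core dump back to source lines. The -g option itself does not decide whether a core dump file is written when the program crashes; that depends on the limit above and on core_pattern.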
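The naming pattern of the core dump file is set in /proc/sys/kernel/core_pattern.
Generating core dump files on Linux and locating errors - CSDN blog
Who saves this file, and where does it usually go? The kernel writes it according to core_pattern: a pattern that is not an absolute path is relative to the crashing process's working directory, and a pattern that starts with | pipes the dump to a user-space helper program instead.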
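A small helper makes the core_pattern cases explicit. This is a sketch based on the behavior described in core(5); the function name and the wording of the returned strings are made up for this note, and the sample values are only examples.

```python
def describe_core_pattern(pattern):
    """Explain where a given kernel.core_pattern value sends core dumps."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        parts = pattern[1:].split()
        helper = parts[0] if parts else "(none)"
        # The kernel pipes the dump to this program's standard input.
        return f"piped to helper program {helper}"
    if pattern.startswith("/"):
        return f"written to absolute path template {pattern}"
    return f"written relative to the crashing process's working directory as {pattern}"


def current_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read and describe the live setting (Linux only)."""
    with open(path) as f:
        return describe_core_pattern(f.read())
```

For instance, `describe_core_pattern("core")` reports a file in the working directory, while a value beginning with `|/usr/lib/systemd/systemd-coredump` means the dump is handed to systemd-coredump and has to be retrieved through that tool.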
Memcg OOM Strategy / oom_score
If a memcg runs into OOM, how is it decided which process in that memcg gets killed?
You can see the oom_score of each of the processes in the /proc filesystem under the pid directory:
cat /proc/10292/oom_score
The higher the value of oom_score of any process, the higher is its likelihood of getting killed.
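- oom_score and oom_score_adj: every process has an oom_score that expresses how likely it is to be killed; the higher the score, the more likely the process is killed in an OOM event. A system administrator can influence a process's OOM score through oom_score_adj, an adjustable weight ranging from -1000 to 1000; a lower value makes the process less likely to be chosen as the OOM victim.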
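To look at these values for every process at once, the per-pid files under /proc can be scanned. A minimal sketch, assuming a Linux /proc layout (`comm`, `oom_score`, `oom_score_adj` under each pid directory); the function names are made up for this note.

```python
from pathlib import Path


def read_oom_fields(pid, proc_root="/proc"):
    """Read name, oom_score and oom_score_adj for one pid."""
    base = Path(proc_root) / str(pid)
    return {
        "pid": int(pid),
        "name": (base / "comm").read_text().strip(),
        "oom_score": int((base / "oom_score").read_text()),
        "oom_score_adj": int((base / "oom_score_adj").read_text()),
    }


def scan(proc_root="/proc"):
    """Collect OOM fields for every process that can be read."""
    entries = []
    for entry in Path(proc_root).iterdir():
        if entry.name.isdigit():
            try:
                entries.append(read_oom_fields(entry.name, proc_root))
            except OSError:
                continue  # the process exited or is not readable
    return entries


def most_likely_victims(entries, n=5):
    """Sort by oom_score, highest (most likely to be killed) first."""
    return sorted(entries, key=lambda e: e["oom_score"], reverse=True)[:n]
```

`most_likely_victims(scan())` then lists the processes the OOM killer would currently favor, assuming nothing changes in the meantime.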
How is the oom_score calculated?
The calculation turns into a simple question of what percentage of the available memory is being used by the process. If the system as a whole is short of memory, then "available memory" is the sum of all RAM and swap space available to the system.
If instead, the OOM situation is caused by exhausting the memory allowed to a given cpuset/control group, then "available memory" is the total amount allocated to that control group. A similar calculation is made if limits imposed by a memory policy have been exceeded. In each case, the memory use of the process is deemed to be the sum of its resident set (the number of RAM pages it is using) and its swap usage.
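Suppose 500G of memory is available and one process uses 100G, i.e. 20%; and a process in a memcg limited to 10G uses only 3G, i.e. 30%. Which one is killed first? The two are not ranked against each other: a memcg OOM only considers the processes inside that memcg, while a system-wide OOM scores every process against the whole system's memory.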
This calculation produces a percent-times-ten number as a result; a process which is using every byte of the memory available to it will have a score of 1000, while a process using no memory at all will get a score of zero. There are very few heuristic tweaks to this score, but the code does still subtract a small amount (30) from the score of root-owned processes on the notion that they are slightly more valuable than user-owned processes.
One other tweak which is applied is to add the value stored in each process's oom_score_adj variable, which can be adjusted via /proc. This knob allows the adjustment of each process's attractiveness to the OOM killer in user space; setting it to -1000 will disable OOM kills entirely, while setting to +1000 is the equivalent of painting a large target on the associated process.
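The scoring described in the quoted text can be written down as a few lines of arithmetic. This is only a sketch of that description, not the kernel's code: real kernels differ in details between versions, the function name is made up, and the result is deliberately left unclamped so that the ordering stays visible when oom_score_adj is strongly negative.

```python
def oom_score_sketch(used, available, oom_score_adj=0, root_owned=False):
    """Percent-times-ten score: share of available memory, plus adjustments.

    `used` is resident memory plus swap usage; `available` is RAM plus swap
    for a system-wide OOM, or the cgroup limit for a memcg OOM. Any unit
    works as long as both arguments use the same one.
    """
    score = 1000 * used // available
    if root_owned:
        score -= 30  # small bonus for root-owned processes, per the quoted text
    return score + oom_score_adj


# The two cases from the note: 100G of 500G, and 3G of a 10G memcg.
system_case = oom_score_sketch(100, 500)
memcg_case = oom_score_sketch(3, 10)

# rpm at the third kill in the log above: 1969310 rss pages against the
# pod limit of 8388608 kB, i.e. 2097152 pages assuming 4 KiB pages.
rpm_case = oom_score_sketch(1969310, 8388608 // 4)
```

The two cases from the note score 200 and 300, but each score is only meaningful within its own OOM domain, so they never compete with each other.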
User-space out-of-memory handling
When an OOM occurs, user space can also step in.
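With cgroup v1, one hook for this is the memory.oom_control file described in the cgroup-v1 memory.txt document linked earlier: it reports whether the kernel OOM killer is disabled for the group and whether the group is currently under OOM. The sketch below only parses that file's text; the mount path in the comment is an assumption about a typical cgroup v1 layout, and the `oom_kill` counter is only present on newer kernels.

```python
def parse_oom_control(text):
    """Parse 'key value' lines of a cgroup v1 memory.oom_control file."""
    state = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            state[parts[0]] = int(parts[1])
    return state


def waiting_for_userspace(state):
    """True when the kernel OOM killer is off and the group is stuck under OOM."""
    return state.get("oom_kill_disable") == 1 and state.get("under_oom") == 1


# Typical location on a cgroup v1 host (assumption):
#   /sys/fs/cgroup/memory/<group>/memory.oom_control
SAMPLE = "oom_kill_disable 1\nunder_oom 1\noom_kill 3\n"
sample_state = parse_oom_control(SAMPLE)
```

When `waiting_for_userspace` returns True, tasks in the group are paused until a user-space agent frees memory, raises the limit, or kills something itself.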