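Problem description: On the NT98520A platform, while the service workload is running, the CPU usage of the kswapd0 thread frequently climbs until total CPU usage reaches 100% and the device hangs. Operating the device while it is this sluggish easily triggers an OOM, and at the time of the OOM the device's available memory is still above the watermark.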
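Initial analysis: this thread should only become busy during memory reclaim, i.e. reclaim starts once free memory drops below the low watermark and stops once free memory is back above the high watermark. Across several captures, however, kswapd0 was already using 20%~40% CPU while 13~14M of memory was still free. The watermarks in the device's current zoneinfo are min 4M, low 5M and high 6M.
Please help analyze the cause of this behavior and suggest directions for optimization. Thanks!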
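For reference, the watermark figures quoted above can be reproduced from the page counts in the zoneinfo dump. The sketch below is only a simplified model of the wake/stop rule described in the initial analysis, not the kernel's actual logic. The 4 KiB page size is an assumption, although it is consistent with `min 1024` pages appearing as `min:4096kB` in the OOM report. The names `pages_to_mib` and `kswapd_should_run` are made up for illustration.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages (min 1024 pages == 4096 kB in the OOM log)

def pages_to_mib(pages, page_kib=PAGE_KIB):
    """Convert a page count to MiB."""
    return pages * page_kib / 1024

# Watermarks of zone Normal in pages, copied from the zoneinfo dump.
watermarks = {"min": 1024, "low": 1280, "high": 1536}
watermarks_mib = {name: pages_to_mib(p) for name, p in watermarks.items()}

def kswapd_should_run(free_pages, running):
    """Simplified hysteresis: wake below low, keep reclaiming until above high."""
    if free_pages < watermarks["low"]:
        return True
    if free_pages > watermarks["high"]:
        return False
    return running  # between low and high: keep the current state

print(watermarks_mib)
print(pages_to_mib(3315), kswapd_should_run(3315, running=False))
```

Under this simplified model, the 3315 free pages reported in zoneinfo are well above the high watermark, so kswapd0 would be expected to be idle. That is exactly the mismatch the question is about.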
Relevant logs:
1-1. zoneinfo
[17:31:00:260]Node 0, zone Normal
[17:31:00:260] per-node stats
[17:31:00:260] nr_inactive_anon 23063
[17:31:00:261] nr_active_anon 25967
[17:31:00:266] nr_inactive_file 216
[17:31:00:274] nr_active_file 121
[17:31:00:327] nr_unevictable 5409
[17:31:00:327] nr_slab_reclaimable 2793
[17:31:00:328] nr_slab_unreclaimable 3814
[17:31:00:329] nr_isolated_anon 0
[17:31:00:362] nr_isolated_file 11
[17:31:00:362] workingset_refault 2238276
[17:31:00:362] workingset_activate 562144
[17:31:00:362] workingset_nodereclaim 1680
[17:31:00:363] nr_anon_pages 24792
[17:31:00:363] nr_mapped 4379
[17:31:00:363] nr_file_pages 29999
[17:31:00:363] nr_dirty 0
[17:31:00:363] nr_writeback 0
[17:31:00:363] nr_writeback_temp 0
[17:31:00:365] nr_shmem 24241
[17:31:00:365] nr_shmem_hugepages 0
[17:31:00:365] nr_shmem_pmdmapped 0
[17:31:00:366] nr_anon_transparent_hugepages 0
[17:31:00:366] nr_unstable 0
[17:31:00:366] nr_vmscan_write 252
[17:31:00:366] nr_vmscan_immediate_reclaim 446
[17:31:00:367] nr_dirtied 6635
[17:31:00:368] nr_written 6634
[17:31:00:368] pages free 3315
[17:31:00:370] min 1024
[17:31:00:388] low 1280
[17:31:00:388] high 1536
[17:31:00:389] spanned 75776
[17:31:00:407] present 75776
[17:31:00:407] managed 69813
[17:31:00:407] protection: (0, 0, 0)
[17:31:00:408] nr_free_pages 3315
[17:31:00:413] nr_zone_inactive_anon 23063
[17:31:00:426] nr_zone_active_anon 25988
[17:31:00:426] nr_zone_inactive_file 201
[17:31:00:426] nr_zone_active_file 178
[17:31:00:437] nr_zone_unevictable 5409
[17:31:00:437] nr_zone_write_pending 0
[17:31:00:437] nr_mlock 0
[17:31:00:437] nr_page_table_pages 344
[17:31:00:438] nr_kernel_stack 1768
[17:31:00:438] nr_bounce 0
[17:31:00:438] nr_free_cma 1018
[17:31:00:438] pagesets
[17:31:00:438] cpu: 0
[17:31:00:438] count: 11
[17:31:00:439] high: 90
[17:31:00:439] batch: 15
[17:31:00:439] vm stats threshold: 125
[17:31:00:439] node_unreclaimable: 0
[17:31:00:439] start_pfn: 0
[17:31:00:439]Node 0, zone HighMem
[17:31:00:439] pages free 0
[17:31:00:439] min 32
[17:31:00:439] low 32
[17:31:00:439] high 32
[17:31:00:439] spanned 0
[17:31:00:439] present 0
[17:31:00:440] managed 0
[17:31:00:441] protection: (0, 0, 0)
[17:31:00:446]Node 0, zone Movable
[17:31:00:446] pages free 0
[17:31:00:446] min 32
[17:31:00:446] low 32
[17:31:00:446] high 32
[17:31:00:446] spanned 0
[17:31:00:446] present 0
[17:31:00:446] managed 0
[17:31:00:447] protection: (0, 0, 0)
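When reading a dump like this, it can help to derive a few numbers from it: how much of the page cache sits on the file LRU lists, how much of it is shmem, and how many free pages remain once free CMA pages are set aside. The sketch below parses a handful of lines copied from the zone Normal section above. The regular expression and the helper name `parse_counters` are illustrative assumptions, and the 4 KiB page size is assumed as well.

```python
import re

# A few lines copied from the zone Normal dump (timestamps kept as logged).
ZONEINFO_SAMPLE = """\
[17:31:00:266] nr_inactive_file 216
[17:31:00:274] nr_active_file 121
[17:31:00:327] nr_unevictable 5409
[17:31:00:363] nr_file_pages 29999
[17:31:00:365] nr_shmem 24241
[17:31:00:368] pages free 3315
[17:31:00:438] nr_free_cma 1018
"""

# "[timestamp] name value", with an optional leading "pages" as in "pages free 3315".
LINE_RE = re.compile(r"^\[[\d:]+\]\s*(?:pages\s+)?(\w+)\s+(\d+)\s*$")

def parse_counters(text):
    """Return {counter_name: value}, keeping the first occurrence of each name."""
    counters = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            counters.setdefault(m.group(1), int(m.group(2)))
    return counters

c = parse_counters(ZONEINFO_SAMPLE)
file_lru_pages = c["nr_active_file"] + c["nr_inactive_file"]
file_lru_kib = file_lru_pages * 4                       # assumes 4 KiB pages
shmem_share = c["nr_shmem"] / c["nr_file_pages"]
free_non_cma_pages = c["free"] - c["nr_free_cma"]

print(file_lru_pages, file_lru_kib, round(shmem_share, 2), free_non_cma_pages)
```

In this capture only a few hundred pages are on the file LRU lists, while most of `nr_file_pages` is shmem. Whether that is what keeps kswapd0 busy (the large `workingset_refault` count might point the same way) would still need to be confirmed on the device.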
1-2. cma_info
[root@dvrdvs app] # cat /proc/fmem/cma_info
================= NVT CMA INFO =================
Area:0 Name: nvt_cma Total size: 4194304
physical address: 0x00400000@0x05000000
Device_Name: , Type: NVT_FMEM_ALLOC_CACHE, Physical addr:0x05000000, Virtual addr: 0x85000000, size: 4096
Device_Name: , Type: NVT_FMEM_ALLOC_CACHE, Physical addr:0x05001000, Virtual addr: 0x85001000, size: 20480
CMA managed >> Used/Total size: 24576(bytes)/4194304(bytes), 6(pages)
Fmem managed >> Used/Total size: 24576(bytes)/4194304(bytes)
2. top output while kswapd0 is spiking
[17:00:53:076]Mem: 265052K used, 14200K free, 96872K shrd, 84K buff, 119656K cached (14M still free, but kswapd0 is already reclaiming memory)
[17:00:53:142]CPU: 6.6% usr 79.3% sys 0.0% nic 0.0% idle 13.0% io 0.0% irq 0.9% sirq
[17:00:53:169]Load average: 47.97 28.16 11.92 2/216 916
[17:00:53:189] PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
[17:00:53:195] 710 1 root S 810m296.9 0 25.2 [hicore]
[17:00:53:195] 45 2 root RW 0 0.0 0 21.8 [kswapd0]
[17:00:53:198] 6 2 root DW< 0 0.0 0 10.8 [kworker/0:0H+kb]
[17:00:53:198] 803 2 root DW 0 0.0 0 7.5 [kdf_ipp_tsk]
[17:00:53:202] 712 709 root S 175m 64.4 0 2.8 [uicore]
[17:00:53:202] 810 2 root DW 0 0.0 0 2.1 [isf_vext_tsk]
[17:00:53:203] 798 2 root DW 0 0.0 0 1.5 [ctl_sie_isp_tsk]
[17:00:53:203] 801 2 root DW 0 0.0 0 1.5 [ctl_ipp_isp_tsk]
[17:00:53:204] 503 2 root DW 0 0.0 0 1.5 [iq_tsk]
[17:00:53:205] 711 1 root S 137m 50.4 0 1.4 [Attendance]
[17:00:53:206] 320 2 root DW 0 0.0 0 1.4 [kdrv_ise_proc_t]
[17:00:53:206] 502 2 root DW 0 0.0 0 1.4 [iq_tsk]
[17:00:53:206] 799 2 root DW 0 0.0 0 1.2 [ctl_ipp_buf_tsk]
[17:00:53:207] 916 2 root IW< 0 0.0 0 1.0 [kworker/0:3H-mm]
[17:00:53:207] 915 168 root R 2488 0.8 0 0.8 [top]
[17:00:53:207] 321 2 root DW 0 0.0 0 0.8 [kdrv_ise_cb_tsk]
[17:00:53:208] 797 2 root DW 0 0.0 0 0.7 [ctl_sie_buf_tsk]
[17:00:53:208] 80 2 root IW 0 0.0 0 0.5 [kworker/0:2-eve]
[17:00:53:208] 802 2 root DW 0 0.0 0 0.5 [ctl_ipp_tsk]
[17:20:53:342]Mem: 266212K used, 13040K free, 97084K shrd, 84K buff, 120592K cached (13M still free, but kswapd0 is already reclaiming memory)
[17:20:53:342]CPU: 2.1% usr 91.2% sys 0.0% nic 0.0% idle 0.0% io 0.0% irq 6.5% sirq
[17:20:53:342]Load average: 54.14 48.81 37.92 2/215 943
[17:20:53:342] PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
[17:20:53:343] 45 2 root RW 0 0.0 0 37.6 [kswapd0]
[17:20:53:343] 710 1 root S 810m296.9 0 18.9 [hicore]
[17:20:53:343] 923 2 root DW< 0 0.0 0 11.7 [kworker/0:1H+kb]
[17:20:53:345] 803 2 root DW 0 0.0 0 7.2 [kdf_ipp_tsk]
[17:20:53:345] 798 2 root DW 0 0.0 0 3.9 [ctl_sie_isp_tsk]
[17:20:53:346] 712 709 root S 175m 64.4 0 2.3 [uicore]
[17:20:53:347] 810 2 root DW 0 0.0 0 2.3 [isf_vext_tsk]
[17:20:53:348] 801 2 root DW 0 0.0 0 1.9 [ctl_ipp_isp_tsk]
[17:20:53:348] 502 2 root DW 0 0.0 0 1.8 [iq_tsk]
[17:20:53:348] 503 2 root DW 0 0.0 0 1.8 [iq_tsk]
[17:20:53:348] 320 2 root DW 0 0.0 0 1.2 [kdrv_ise_proc_t]
[17:20:53:348] 799 2 root DW 0 0.0 0 1.2 [ctl_ipp_buf_tsk]
[17:20:53:348] 934 2 root IW< 0 0.0 0 1.2 [kworker/0:0H-mm]
[17:20:53:349] 711 1 root S 137m 50.4 0 1.0 [Attendance]
[17:20:53:349] 943 168 root R 2488 0.8 0 0.9 [top]
[17:20:53:349] 321 2 root DW 0 0.0 0 0.9 [kdrv_ise_cb_tsk]
[17:20:53:349] 797 2 root DW 0 0.0 0 0.7 [ctl_sie_buf_tsk]
[17:20:53:349] 80 2 root IW 0 0.0 0 0.5 [kworker/0:2-eve]
[17:20:53:349] 274 2 root SW 0 0.0 0 0.3 [irq/54-DAI_INT]
3. oom-killer output from one occurrence
[16:54:34:509][ 1909.987571] Mem-Info:
[16:54:34:545][ 1909.995628] active_anon:27778 inactive_anon:23045 isolated_anon:0
[16:54:34:545][ 1909.995628] active_file:100 inactive_file:251 isolated_file:0
[16:54:34:556][ 1909.995628] unevictable:5409 dirty:0 writeback:0 unstable:0
[16:54:34:556][ 1909.995628] slab_reclaimable:2848 slab_unreclaimable:3776
[16:54:34:565][ 1909.995628] mapped:4369 shmem:24223 pagetables:321 bounce:0
[16:54:34:565][ 1909.995628] free:1463 free_pcp:1 free_cma:473
[16:54:34:664][ 1910.125245] Node 0 active_anon:111112kB inactive_anon:92180kB active_file:596kB inactive_file:1040kB unevictable:21636kB isolated(anon):0kB isolated(file):0kB mapped:17476kB dirty:0kB writeback:0kB shmem:96892kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[16:54:34:754][ 1910.215265] Normal free:5852kB min:4096kB low:5120kB high:6144kB active_anon:111112kB inactive_anon:92180kB active_file:116kB inactive_file:936kB unevictable:21636kB writepending:0kB present:303104kB managed:279252kB mlocked:0kB kernel_stack:1712kB pagetables:1284kB bounce:0kB free_pcp:68kB local_pcp:68kB free_cma:1892kB
[16:54:34:847][ 1910.325348] lowmem_reserve[]: 0 0 0
[16:54:34:870][ 1910.338165] Normal: 182*4kB (UMHC) 116*8kB (UMHC) 34*16kB (UMHC) 22*32kB (UMHC) 9*64kB (UHC) 6*128kB (MHC) 1*256kB (C) 1*512kB (M) 1*1024kB (M) 0*2048kB 0*4096kB = 6040kB
[16:54:34:920][ 1910.395461] 30069 total pagecache pages
[16:54:34:945][ 1910.420033] 75776 pages RAM
[16:54:34:951][ 1910.426929] 0 pages HighMem/MovableOnly
[16:54:34:986][ 1910.459569] 5963 pages reserved
[16:54:35:000][ 1910.470192] 1024 pages cma reserved
[16:54:35:015][ 1910.492102] Tasks state (memory values in pages):
[16:54:35:042][ 1910.515510] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[16:54:35:074][ 1910.550060] [ 134] 0 134 444 306 8192 0 -17 udevd
[16:54:35:112][ 1910.586591] [ 168] 0 168 622 427 10240 0 0 sh
[16:54:35:164][ 1910.635600] [ 614] 0 614 6162 355 16384 0 0 bsp_log4j
[16:54:35:208][ 1910.675376] [ 703] 0 703 8418 814 24576 0 0 log4j
[16:54:35:239][ 1910.705851] [ 717] 0 717 590 324 12288 0 0 exe
[16:54:35:268][ 1910.739982] [ 718] 0 718 207354 22887 444416 0 0 hicore
[16:54:35:304][ 1910.778779] [ 719] 0 719 35211 3135 53248 0 0 Attendance
[16:54:35:340][ 1910.814456] [ 720] 0 720 46546 5399 75776 0 0 uicore
[16:54:35:420][ 1910.859144] Out of memory: Kill process 718 (hicore) score 329 or sacrifice child
[16:54:35:437][1912.584] [DSP] WARN|md_drv.c|HikMd_drvProc|782: chan 0,last 1908210 ms,cur 1912460 ms,inter 4250 ms
[16:54:35:437][ 1910.890210] Killed process 718 (hicore) total-vm:829416kB, anon-rss:77604kB, file-rss:3476kB, shmem-rss:10192kB
[16:54:35:519] total used free shared buff/cache available
[16:54:35:525]Mem: 279252 84732 75292 96892 119228 75496
[16:54:35:546]Swap: [ 1910.997899] oom_reaper: reaped process 718 (hicore), now anon-rss:0kB, file-rss:0kB, shmem-rss:8kB
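One way to read the `Normal free:...` line of this OOM report is to separate free CMA memory from the rest. The sketch below assumes that the watermark check does not count free CMA pages for allocations that are not allowed to use CMA. Mainline Linux behaves this way, but it has not been verified for this vendor kernel. The helper name `usable_free_kb` is hypothetical, and the numbers are copied from the log.

```python
# Values in kB copied from the "Normal free:..." line of the OOM report.
normal = {"free": 5852, "min": 4096, "low": 5120, "high": 6144, "free_cma": 1892}

def usable_free_kb(zone, can_use_cma):
    """Free memory an allocation can count on.

    Assumption: free CMA pages are excluded when the allocation
    is not allowed to be placed in the CMA area.
    """
    if can_use_cma:
        return zone["free"]
    return zone["free"] - zone["free_cma"]

with_cma = usable_free_kb(normal, can_use_cma=True)
without_cma = usable_free_kb(normal, can_use_cma=False)
below_min = without_cma < normal["min"]

print(with_cma, without_cma, below_min)
```

Under that assumption, the free memory outside CMA is below `min` even though the total free figure is above it. That would be one possible explanation for an OOM that appears to happen above the watermark, and it is worth checking against the kernel source used on this platform.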