Red Hat 7.9 server with 512 GB RAM.
We often get alerts about swap being full: swap usage is frequently at 99%. Our server admin told us it is normal for Linux to have swap 100% used. There is no way to check the real RAM consumption.
Memory usage as reported by sar is always around 99% on our server:
Memory & Swap
=============
Date        kbmemfree  kbmemused  %memused  kbbuffers   kbcached   kbcommit  %commit   kbactive    kbinact  kbdirty
01/11/2024    9869583  517833049     98.13        977  285023790  319073740    58.60  340613912  162678784   614843
01/12/2024    4181004  523521628     99.21       1349  287757937  323767076    59.46  346046454  162168337  1115633
01/13/2024    2567787  525134845     99.51        844  285755715  327744180    60.19  352144911  157031086   654493
01/14/2024    2827695  524874937     99.46        844  285135562  328070493    60.25  352003644  156742082   742178
01/15/2024    3482087  524220545     99.34       1838  280133083  332837943    61.13  353676271  153986907   998373
01/16/2024    2152990  525549642     99.59        839  273578756  342252974    62.86  362157756  147013751  1136099
01/17/2024    4575639  523126993     99.13       2418  271393967  340334778    62.51  355987093  150884756  1033531
01/18/2024    2205445  525497187     99.58       2413  282078831  328216144    60.28  353148916  156770066   625021
01/19/2024    9451354  518251278     98.21       1542  293648210  305176716    56.05  352495156  150264497   680464
On the web I read that high swap usage is not normal. I also noticed that after a reboot, swap usage stays very low for several days. When a process consumes all the RAM (a mistake on our side), swap usage increases to 100% and then never decreases, even after the offending process is killed. This is why servers that have been up for weeks show 100% swap used.
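One way to see which processes still hold swap after such an incident is to read the VmSwap field of /proc/<pid>/status. The following is only a sketch, assuming the kernel exposes VmSwap (I believe the 3.10 kernel does); the function names are made up for illustration. Pages of other processes that were pushed out while RAM was exhausted stay in swap until they are accessed again, which would be consistent with swap remaining full after the culprit is killed.

```python
import os


def vmswap_kb(status_text):
    """Return the VmSwap value (kB) from /proc/<pid>/status text, or 0 if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmSwap line


def top_swap_users(status_by_pid, count=10):
    """Return [(pid, swap_kb), ...] for the processes holding the most swap."""
    usage = [(vmswap_kb(text), pid) for pid, text in status_by_pid.items()]
    usage.sort(reverse=True)
    return [(pid, kb) for kb, pid in usage[:count] if kb > 0]


def read_all_status(proc_root="/proc"):
    """Collect the status text of every numeric entry under /proc."""
    status_by_pid = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                status_by_pid[int(entry)] = f.read()
        except (IOError, OSError):
            continue  # process exited in the meantime or is not readable
    return status_by_pid
```

Calling top_swap_users(read_all_status()) as root should list the biggest swap holders; as a normal user some processes may not be readable.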
I was told to monitor swap activity using sar (pswpin/s and pswpout/s).
When swap usage is 100% there may be no problem at all, but when processes start being killed because of RAM shortage, I see high values for pswpin/s and pswpout/s (sar -W).
Over a week, I can monitor this activity to check whether we had RAM problems or not.
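The same swap-in/swap-out rates can also be derived outside of sar, from the pswpin and pswpout counters in /proc/vmstat. Below is a minimal sketch of that idea, assuming two snapshots of the file taken a known number of seconds apart; the function names and the 60-second interval are illustrative only.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, interval_seconds):
    """Return (pages swapped in per second, pages swapped out per second)."""
    seconds = float(interval_seconds)
    swapped_in = (after["pswpin"] - before["pswpin"]) / seconds
    swapped_out = (after["pswpout"] - before["pswpout"]) / seconds
    return swapped_in, swapped_out
```

A rate that stays near zero while swap is "full" would suggest the swapped pages are simply idle; sustained non-zero rates are the sign of real memory pressure.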
My problem is the following: how can I prevent the RAM problem from occurring? What can I use to check the percentage of RAM actually used (given that it is always 99% in sar)? How can I get the real value, as in Windows? The goal is to be able to kill the process that starts taking all the RAM.
I would like to generate a warning when RAM usage is 80%.
I know that free -h could be used, but I don't know how to interpret its output. The same goes for top.
For instance, when I compare the sar output with the free -h and top output, the values don't match... :-(
[XXXXX@YYYYYY ~]$ sar -r
Linux 3.10.0-1062.4.1.el7.x86_64 (XXXXXX) 02/02/2024 _x86_64_ (64 CPU)
10:30:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
11:55:01 AM 7916396 519786232 98.50 4688 436720908 164442944 30.20 256757744 249531760 1548
12:00:01 PM 1106680 526595948 99.79 4688 436883120 173459020 31.86 263356476 249699552 4184
12:05:01 PM 742056 526960572 99.86 4688 436447380 173668428 31.90 264216776 249402376 5380
12:10:01 PM 11076780 516625848 97.90 4688 434763392 162944064 29.93 255501116 247835888 3732
12:15:01 PM 7891084 519811544 98.50 4688 434921220 165981024 30.48 258656448 247885164 600
Average: 2201518 525501110 99.58 5447 448667788 154099963 28.30 257069976 255388912 3431
[XXXXX@YYYYYY ~]$ free -h
total used free shared buff/cache available
Mem: 503G 81G 4.3G 1.3G 417G 419G
Swap: 15G 11G 4.4G
[XXXXX@YYYYYY ~]$ top
top - 12:22:09 up 9 days, 18:55, 48 users, load average: 4.21, 4.69, 5.06
Tasks: 2645 total, 5 running, 2488 sleeping, 0 stopped, 152 zombie
%Cpu(s): 6.8 us, 3.4 sy, 0.0 ni, 89.8 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 52770262+total, 6888876 free, 83310656 used, 43750310+buff/cache
KiB Swap: 16777212 total, 4582892 free, 12194320 used. 44195168+avail Mem
I don't know which values to check in top or free -h to get the real percentage of RAM in use.
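A common reading of these numbers is that the "available" column of free (MemAvailable in /proc/meminfo) estimates how much RAM can still be handed to applications without swapping, because page cache is reclaimable. In the sar output above, kbmemfree + kbmemused adds up to the total shown by top, so %memused appears to count the cache as used. With the free -h figures above, (503G - 419G) / 503G is roughly 17% really in use. The sketch below applies that formula and flags the 80% threshold; it assumes the kernel provides MemAvailable, and the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def ram_used_percent(meminfo):
    """Percent of RAM in use, treating reclaimable cache as free."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]  # assumed present on this kernel
    return 100.0 * (total - available) / total


def check_ram(text, threshold=80.0):
    """Return (percent_used, warn); warn is True at or above the threshold."""
    percent = ram_used_percent(parse_meminfo(text))
    return percent, percent >= threshold
```

For example, check_ram(open("/proc/meminfo").read()) run periodically (from cron, say) could feed the warning.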
Thanks a lot for helping.
Run the following when your swap gets full, and add the output to the question:
top -b -o %MEM -n 1 | head -30