I am planning to use Descheduler in my AKS deployment to balance memory consumption across the AKS nodes. The current output of kubectl top nodes is:

NAME                                CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
aks-nodepool1-53884836-vmss000000   198m         10%    12317Mi         97%
aks-nodepool1-53884836-vmss000001   189m         9%     12952Mi         102%      
aks-nodepool1-53884836-vmss000002   213m         11%    12747Mi         101%      
aks-nodepool1-53884836-vmss000003   135m         7%     5970Mi          47%    

However, when I tried different scenarios in Descheduler, I got the following output:

I0612 13:51:55.145678       1 nodeutilization.go:204] "Node is underutilized" node="aks-nodepool1-53884836-vmss000003" usage={"cpu":"810m","memory":"476Mi","pods":"27"} usagePercentage={"cpu":42.63,"memory":3.78,"pods":10.8}
I0612 13:51:55.145712       1 nodeutilization.go:204] "Node is underutilized" node="aks-nodepool1-53884836-vmss000000" usage={"cpu":"582m","memory":"501Mi","pods":"54"} usagePercentage={"cpu":30.63,"memory":3.98,"pods":21.6}
I0612 13:51:55.145725       1 nodeutilization.go:204] "Node is underutilized" node="aks-nodepool1-53884836-vmss000001" usage={"cpu":"950m","memory":"596Mi","pods":"61"} usagePercentage={"cpu":50,"memory":4.74,"pods":24.4}
I0612 13:51:55.145743       1 nodeutilization.go:210] "Node is appropriately utilized" node="aks-nodepool1-53884836-vmss000002" usage={"cpu":"962m","memory":"647Mi","pods":"56"} usagePercentage={"cpu":50.63,"memory":5.14,"pods":22.4}

As you can see, the utilization seen by Descheduler is drastically different from what top reports. This is especially true for memory: Descheduler shows barely more than 5% utilization on any node, whereas top reports 47% and above.

When I describe the node, I see the utilization of all the custom pods shown as 0:

  Namespace                   Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                                               ------------  ----------  ---------------  -------------  ---
  default                     alerts-667b7bc-88djq                                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     alerts-ag-5544b98c45-xjnss                                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     api-6db9645d8b-p6jqm                                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     authentication-766496cf6b-js77v                                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     authentication-ag-585fdf767nsp6                                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     authentication-validator-76b444457c-6x66x                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     authorization-checker-5789b576ff-wssl2                                0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     authorization-78f759f849-xmk2p                                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     backups-agent-68f47f764c-vpmlh                                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d
  default                     backups-f7d6c765d-qsl8v                                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         32d

....

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                582m (30%)  4647m (244%)
  memory             501Mi (3%)  3657Mi (29%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)

The allocated resources from describe node seem to match what Descheduler is seeing, given that the utilization of all the custom pods is shown as 0%. Getting top for one of the pods, e.g. kubectl top pods alerts-667b7bc-88djq, gives:

NAME                            CPU(cores)   MEMORY(bytes)   
alerts-667b7bc-88djq             2m           108Mi           

PodMetrics seems to agree with this. Output of kubectl describe PodMetrics alerts-667b7bc-88djq:

API Version:  metrics.k8s.io/v1beta1
Containers:
  Name:  alerts
  Usage:
    Cpu:     1268872n
    Memory:  111272Ki
Kind:        PodMetrics
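
As a quick sanity check on the units, here is a minimal sketch that converts the PodMetrics values above into the units top uses. The helper names are made up for illustration; only the two input numbers come from the output above.

```python
def ki_to_mi(kibibytes: int) -> float:
    """Convert a Ki quantity (kibibytes) to Mi (mebibytes)."""
    return kibibytes / 1024


def nanocores_to_millicores(nanocores: int) -> float:
    """Convert an 'n' CPU quantity (nanocores) to 'm' (millicores)."""
    return nanocores / 1_000_000


# Values copied from the PodMetrics output for the alerts container.
memory_mi = ki_to_mi(111272)
cpu_m = nanocores_to_millicores(1268872)

print(f"memory: {memory_mi:.2f}Mi, cpu: {cpu_m:.2f}m")
```

Both figures land close to the 108Mi and 2m that top shows for the same pod, allowing for rounding in top's display.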

Any help understanding what's going on here? Why is describe node failing to register any resource utilization (and, subsequently, Descheduler reporting the same), whereas top nodes presents a totally different picture?

1 Answer

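So it turns out that kubectl describe node and Descheduler do not look at measured usage here. They work from the resource requests and limits declared in each pod spec, so a pod that declares none is counted as 0 no matter how much it actually consumes (which is what top reports). Defining resources for each container fixes this: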

spec:
  containers:
  - name: qos-demo-ctr
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: "700m"
      requests:
        memory: "200Mi"
        cpu: "700m"

A resources section with requests and limits is required for a pod's consumption to show up in the describe node totals and in the utilization that Descheduler computes.
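
To illustrate the accounting, here is a minimal sketch of request-based utilization, not Descheduler's actual code. The Pod class and function name are hypothetical, and the allocatable memory of 12588Mi is an assumption back-calculated from the log line that reports 501Mi as 3.98%; the question does not state it.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    memory_request_mi: int    # 0 when the pod spec declares no resources.requests
    memory_usage_mi: int = 0  # measured usage; ignored by request-based accounting


def request_based_pct(pods, allocatable_mi):
    """Percentage of allocatable memory that is requested by the given pods."""
    return 100.0 * sum(p.memory_request_mi for p in pods) / allocatable_mi


# Assumption: back-calculated from "501Mi" == 3.98% in the Descheduler log.
ALLOCATABLE_MI = 12588

# The alerts pod from the question: no requests declared, 108Mi measured by top.
alerts = Pod("alerts-667b7bc-88djq", memory_request_mi=0, memory_usage_mi=108)

# Hypothetical aggregate standing in for the pods that do declare requests
# (501Mi in total according to the describe node output).
others = Pod("pods-with-requests", memory_request_mi=501)

before = request_based_pct([alerts, others], ALLOCATABLE_MI)

# The same pod after adding the 200Mi memory request from the spec above.
alerts_fixed = Pod(alerts.name, memory_request_mi=200, memory_usage_mi=108)
after = request_based_pct([alerts_fixed, others], ALLOCATABLE_MI)

print(f"before: {before:.2f}%  after: {after:.2f}%")
```

The alerts pod contributes nothing to the first figure even though it is using 108Mi, and only starts to count once a request is declared.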

Reference: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/
