I created a private EKS cluster using the AWS Console, then followed the documentation to configure Fargate. When I finished, I could see my Fargate nodes under the Compute tab of the cluster in the AWS Console, but the CoreDNS pods running on those nodes fail to pull their image:

$ kubectl get pods -n kube-system
NAME                       READY   STATUS             RESTARTS   AGE
coredns-58488c5db-j8l9f    0/1     Pending            0          2d2h
coredns-7c969d8cd7-7xztr   0/1     ImagePullBackOff   0          3h30m
coredns-7c969d8cd7-mh6z2   0/1     ImagePullBackOff   0          3h30m

Another thing I noticed is that the kube-dns service in the kube-system namespace has no endpoints:

$ kubectl -n kube-system get endpoints kube-dns
NAME       ENDPOINTS   AGE
kube-dns               2d2h

Here is a description of one of the pods:

Containers:
  coredns:
    Container ID:
    Image:         602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2
    Image ID:
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8080/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9mhnd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  kube-api-access-9mhnd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 CriticalAddonsOnly op=Exists
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Warning  Failed   45m (x5 over 4h40m)    kubelet  Failed to pull image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to pull and unpack image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to resolve reference "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to do request: Head "https://602401143452.dkr.ecr.us-east-1.amazonaws.com/v2/eks/coredns/manifests/v1.10.1-eksbuild.2": dial tcp 54.211.105.2:443: i/o timeout
  Warning  Failed   25m (x10 over 4h42m)   kubelet  Failed to pull image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to pull and unpack image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to resolve reference "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to do request: Head "https://602401143452.dkr.ecr.us-east-1.amazonaws.com/v2/eks/coredns/manifests/v1.10.1-eksbuild.2": dial tcp 34.198.77.233:443: i/o timeout
  Normal   BackOff  10m (x915 over 4h43m)  kubelet  Back-off pulling image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2"
  Warning  Failed   5m9s (x10 over 4h34m)  kubelet  Failed to pull image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to pull and unpack image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to resolve reference "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2": failed to do request: Head "https://602401143452.dkr.ecr.us-east-1.amazonaws.com/v2/eks/coredns/manifests/v1.10.1-eksbuild.2": dial tcp 52.207.2.251:443: i/o timeout
  Normal   Pulling  5s (x47 over 4h45m)    kubelet  Pulling image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.10.1-eksbuild.2"

What could be happening? I don't want to create another cluster, as suggested in https://stackoverflow.com/questions/66996306/aws-eks-fargate-coredns-imagepullbackoff

1 Answer

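The error message shows that the Fargate pods cannot reach the ECR endpoint to pull the CoreDNS image: every attempt ends in an i/o timeout while dialing port 443. Check the following:

  • The subnets associated with the Fargate profile have a route to ECR, either through a NAT gateway, a transit gateway, or VPC endpoints. In a private cluster with no outbound internet access, that normally means interface endpoints for ecr.api and ecr.dkr plus a gateway endpoint for S3, because ECR stores image layers in S3.
  • The security group associated with the Fargate CoreDNS pods (by default, the cluster security group) allows outbound traffic on port 443. If you use VPC endpoints, the endpoint security group must also allow inbound traffic on port 443 from the pods.
  • The ingress and egress rules of your network access control lists (ACLs) allow traffic to and from ECR.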
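
As a quick first check, the addresses in the "dial tcp" errors hint at which path the pull is taking. With an ecr.dkr interface endpoint that has private DNS enabled, the registry hostname would normally resolve to private addresses inside your VPC; public addresses suggest the pull is going to the public ECR endpoint, which needs a NAT or other internet path. The sketch below is only an illustration: the function names classify_ipv4 and classify_dial_targets are made up for this answer, and "private" here means the RFC 1918 ranges only, which is a simplification.

```shell
# Classify an IPv4 address as "private" (RFC 1918 ranges only) or "public".
# Hypothetical helper; runs in a subshell so its variables do not leak.
classify_ipv4() (
  a=${1%%.*}          # first octet
  rest=${1#*.}
  b=${rest%%.*}       # second octet
  if [ "$a" -eq 10 ] \
     || { [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; } \
     || { [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; }; then
    echo private
  else
    echo public
  fi
)

# Read event text on stdin and print each distinct "dial tcp" target
# address followed by its classification. Hypothetical helper.
classify_dial_targets() {
  grep -oE 'dial tcp [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
    | awk '{print $3}' \
    | sort -u \
    | while read -r ip; do
        printf '%s %s\n' "$ip" "$(classify_ipv4 "$ip")"
      done
}
```

For example, you could pipe the pod description into it with kubectl -n kube-system describe pod coredns-7c969d8cd7-7xztr | classify_dial_targets. The three addresses in the events above (54.211.105.2, 34.198.77.233 and 52.207.2.251) are all public, which is consistent with a missing or misconfigured ECR VPC endpoint rather than a problem with the image itself.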