All Questions

0 votes
0 answers
28 views

How to monitor multiple kubernetes clusters using grafana and prometheus from a separate cluster?

I have three Kubernetes clusters (monitoring, one, two). The monitoring cluster is dedicated to monitoring tools, so I will install Prometheus and Grafana on it. On other cluster should ...
Mysterio's user avatar
0 votes
0 answers
35 views

Kubelet/Cadvisor on GKE not exporting container_fs_* metrics for attached volumes

In our GKE 1.27.12 cluster, we run a couple of stateful workloads using GCP Volumes, e.g. using this storage class: apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: pd-ssd ...
antaxify's user avatar
  • 123
0 votes
0 answers
28 views

Error configuring node-exporter DaemonSet scraping for Prometheus on kubernetes - service account token missing

I have set up a node-exporter DaemonSet on kubernetes as well as a service that points to these node-exporter pods IPs (I followed this tutorial). When I run kubectl get endpoints -n monitoring, I ...
Paula Gouveia's user avatar
0 votes
1 answer
39 views

SOLVED - Error configuring node-exporter DaemonSet scraping for Prometheus on kubernetes

I am posting the following question already solved, because I mistakenly posted it on Stack Overflow and therefore wanted to share it here so it can be properly found by the community and hopefully ...
Paula Gouveia's user avatar
0 votes
0 answers
26 views

ServiceMonitor Relabeling EC2 Instance Tags Not Working in Kube-Prometheus-Stack

I'm using the kube-prometheus-stack Helm chart to deploy Prometheus and configure ServiceMonitors for monitoring EC2 instances with the Node Exporter. I have configured the ServiceMonitor to relabel ...
TomerA's user avatar
  • 1
0 votes
0 answers
23 views

Setting up Prometheus on Azure Kubernetes Cluster

I am setting up Prometheus on a production AKS cluster. The app deployment on this cluster is exposed using an nginx ingress behind a load balancer and a firewall device. How do I access Prometheus ...
sakshi's user avatar
  • 1
0 votes
0 answers
18 views

“Annotations.runbook” doesn't generate a link in a Slack message

In my k8s cluster I deployed kube-prometheus-stack, and it is configured to send Alertmanager notifications to Slack. For some unknown reason, the Runbook icon is inactive and nothing happens when you click ...
User.354125's user avatar
0 votes
1 answer
162 views

How do we configure prometheus server to scrape metrics from a pod with Istio sidecar proxy?

A service pod is running with an Istio sidecar container and has mTLS enabled. How do we define a ServiceMonitor to scrape metrics from this service? Do we need to update the Prometheus server for the ...
Nipun Talukdar's user avatar
0 votes
0 answers
41 views

kubelet_volume_* metrics not showing all PVCs

I use Prometheus to monitor my Kubernetes cluster. I have 42 PVCs Bound to 42 PVs. For some reason, kubelet_volume_* return information about only 29 of them. I looked everywhere and I could not find ...
Mohamed Aziz Tousli's user avatar
1 vote
0 answers
47 views

Prometheus CPU consumption after remote_write is enabled

I have problems figuring out why my Prometheus instance starts to chew a lot of CPU after I enable remote_write feature. I have deployed a prometheus and grafana from chart kube-prometheus-stack, ...
Antonio Soldo's user avatar
0 votes
0 answers
92 views

Kubernetes monitoring with Prometheus

I need a little bit of help here. I have a Kubernetes cluster up and running and I have a dedicated machine for monitoring with Prometheus running on it. I already have node exporters running and ...
skylar's user avatar
  • 1
0 votes
1 answer
379 views

How to monitor multiple Kubernetes clusters using single Grafana?

I would like to use a single Grafana instance to monitor multiple Kubernetes clusters (pod resource consumption, RabbitMQ queue info) provided by Prometheus. I have two Kubernetes clusters - one used ...
FN_'s user avatar
  • 273
0 votes
0 answers
114 views

Monitor RabbitMQ with Prometheus outside kubernetes cluster

I'm trying to monitor a RabbitMQ set up in a Kubernetes cluster by the Bitnami Helm chart with 3 pods from a Prometheus outside the cluster, and I don't understand how that should be possible. On the ...
BMaehr's user avatar
  • 1
0 votes
1 answer
3k views

Kubernetes - Find per core statistics for the pod

I would want to find the per core usage statistics for my Kubernetes pod. In my Linux host/in the Kubernetes node, I use mpstat to find the statistics like below. In my case, I assign 2.5 CPUs to the ...
Vishal Raghavan's user avatar
0 votes
0 answers
205 views

How to track PersistentVolume usage with prometheus for non gp2 storage class?

I am running a Kubernetes cluster on an Ubuntu node. I have created a persistent volume (storage class: openebs-hostpath) and corresponding claim. I want to track how much of the claim is being used by ...
sachinks's user avatar
0 votes
0 answers
334 views

Auto-Instrumentation of application using OpenTelemetry

I have an AKS cluster where I am running a test Python-Django based web application. I also have Grafana and Prometheus configured. I need to use OpenTelemetry to get the metrics data from the test ...
arjunbnair's user avatar
0 votes
0 answers
252 views

OpenTelemetry Collector Data not being fetched by Prometheus in Grafana

I have a requirement where I have some container workload in Azure AKS cluster and I need to use OpenTelemetry to gather data like metrics, logs and traces. I also have Grafana as the visualisation ...
arjunbnair's user avatar
0 votes
0 answers
178 views

OpenTelemetry K8s Operator Collector - Exporter Configuration for Prometheus

I have some container workloads in Azure AKS cluster. I need to use OpenTelemetry to get the metrics, logs and trace data from the container workload and get it collected by the OTEL collector. I have ...
arjunbnair's user avatar
0 votes
0 answers
128 views

Spot instances sometimes slow down and lose connection

I have a system deployed in AWS EKS. Sometimes the spot instances' metrics are down, and API calls to these nodes are very slow. Here is my system: 1 EKS cluster 1 on-demand node group 1 Karpenter v0.29.2 ...
Tristan's user avatar
  • 21
0 votes
0 answers
342 views

Why Kube state metrics only shows metrics related to the namespace where it is running?

I have AWS EKS cluster with kube-state-metrics installed in a namespace called "monitoring". This installation is using service monitor and other components (see yaml files below). In this ...
Thiago Scodeler's user avatar
0 votes
0 answers
185 views

How to add configuration for fluent-plugin-prometheus in Fluentd deployed via Fleet in Rancher?

I'm using Rancher to manage my Kubernetes cluster and have added a logging system (cattle-logging-system) via Fleet. I now need to add monitoring for Fluentd using the fluent-plugin-prometheus. Here's ...
Maksim Karibov's user avatar
0 votes
0 answers
299 views

Thanos Receiver not deleting old data in Persistent Volume (PV) after retention is exceeded

I have set up Kube Prometheus Stack with Thanos on my Kubernetes cluster, and I'm using the Thanos Receiver instead of the sidecar approach. I have also configured the Thanos Compactor and Minio for ...
dasunNimantha's user avatar
0 votes
0 answers
214 views

Spike in Cadvisor container_network_receive_bytes_total Metric in a Kubernetes Cluster

Summary: I'm using Cadvisor with Prometheus in multiple Kubernetes (k8s) clusters to monitor network traffic usage. I utilize the container_network_receive_bytes_total metric in a query to calculate ...
Hesam Norin's user avatar
0 votes
0 answers
224 views

Prometheus Server Pod Suddenly Crashed (unexpected fault address 0x7f911b1795d4)

Under traffic, the Prometheus server pod keeps restarting with the error stack below. This happens under the live traffic of the system, but we have not been able to reproduce it with load testing. Grafana ...
Sidath Weerasinghe's user avatar
1 vote
1 answer
47 views

How to safely update an existing Kubernetes server without original configurations

I was handed a Kubernetes cluster with no config files, and was not setup with helm. The author said they just created everything from the cmd line. It is a small/new cluster for a single API server ...
Supernat's user avatar
0 votes
0 answers
77 views

Restrict access to a Prometheus server in AKS can only be achieved with nginx-ingress?

Prometheus server with its respective LoadBalancer in AKS. I wanted to secure access to /metrics through network rules... but it doesn't work. I can still access the endpoint from any device. ...
Wadjet's user avatar
  • 1
0 votes
0 answers
272 views

How can I know request waiting time in Nginx Ingress Controller?

We use Kubernetes with Nginx Ingress Controller to run our platform with various backend services. We also use New Relic (& Prometheus, Grafana) for our Observability dashboards & alerts. ...
Raman Kishore's user avatar
0 votes
1 answer
246 views

prometheus-operator when configuring alertmanager config for PagerDuty

I have the following issue: when I try to set up the Alertmanager configuration via CRD, I get the wrong configuration on the pod. The problem looks like: - routing_key: | ***** routing_key parameter after all CD use ...
Georgy Potapov's user avatar
0 votes
0 answers
240 views

kube-prometheus-stack redundancy across multiple clusters

I currently use kube-prometheus-stack to monitor several kubernetes clusters. Each cluster has its own deployment of the kube-prometheus-stack, however, there is currently only one cluster (a) that ...
I. Shm's user avatar
  • 31
3 votes
2 answers
2k views

Debugging Prometheus OOMkilled despite 6Gi limits

I'm at the end of my patience with a prometheus setup leveraging kube-prometheus-stack 44.3.0 (latest being 45). I have two environments, staging and prod. In staging, my prometheus runs smoothly. In ...
Liquid's user avatar
  • 141
0 votes
1 answer
3k views

kube-api server high cpu

I want to know how I can check why one of my ctrl node and kubernetes consumes more cpu than the others. I have a cluster with 3 ctrl nodes and 4 worker nodes. I have an nginx load balancer with the ...
0 votes
1 answer
2k views

Kubernetes upgrade from 1.21 to 1.22 caused Prometheus to fail

We recently upgraded Kubernetes from version 1.21 to 1.22 on AWS EKS. The upgrade was successful. However, the associated Prometheus deployment fails with an error: $ kubectl -n monitoring logs prometheus-...
vijaya lakshmi's user avatar
1 vote
1 answer
4k views

Prometheus: Add insecure_skip_verify via annotation or scrape config adaption for kubernetes pods

I'm running a Kubernetes cluster with a deployment of some pods. One pod provides metrics on an HTTPS-secured endpoint. The problem is that this pod creates and uses its own self-signed certificate and ...
Volker Raschek's user avatar
1 vote
2 answers
4k views

Grafana pod is not running, how to fix that?

I have deployed grafana in eks using the steps provided in this link After deployment of grafana, the pod is not in running state. kubectl get po -n grafana NAME READY STATUS ...
user2331760's user avatar
1 vote
1 answer
5k views

Alertmanager Telegram config chat_id and "cannot unmarshal" error

I am trying to configure alertmanager to send alerts to my telegram group. Following the configuration I have: global: resolve_timeout: 5m route: group_by: - job group_interval: 5m ...
Jose's user avatar
  • 21
0 votes
1 answer
1k views

webhook MS Teams integration with Prometheus - request failed

I'm struggling with Microsoft Teams/Prometheus integration on K8s cluster. I used helm to start all components. I have correctly working Prometheus and Alertmanager. It seems that all works fine. ...
RedBluff's user avatar
0 votes
0 answers
642 views

microk8s: pod resource usage metrics not available from all nodes

I am running microk8s v1.22/stable on a Linux cluster with 11 nodes. I have enabled the metrics-server plugin and installed Prometheus via the Helm chart with nodeExporter and kubeStateMetrics enabled....
mhusaini's user avatar
  • 101
3 votes
1 answer
2k views

Kubernetes Nginx Ingress Controller Metrics

I've tried to find a documentation about the metrics exposed by the NGINX ingress controller in Kubernetes but so far I haven't found any reliable source about the metrics and what they mean. For ...
MysteriousPerson's user avatar
0 votes
1 answer
78 views

Better way at scale to pull image uris from all pod specs on k8s cluster

Team, I see this list-all-running-container-images page: https://kubernetes.io/docs/tasks/access-application-cluster/list-all-running-container-images/. However, I cannot bank operations on this because it ...
AhmFM's user avatar
  • 119
2 votes
1 answer
2k views

Why K8S statefulsets volumeClaimTemplates status is pending , but the pod, pvc, pv are all fine?

I use nfs-subdir-external-provisioner as the automatic PV provisioner for my Prometheus (deployed by prometheus-operator). I have created our sts, pod, pvc and pv successfully and everything looks fine. But if I use ...
Jeffery's user avatar
  • 23
1 vote
2 answers
2k views

Prometheus not connected to alert manager in GKE

I installed kube-prometheus-stack 15.3.1 into a GKE cluster using helm (in "monitoring" namespace). I used the values.yaml to open up ingresses on some of the components and to add SMTP info ...
Toby 1 Kenobi's user avatar
-1 votes
1 answer
726 views

How to track down this spike in Disk IO every 5 minutes on Ubuntu Server / microk8s

I've set up an Ubuntu Server with microk8s, with the dns, dashboard and prometheus addons. It's running some Cardano nodes. On the (built-in) Grafana dashboard "Default / Nodes" I see spikes ...
Danny Tuppeny's user avatar
2 votes
1 answer
158 views

Spikes on External Metric scales HPA when it shouldn't

I have a metric that I'm using for an HPA. The problem is that the metric has spikes, and to avoid them I'm using an average over time in a recording rule on Prometheus, but to export it to Stackdriver (on ...
José Pedro Machado's user avatar
1 vote
1 answer
2k views

Prometheus auto scrape metrics from multiple kube-state-metrics in kubernetes?

I want to use one Kubernetes cluster (cluster-0) with multiple kube-state-metrics instances to monitor multiple other Kubernetes clusters (cluster-1, 2, 3, 4). In cluster-0, I split into multiple namespaces like this: ...
Lê Minh Quân's user avatar
5 votes
1 answer
6k views

Prometheus - Use case of service discovery by role endpoints and role pod in Kubernetes

While reading Prometheus Configuration documentation and some sample scrape configurations, I found some kubernetes_sd_configs with role service & role endpoints & role pod - job_name: kube-...
jiangwei's user avatar
2 votes
1 answer
458 views

Pod using Vernemq helm package cannot start

I'm using Helm to install VerneMQ on my Kubernetes cluster. The problem is that it can't start, even though I accepted the EULA. Here is the log: 02:31:56.552 [error] CRASH REPORT Process <0.195.0> with 0 ...
Lê Minh Quân's user avatar
2 votes
2 answers
8k views

What does it mean to have more than one instance of Prometheus in Kubernetes

Suppose I'm using a volume to persist my Prometheus data, I wonder if I can have more than one instance of it running to have high availability. I believe only one instance of Prometheus must be in ...
Ali Tou's user avatar
  • 121
1 vote
2 answers
1k views

Missing metrics for "kubelet_volume_*" in Prometheus

I set up the latest https://github.com/coreos/kube-prometheus/ in an AWS EKS cluster in which I'm using the Amazon EBS CSI driver for persistent volume claims, but I don't see any "kubelet_volume_*" ...
Catalin's user avatar
  • 21
1 vote
0 answers
448 views

kube-state-metrics doesn't show hardware utilization

I installed these YAML files: https://github.com/kubernetes/kube-state-metrics/tree/master/examples/standard. I can see per-pod CPU utilization only for system pods such as: calico-node, coredns, ...
user227685's user avatar
2 votes
1 answer
257 views

How to trigger alerts in Prometheus when specific users login to OpenShift or Kubernetes?

Using either kube_state_metrics or anything else I'd like to fire alerts in Prometheus AlertManager when a specific user logs in to the cluster, ie. kubeadmin or bob-smith. Or in other words: where in ...
funix's user avatar
  • 21