All Questions
Tagged with kubernetes prometheus
60
questions
0
votes
0
answers
28
views
How to monitor multiple kubernetes clusters using grafana and prometheus from a separate cluster?
I have three Kubernetes clusters (monitoring, one, two). The monitoring cluster is dedicated to monitoring tools, so I will install Prometheus and Grafana on it. On other cluster should ...
0
votes
0
answers
35
views
Kubelet/Cadvisor on GKE not exporting container_fs_* metrics for attached volumes
In our GKE 1.27.12 cluster, we run a couple of stateful workloads using GCP Volumes, e.g. using this storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-ssd
...
0
votes
0
answers
28
views
Error configuring node-exporter DaemonSet scraping for Prometheus on kubernetes - service account token missing
I have set up a node-exporter DaemonSet on Kubernetes as well as a service that points to these node-exporter pods' IPs (I followed this tutorial). When I run kubectl get endpoints -n monitoring, I ...
0
votes
1
answer
39
views
SOLVED - Error configuring node-exporter DaemonSet scraping for Prometheus on kubernetes
I am posting this already-solved question because I mistakenly posted it on Stack Overflow and therefore wanted to share it here so it can be properly found by the community and hopefully ...
0
votes
0
answers
26
views
ServiceMonitor Relabeling EC2 Instance Tags Not Working in Kube-Prometheus-Stack
I'm using the kube-prometheus-stack Helm chart to deploy Prometheus and configure ServiceMonitors for monitoring EC2 instances with the Node Exporter. I have configured the ServiceMonitor to relabel ...
0
votes
0
answers
23
views
Setting up Prometheus on Azure Kubernetes Cluster
I am setting up Prometheus on a production AKS cluster. The app deployment on this cluster is exposed using an nginx ingress behind a load balancer and a firewall device.
How do I access Prometheus ...
0
votes
0
answers
18
views
“Annotations.runbook” doesn't generate a link in a Slack message
In my k8s cluster, I deployed kube-prometheus-stack, which is configured to send Alertmanager notifications to Slack. For some unknown reason, the Runbook icon is inactive and nothing happens when you click ...
0
votes
1
answer
162
views
How do we configure prometheus server to scrape metrics from a pod with Istio sidecar proxy?
A service pod is running with an Istio sidecar container and has mTLS enabled. How do we define a ServiceMonitor to scrape metrics from this service? Do we need to update the Prometheus server for the ...
0
votes
0
answers
41
views
kubelet_volume_* metrics not showing all PVCs
I use Prometheus to monitor my Kubernetes cluster. I have 42 PVCs bound to 42 PVs. For some reason, kubelet_volume_* returns information about only 29 of them. I looked everywhere and I could not find ...
1
vote
0
answers
47
views
Prometheus CPU consumption after remote_write is enabled
I am having trouble figuring out why my Prometheus instance starts to consume a lot of CPU after I enable the remote_write feature.
I have deployed Prometheus and Grafana from the kube-prometheus-stack chart, ...
0
votes
0
answers
92
views
Kubernetes monitoring with Prometheus
I need a little bit of help here. I have a Kubernetes cluster up and running and I have a dedicated machine for monitoring with Prometheus running on it. I already have node exporters running and ...
0
votes
1
answer
379
views
How to monitor multiple Kubernetes clusters using single Grafana?
I would like to use a single Grafana instance to monitor multiple Kubernetes clusters (pod resource consumption, RabbitMQ queue info) with data provided by Prometheus. I have two Kubernetes clusters - one used ...
0
votes
0
answers
114
views
Monitor RabbitMQ with Prometheus outside kubernetes cluster
I'm trying to monitor a RabbitMQ set up in a Kubernetes cluster by the Bitnami Helm chart with 3 pods from a Prometheus outside the cluster, and I don't understand how that should be possible.
On the ...
0
votes
1
answer
3k
views
Kubernetes - Find per core statistics for the pod
I want to find the per-core usage statistics for my Kubernetes pod.
In my Linux host/in the Kubernetes node, I use mpstat to find the statistics like below.
In my case, I assign 2.5 CPUs to the ...
0
votes
0
answers
205
views
How to track PersistentVolume usage with prometheus for non gp2 storage class?
I am running a Kubernetes cluster on an Ubuntu node. I have created a persistent volume (storage class: openebs-hostpath) and a corresponding claim. I want to track how much of the claim is being used by ...
0
votes
0
answers
334
views
Auto-Instrumentation of application using OpenTelemetry
I have an AKS cluster where I am running a test Python-Django based web application.
I also have Grafana and Prometheus configured. I need to use OpenTelemetry to get the metrics data from the test ...
0
votes
0
answers
252
views
OpenTelemetry Collector Data not being fetched by Prometheus in Grafana
I have a requirement where I have some container workloads in an Azure AKS cluster and I need to use OpenTelemetry to gather data like metrics, logs and traces. I also have Grafana as the visualisation ...
0
votes
0
answers
178
views
OpenTelemetry K8s Operator Collector - Exporter Configuration for Prometheus
I have some container workloads in an Azure AKS cluster. I need to use OpenTelemetry to get the metrics, logs and trace data from the container workloads and have it collected by the OTEL collector.
I have ...
0
votes
0
answers
128
views
Spot instances sometimes slow down and lose connection
I have a system deployed in AWS EKS. Sometimes spot instance metrics are down and API calls to these nodes are very slow. Here is my system:
1 EKS cluster
1 on-demand node group
1 Karpenter v0.29.2 ...
0
votes
0
answers
342
views
Why does kube-state-metrics only show metrics related to the namespace where it is running?
I have an AWS EKS cluster with kube-state-metrics installed in a namespace called "monitoring". This installation uses a ServiceMonitor and other components (see YAML files below).
In this ...
0
votes
0
answers
185
views
How to add configuration for fluent-plugin-prometheus in Fluentd deployed via Fleet in Rancher?
I'm using Rancher to manage my Kubernetes cluster and have added a logging system (cattle-logging-system) via Fleet. I now need to add monitoring for Fluentd using the fluent-plugin-prometheus.
Here's ...
0
votes
0
answers
299
views
Thanos Receiver not deleting old data in Persistent Volume (PV) after retention is exceeded
I have set up Kube Prometheus Stack with Thanos on my Kubernetes cluster, and I'm using the Thanos Receiver instead of the sidecar approach. I have also configured the Thanos Compactor and Minio for ...
0
votes
0
answers
214
views
Spike in Cadvisor container_network_receive_bytes_total Metric in a Kubernetes Cluster
Summary:
I'm using Cadvisor with Prometheus in multiple Kubernetes (k8s) clusters to monitor network traffic usage. I utilize the container_network_receive_bytes_total metric in a query to calculate ...
0
votes
0
answers
224
views
Prometheus Server Pod Suddenly Crashed (unexpected fault address 0x7f911b1795d4)
Under traffic, the Prometheus server pod keeps restarting with the error stack below. This happens under live traffic on the system, but I could not reproduce it with load testing.
Grafana ...
1
vote
1
answer
47
views
How to safely update an existing Kubernetes server without original configurations
I was handed a Kubernetes cluster with no config files, and it was not set up with Helm. The author said they just created everything from the command line. It is a small/new cluster for a single API server ...
0
votes
0
answers
77
views
Can restricting access to a Prometheus server in AKS only be achieved with nginx-ingress?
I have a Prometheus server with its respective LoadBalancer in AKS.
I wanted to secure access to /metrics through network rules... but it doesn't work. I can still access the endpoint from any device.
...
0
votes
0
answers
272
views
How can I know request waiting time in Nginx Ingress Controller?
We use Kubernetes with Nginx Ingress Controller to run our platform with various backend services. We also use New Relic (& Prometheus, Grafana) for our Observability dashboards & alerts. ...
0
votes
1
answer
246
views
prometheus-operator when configuring alertmanager config for PagerDuty
I have the following issue:
When I try to set up the Alertmanager configuration via a CRD, I get the wrong configuration on the pod.
The problem looks like this:
- routing_key: |
*****
routing_key parameter after all CD use ...
0
votes
0
answers
240
views
kube-prometheus-stack redundancy across multiple clusters
I currently use kube-prometheus-stack to monitor several kubernetes clusters. Each cluster has its own deployment of the kube-prometheus-stack, however, there is currently only one cluster (a) that ...
3
votes
2
answers
2k
views
Debugging Prometheus OOMkilled despite 6Gi limits
I'm at the end of my patience with a prometheus setup leveraging kube-prometheus-stack 44.3.0 (latest being 45).
I have two environments, staging and prod. In staging, my prometheus runs smoothly. In ...
0
votes
1
answer
3k
views
kube-api server high cpu
I want to know how I can check why one of my ctrl nodes and Kubernetes consumes more CPU than the others.
I have a cluster with 3 ctrl nodes and 4 worker nodes.
I have an nginx load balancer with the ...
0
votes
1
answer
2k
views
Kubernetes upgrade from 1.21 to 1.22 caused Prometheus to fail
We recently upgraded Kubernetes from version 1.21 to 1.22 on AWS EKS. The upgrade was successful. However, the associated Prometheus deployments fail with the error
$ kubectl -n monitoring logs prometheus-...
1
vote
1
answer
4k
views
Prometheus: Add insecure_skip_verify via annotation or scrape config adaption for kubernetes pods
I'm running a Kubernetes cluster with a deployment of some pods. One pod provides metrics on an HTTPS-secured endpoint. The problem is that this pod creates and uses its own self-signed certificate and ...
1
vote
2
answers
4k
views
Grafana pod is not running, how to fix that?
I have deployed Grafana in EKS using the steps provided in this link
After deploying Grafana, the pod is not in the Running state.
kubectl get po -n grafana
NAME READY STATUS ...
1
vote
1
answer
5k
views
Alertmanager Telegram config chat_id and cannot unmarshal error
I am trying to configure Alertmanager to send alerts to my Telegram group. Here is the configuration I have:
global:
  resolve_timeout: 5m
route:
  group_by:
  - job
  group_interval: 5m
...
0
votes
1
answer
1k
views
webhook MS Teams integration with Prometheus - request failed
I'm struggling with Microsoft Teams/Prometheus integration on a K8s cluster.
I used Helm to start all components.
I have Prometheus and Alertmanager working correctly. It seems that everything works fine. ...
0
votes
0
answers
642
views
microk8s: pod resource usage metrics not available from all nodes
I am running microk8s v1.22/stable on a Linux cluster with 11 nodes. I have enabled the metrics-server plugin and installed Prometheus via the Helm chart with nodeExporter and kubeStateMetrics enabled....
3
votes
1
answer
2k
views
Kubernetes Nginx Ingress Controller Metrics
I've tried to find documentation about the metrics exposed by the NGINX ingress controller in Kubernetes, but so far I haven't found any reliable source about the metrics and what they mean.
For ...
0
votes
1
answer
78
views
Better way at scale to pull image uris from all pod specs on k8s cluster
Team, I see this list-all-running-container-image page (https://kubernetes.io/docs/tasks/access-application-cluster/list-all-running-container-images/). However, I cannot bank operations on this because it ...
2
votes
1
answer
2k
views
Why is the K8s StatefulSet volumeClaimTemplates status pending, but the pod, PVC, and PV are all fine?
I use nfs-subdir-external-provisioner as the automatic PV provisioner for my Prometheus (deployed by prometheus-operator).
I have created our sts, pod, pvc, and pv successfully and everything looks fine.
But if I use ...
1
vote
2
answers
2k
views
Prometheus not connected to alert manager in GKE
I installed kube-prometheus-stack 15.3.1 into a GKE cluster using helm (in "monitoring" namespace). I used the values.yaml to open up ingresses on some of the components and to add SMTP info ...
-1
votes
1
answer
726
views
How to track down this spike in Disk IO every 5 minutes on Ubuntu Server / microk8s
I've set up an Ubuntu Server with microk8s, with the dns, dashboard and prometheus addons. It's running some Cardano nodes.
On the (built-in) Grafana dashboard "Default / Nodes" I see spikes ...
2
votes
1
answer
158
views
Spikes on External Metric scales HPA when it shouldn't
I have a metric that I’m using for an HPA. The problem is that the metric has spikes, and to avoid that I’m using an average over time in a recording rule in Prometheus, but to export it to Stackdriver (on ...
1
vote
1
answer
2k
views
Prometheus auto scrape metrics from multiple kube-state-metrics in kubernetes?
I want to use one Kubernetes cluster (cluster-0) with multiple kube-state-metrics instances to monitor multiple other Kubernetes clusters (cluster-1,2,3,4).
In cluster-0, I split it into multiple namespaces like this:
...
5
votes
1
answer
6k
views
Prometheus - Use case of service discovery by role endpoints and role pod in Kubernetes
While reading Prometheus Configuration documentation and some sample scrape configurations, I found some kubernetes_sd_configs with role service & role endpoints & role pod
- job_name: kube-...
2
votes
1
answer
458
views
Pod using Vernemq helm package cannot start
I'm using Helm to install VerneMQ on my Kubernetes cluster.
The problem is that it can't start. I accepted the EULA.
Here is the log:
02:31:56.552 [error] CRASH REPORT Process <0.195.0> with 0 ...
2
votes
2
answers
8k
views
What does it mean to have more than one instance of Prometheus in Kubernetes
Suppose I'm using a volume to persist my Prometheus data, I wonder if I can have more than one instance of it running to have high availability.
I believe only one instance of Prometheus must be in ...
1
vote
2
answers
1k
views
Missing metrics for "kubelet_volume_*" in Prometheus
I set up the latest https://github.com/coreos/kube-prometheus/ in an AWS EKS cluster in which I'm using the Amazon EBS CSI driver for persistent volume claims, but I don't see any "kubelet_volume_*" ...
1
vote
0
answers
448
views
kube-state-metrics doesn't show hardware utilization
I installed these YAML files:
https://github.com/kubernetes/kube-state-metrics/tree/master/examples/standard
I can see CPU per pod utilization only from system pods such as: calico-node, coredns, ...
2
votes
1
answer
257
views
How to trigger alerts in Prometheus when specific users login to OpenShift or Kubernetes?
Using either kube_state_metrics or anything else, I'd like to fire alerts in Prometheus Alertmanager when a specific user logs in to the cluster, e.g. kubeadmin or bob-smith.
Or in other words: where in ...