0

I am posting the following question already solved, because I've mistakenly posted in on StackOverflow and therefore wanted to share it here so it can be properly found by the community and hopefully help others with the same issue.


I have set up a node-exporter DaemonSet on kubernetes as well as a service that points to these node-exporter pods IPs (I followed this tutorial). When I run kubectl get endpoints -n monitoring, I verify that the service is correctly pointing to the 3 DaemonSet pods that were created.

After that, inside the prometheus.yml file I have added this config for scraping the node-exporter metrics:

  - job_name: "node-exporter"
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
          - monitoring
    relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: "node-exporter"
        action: keep

The problem is when I apply these configs and restart the prometheus.service:

> systemctl status prometheys.service --no-pager --full

● prometheus.service - PromServer
     Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2024-06-18 16:24:18 UTC; 1min 41s ago
   Main PID: 1441565 (prometheus)
      Tasks: 10 (limit: 33613)
     Memory: 38.2M
        CPU: 325ms
     CGroup: /system.slice/prometheus.service
             └─1441565 /usr/local/bin/prometheus --web.enable-admin-api --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /var/lib/prometheus/ --web.console.templates=/etc/prometheus/consoles --web.console.libraries=/etc/prometheus/console_libraries

Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.654Z caller=head.go:755 level=info component=tsdb msg="WAL segment loaded" segment=158 maxSegment=159
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.655Z caller=head.go:755 level=info component=tsdb msg="WAL segment loaded" segment=159 maxSegment=159
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.655Z caller=head.go:792 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=7.37407ms wal_replay_duration=43.4001ms wbl_replay_duration=200ns total_replay_duration=51.456586ms
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.657Z caller=main.go:1040 level=info fs_type=EXT4_SUPER_MAGIC
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.657Z caller=main.go:1043 level=info msg="TSDB started"
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.657Z caller=main.go:1224 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.661Z caller=manager.go:317 level=error component="discovery manager scrape" msg="Cannot create service discovery" err="unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined" type=kubernetes config=node-exporter
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.664Z caller=main.go:1261 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=6.861558ms db_storage=1.6µs remote_storage=1.2µs web_handler=500ns query_engine=800ns scrape=3.46318ms scrape_sd=100.802µs notify=42.801µs notify_sd=13.7µs rules=2.858566ms tracing=8.1µs
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.664Z caller=main.go:1004 level=info msg="Server is ready to receive web requests."
Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.664Z caller=manager.go:995 level=info component="rule manager" msg="Starting rule manager..."

From the output, I get this error:


Jun 18 16:24:18 vm-devops-tiim-giro-prd prometheus[1441565]: ts=2024-06-18T16:24:18.661Z caller=manager.go:317 level=error component="discovery manager scrape" msg="Cannot create service discovery" err="unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined" type=kubernetes config=node-exporter

I haven't had any luck with my googling so far... Can anybody help me guide on what variables am I missing in the configuration?

1 Answer 1

0

These were the great suggestions posted by @Dharani Dhar Golladasari on the thread: https://stackoverflow.com/a/78639344/10725312

After reading these, I quickly solved my issue by editing the prometheus.service file and under [Service] I added the values for KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT (as written here):

[Service]
Environment="KUBERNETES_SERVICE_HOST=kubernetes.default.svc"
Environment="KUBERNETES_SERVICE_PORT=443"

Then I just had to restart the service (sudo systemctl restart prometheus.service).

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .