Questions tagged [monitoring]
Applications or appliances that observe machines, systems and networks to find problems and notify administrators.
2,470
questions
0
votes
0
answers
8
views
Zabbix monitoring JSON-File
I want to monitor a JSON-File looking something like this.
{
"data": [
{
"ID": "112940",
"Type": "type1",
...
0
votes
1
answer
32
views
When to use Thanos/Cortex over Vanilla Prometheus?
Generally, when I look around or try to understand the HA setup for Prometheus, the most common search results are Cortex & Thanos. I've also seen M3 and VictoriaMetrics on the list.
What I ...
0
votes
0
answers
10
views
How to log each virtual machine's resource usage (CPU time, memory, etc.)
Running Apache with mpm_event on a shared Linux server with about 5 websites (VMs)
I would like to log each VM's usage of the memory, cpu time and bandwidth for each 24hr period and have it email to ...
0
votes
0
answers
23
views
Can't set warning / critical thresholds in munin plugin-conf.d file
For some reason, I cannot get warning thresholds to work on any of my Munin installations on Ubuntu. For example, one of the provided plugins I have working is the netstat plugin. Now to set the ...
0
votes
0
answers
25
views
Monitor PGP expiration date from CipherMail gateway
I can look up the expiration date of my PGP public keys in the web UI of CipherMail.
But is there a more convenient way of monitoring the expiration date of the PGP public keys?
0
votes
0
answers
47
views
Monitor Docker containers on EC2 instance from Splunk Enterprise
I need to monitor the Docker containers running on my AWS EC2 machine from Splunk Enterprise.
I have installed Splunk and added Splunk forwarder to my EC2 machine and I am running a Docker httpd ...
0
votes
0
answers
11
views
[Monitoring and Alerts Configuration for Google reCAPTCHA v3 Enterprise Edition]
I am reaching out for technical assistance regarding our implementation of Google reCAPTCHA v3 Enterprise Edition. We have successfully created keys for our sites and are now looking to set up ...
0
votes
1
answer
86
views
How can I monitor Glue jobs that are fired by EventBridge?
My stack is as follows:
EventBridge fires a Glue job at a regular interval.
Said Glue job runs Python scripts, which run as Step Functions.
The output of these scripts is saved to S3.
How can I ...
0
votes
0
answers
17
views
Entuity (monitoring SW) database backup
If anyone is using the rather obscure monitoring system named Entuity, I'm trying to find a good way of extending its database.
Currently the system is using a MySQL database. For some reason it's set up to ...
-1
votes
2
answers
109
views
Is there a free Rocky Linux monitoring tool? [closed]
I am currently looking for a tool to monitor the performance of two Rocky Linux desktop computers. It should monitor aspects such as the CPU, RAM, disk, and network usage (similar to the Windows Task ...
0
votes
0
answers
390
views
"cannot process received configuration data" on Zabbix proxy
I have installed and configured a Zabbix proxy version 6.4.13 and I am trying to add it to a Zabbix server version 6.4.
The point is that the Zabbix server catches the Zabbix proxy and the last seen ...
0
votes
1
answer
37
views
How to query Azure quotas and limits in Azure Monitor?
Microsoft Azure subscriptions are subject to a variety of quotas and limits, for example number of total CPU cores and number of VMs of a certain family.
How can I query the current quotas and actual ...
0
votes
0
answers
187
views
Preprocessing failed for performance counters in Zabbix, MSSQL server
I've connected my MSSQL server to Zabbix via ODBC.
I'm receiving results for some items but for other items errors are returned:
Preprocessing failed for: [{"object_name":"SQLServer:...
0
votes
0
answers
85
views
Adding Aruba switch to monitoring tool via SNMP protocol
I am trying to add a few Aruba switches to the monitoring platform Observium but failing to do so.
The community string and the trap address are perfectly fine, but it does not seem to work. It would be ...
0
votes
0
answers
58
views
Trouble Configuring Zabbix Exporter with Prometheus for Grafana Dashboards
I'm currently attempting to set up Zabbix Exporter to fetch Zabbix metrics into Prometheus for visualization in Grafana dashboards. My goal is to utilize Grafana's features for creating insightful ...
0
votes
0
answers
22
views
Free/total, and when free = 10% of total, color turns red
I am using Grafana and I am trying to display the values of "Free space/Total size" as actual values rather than calculations. I would like to create a value mapping so that when the free ...
0
votes
0
answers
92
views
Kubernetes monitoring with Prometheus
I need a little bit of help here. I have a Kubernetes cluster up and running and I have a dedicated machine for monitoring with Prometheus running on it. I already have node exporters running and ...
0
votes
1
answer
104
views
How to create ops-agents policy that will install ops-agent on all Ubuntu 22.04 VMs?
I have created this ops-agent policy to install ops-agent on all my running Ubuntu 22.04 VMs, but the policy exists in the Cloud console when I navigate to Monitoring -> Dashboards -> VM ...
0
votes
2
answers
284
views
What could help to find what is causing sudden 100% CPU usage hanging my VPS?
I have a VPS I manage on my own. It runs just a few Node.js projects, Docker projects, and CrowdSec. The usual CPU load is about 20%.
Occasionally server's CPU usage skyrockets to 100%...
1
vote
4
answers
758
views
How to monitor RAM consumption on Red Hat 7.9
Red Hat 7.9 server with 512 GB RAM.
We often have alerts about swap being full. Swap usage is often at 99%. Our server admin told us it is normal for Linux to have swap used 100%. There is no way to check ...
1
vote
1
answer
386
views
Prometheus Blackbox http_2xx returning 403 since webserver migrated to StackCP
I had working Prometheus Blackbox Exporter http_2xx checks monitoring various web servers. Then the web hosting provider migrated from cPanel to StackCP.
Since then all the http_2xx just return 403 (...
0
votes
0
answers
207
views
How to be sure to keep all the logs regarding a MySQL table data but filter Google Cloud SQL maintenance queries logs made by localhost GCP SQL?
I enabled the flag General_log for a Google Cloud SQL MySQL instance to get all the queries logs in Google Cloud Logging. I get all the queries users make, but also all the queries made to probe and ...
0
votes
2
answers
144
views
Is it safe to expose the monitd HTTP server to the world?
I'd like to use monitd for monitoring my webserver. I read it has a built-in HTTP server. By default it is set to run on port 2812. Is it safe to open the port on the firewall and view it via a browser?
0
votes
0
answers
24
views
Running Linux commands executes hidden command to regenerate backdoor [duplicate]
My CentOS server was compromised and a backdoor was uploaded to /var/www/html/. I deleted the backdoor and browsed to it to be sure it's deleted, and it surely is, but when I run any command ...
1
vote
1
answer
589
views
Mystery "Failed to locate executable"
I've been setting up a monitoring solution for various servers using Promtail, Loki and Grafana, following this article. I got a monitoring machine running Loki and Grafana (on Rocky Linux 9.3) and a ...
0
votes
0
answers
39
views
IIS currently processed requests counter
I am experiencing an issue with my application and suspect it might be because I am testing on a client Windows/IIS, which has a limit of 10 concurrent requests. Is there a way to monitor the current value of ...
0
votes
0
answers
20
views
Monitor outgoing connections with timestamp and process [duplicate]
I have a CentOS 8 server which is carrying out brute-force attacks against other servers, but I have no idea what application or process is performing the attack.
I wish to know if there's any tool ...
0
votes
1
answer
274
views
In a cloud environment, should we alert on high CPU utilisation or high load average?
What is the best practice for monitoring the system, should the CPU alerts be based on the regular CPU usage or load average?
I'm wondering what approach is being used in big cloud environments.
0
votes
0
answers
89
views
Error trying to get disk space usage from discovered VMs using Zabbix
I am trying to create a trigger on a template called Template Virt VMware Guest in order to extract the % of free space of the VMware VM that inherits the template.
What I did was:
Going to the ...
0
votes
0
answers
223
views
Google cloud storage bucket [GCS bucket total bytes]- metrics missing for a specific bucket
Update: The metrics appeared after more than 24 hours
I have a lot of buckets in my GCP project (some regional and some multi-region) and I'm able to get the metric: 'storage.googleapis.com/storage/...
0
votes
0
answers
371
views
Datadog not collecting logs from file in Kubernetes cluster
I am trying to configure Datadog agent on AKS Cluster and to read logs from file location at /var/log/datadog/messages.log in each service pod.
It is streaming all the metrics except logs from file ...
0
votes
1
answer
108
views
Monitor of Azure Express Route
I would like to monitor our Express Route. There is a good description on how to do this on Microsoft learn: https://learn.microsoft.com/en-us/azure/expressroute/how-to-configure-connection-monitor
...
0
votes
0
answers
53
views
How is PRTG's `channelDiscovery` intended to work?
Developing a (in PRTG-speak) "custom advanced REST sensor" providing XML, I'm facing some problems that the documentation does not really answer.
First let's have a look what one of my ...
0
votes
0
answers
50
views
Application deployment and monitoring of NodeJS applications
I've been looking at how to deploy NodeJS applications and how to later monitor them. I think my terminology here might be a bit off so I'll explain what I want.
I have multiple VMs running and on ...
-1
votes
3
answers
221
views
Monitor web server directories for changed / new files
TL;DR: Is there an easy way to monitor directories for new/changed/deleted files?
Details: A simple WordPress website on a virtual server got hacked. Nothing too serious. No important project / data ...
0
votes
0
answers
342
views
Why does kube-state-metrics only show metrics related to the namespace where it is running?
I have AWS EKS cluster with kube-state-metrics installed in a namespace called "monitoring". This installation is using service monitor and other components (see yaml files below).
In this ...
0
votes
0
answers
153
views
Correct way to setup a multinode LGTM stack
I have a 4-node cluster:
MonitoringCenter, hosting
a Grafana, connected to all Prometheus and Loki instances + local AlertManager
a Prometheus scraping local NodeExporter/AlertManager/Loki/...
0
votes
0
answers
185
views
How to add configuration for fluent-plugin-prometheus in Fluentd deployed via Fleet in Rancher?
I'm using Rancher to manage my Kubernetes cluster and have added a logging system (cattle-logging-system) via Fleet. I now need to add monitoring for Fluentd using the fluent-plugin-prometheus.
Here's ...
0
votes
1
answer
85
views
Command not running in Nagios
Can anyone clarify for me why the following code is not running when I place it in command_line for a custom Nagios command? It works when I run it in terminal.
command_name notify-host-by-sms
...
0
votes
0
answers
214
views
Spike in cAdvisor container_network_receive_bytes_total Metric in a Kubernetes Cluster
Summary:
I'm using cAdvisor with Prometheus in multiple Kubernetes (k8s) clusters to monitor network traffic usage. I utilize the container_network_receive_bytes_total metric in a query to calculate ...
0
votes
0
answers
285
views
Port 80, 443 and 2875 monitoring in Zabbix
I'm monitoring port 443 and port 80 in Zabbix with a simple check, and I put
[http,,443] instead of [https,,443]. Will that perform the correct monitoring for port 443?
What will happen ...
0
votes
0
answers
38
views
Gravitee 3.20 healthcheck locations
Maybe someone has had a similar issue with health-checking a Gravitee Helm deployment: I cannot find any endpoints for the healthchecks. I asked ChatGPT and found documentation for it, but neither ...
0
votes
1
answer
358
views
Systemd CGroups - where are logs for exceeding resource limits?
By default systemd assigns resource limits through CGroups like TasksMax, here's an example of this:
$ systemctl status sshd
● sshd.service - OpenSSH Daemon
Loaded: loaded (/usr/lib/systemd/system/...
0
votes
0
answers
57
views
Identifying processes causing 100% cpu usage [duplicate]
Some users report high CPU usage on their computers during ordinary activities (like Teams calls).
How can I collect a report of processes causing high CPU usage over a few hours?
I would like to use ...
0
votes
0
answers
48
views
Top IPs by outgoing traffic for the last month - Debian
I have a web server running Debian 11. I need to know which IP used the most outgoing traffic from the server in a specific period. I found some tools like iftop and vnstat, but I ...
0
votes
0
answers
240
views
kube-prometheus-stack redundancy across multiple clusters
I currently use kube-prometheus-stack to monitor several kubernetes clusters. Each cluster has its own deployment of the kube-prometheus-stack, however, there is currently only one cluster (a) that ...
2
votes
2
answers
89
views
Online monitoring / dashboard service for multiple metrics
I know there are services like Uptimerobot and Pingdom that monitor web service uptime.
I'd like to easily monitor web availability as well as backup status, e.g. if a backup ran/completed properly. I'...
0
votes
0
answers
896
views
Communication failure between agent and Zabbix Docker server
Currently I have the following scenario:
First server: I have the Zabbix components: server, gateway and web interface (along with a MySQL database), all running via Docker containers (released port: ...
0
votes
0
answers
50
views
Script to scan log file
I have a log file with the content below. Let's call it cpu_usage.out:
2023-04-12 12:04: CPU STATISTICS CRITICAL : USED:- 2.52% IDLE:- 97.49%|CpuUsed=2.52;0;1 CpuIdle=97.49;0;1
2023-04-12 12:05: ...
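As a rough starting point for a question like this, the sample line could be parsed with a regular expression. The sketch below is in Python and assumes every line follows the format of the one shown above. `parse_line` and `lines_above` are hypothetical helper names, and any threshold passed in is an arbitrary illustration, not something taken from the question.

```python
import re

# Matches lines shaped like the sample, e.g.
# "2023-04-12 12:04: CPU STATISTICS CRITICAL : USED:- 2.52% IDLE:- 97.49%|..."
# Assumption: the layout before the "|" is stable across all lines.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}): CPU STATISTICS (?P<status>\w+) : "
    r"USED:- (?P<used>[\d.]+)% IDLE:- (?P<idle>[\d.]+)%"
)


def parse_line(line):
    """Return a dict for a matching line, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "timestamp": match.group("ts"),
        "status": match.group("status"),
        "used": float(match.group("used")),
        "idle": float(match.group("idle")),
    }


def lines_above(lines, used_threshold):
    """Return parsed records whose USED percentage exceeds used_threshold."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and r["used"] > used_threshold]
```

With the file from the question, something like `lines_above(open("cpu_usage.out"), 80.0)` would list the samples above 80% CPU used; lines that do not match the assumed format are skipped rather than raising an error.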
0
votes
0
answers
357
views
Causes of packet loss on multiple persistent tcp connections simultaneously?
The issue was detected while analyzing some application logs, which reported spike periods a few seconds long during which messages from multiple clients are received on the server with a substantial delay (up ...