Questions tagged [monitoring]

Applications or appliances that observe machines, systems and networks to find problems and notify administrators.

1 vote
1 answer

Prometheus auto scrape metrics from multiple kube-state-metrics in kubernetes?

I want to use one Kubernetes cluster (cluster-0) with multiple kube-state-metrics instances to monitor several other Kubernetes clusters (cluster-1, 2, 3, 4). In cluster-0, I split into multiple namespaces like this: ...
Lê Minh Quân's user avatar
0 votes
0 answers

Is it possible to ensure detection and logging of all attempts to copy data out of a system?

Say I have a server set up for processing sensitive data. The few authorised users of the system are instructed not to copy any of the sensitive data out of the platform, but could in principle do so ...
Thomas Arildsen's user avatar
0 votes
1 answer

How should "CPU usage per node" be interpreted in Google Cloud monitoring?

In the monitoring tab for Composer (Airflow) on Google Cloud there is a graph showing "CPU usage per node". How should the values in this graph be interpreted? What value would indicate that ...
Niemi's user avatar
  • 101
0 votes
1 answer

How to monitor all the processes from a user in windows server 2012?

I would like to monitor all the processes invoked by a user within a period of time, such as a couple of hours; some processes may run for less than a second. What is the best way to do this? ...
Jim's user avatar
  • 151
0 votes
1 answer

Using "monit" - how to detect empty reply from http process (apache2)

I would like to monitor empty replies from my apache2 process as I am running into a problem similar to "Apache gives empty reply". I am using monit to monitor my processes, so I am going ...
le_top's user avatar
  • 135
0 votes
0 answers

I want to monitor SSH with Monit but I got an error

I want to monitor SSH with Monit but I got an error. This setup works with my old Ubuntu 18.04 server but it doesn't work on my new Ubuntu 20.04 server : ubuntu@ov-xxxx ~ $ cd /var ubuntu@ov-xxxx /var ...
mathieu's user avatar
0 votes
1 answer

Windows Event Forwarding and Sysmon

I'm dealing with a bit of an issue relating to WEF and Sysmon. I have the collector server set up, and 2 domain controllers are configured via GPO to send events to the WEF collector. It is configured via ...
hx.m4v's user avatar
  • 1
2 votes
2 answers

Zabbix low-level discovery - CPU usage per process - two items with identical keys

I'm trying to use Zabbix for monitoring CPU usage by different processes on Windows Server. Processes to monitor are not determined upfront. I want to use LLD to monitor top 3 CPU demanding processes. ...
Paweł Zimny's user avatar
0 votes
2 answers

haproxy, is it impossible to monitor the http total request time?

using haproxy HTTP logs with http-server-close of keep-alives the time counters in the logs (TR, Tt...) are based on the beginning of the TCP connection. Which means only the first request, with ...
NanoPish's user avatar
1 vote
2 answers

use nethogs in a script

nethogs is a great utility to monitor network traffic by process. However, it is "interactive" and not suitable to be used in script... How can I achieve the following using nethogs or ...
xrfang's user avatar
  • 195
0 votes
1 answer

How can I monitor cli commands on one machine and execute them on another in real time?

I'm trying to build a highly available topology by executing all of the CLI commands run on one machine on another machine as well. Doing this will synchronize their configurations. Therefore, ...
Kostadin Krushkov's user avatar
1 vote
1 answer

What monitoring system should I use for offline network [closed]

I'm part of a sysadmin\DevOps team for an application. We currently have about 25 - 40 VMs running different parts of the application as microservices on the OpenShift container platform,...
Noam Yizraeli's user avatar
1 vote
0 answers

Server Monitoring Tool via file transfer

I am trying to figure out a way to monitor many Windows servers with a monitoring tool like Nagios, Zabbix, or even PRTG. The challenge is that the servers are on board vessels without a reliable internet ...
P.K.'s user avatar
  • 11
0 votes
2 answers

How to set Munin Critical/Warning alerts when values are under a threshold and not over?

I am trying to do a simple alert in Munin checking SW RAID 1 status where a metric of 2 disks is healthy, 1 disk is Warning and 0 disks is Critical. All the Munin monitors I've seen are triggered when ...
Jason's user avatar
  • 121
1 vote
2 answers

Breaking down one prometheus.yml file?

I am using Prometheus for our monitoring and I have a lot of configs (our prometheus.yml main config file is 8000+ lines long). I would like to divide this out into logical groupings so that it ...
PRS's user avatar
  • 11
0 votes
0 answers

monitor CRL expiration dates for multiple nginx servers

Rephrased question: (Not sure it's really clearer) I have a small self-written script that monitors multiple servers. In fact, my script just periodically starts tiny scripts and gathers the ...
gelonida's user avatar
  • 279
0 votes
2 answers

Cannot access monit web interface

I just installed Monit on my server. I want to access the web interface to manage it, but it is not reachable. The machine is an instance in AWS, and the port is open. I have tried many ...
svprdga's user avatar
  • 103
2 votes
1 answer

CWAgent Disk Space Alarms

I'm trying to implement an alarm (in CloudFormation) for free disk space using metrics from the CloudWatch agent, and I'm having issues with devices shuffling DeviceID. I encountered this earlier when ...
wronglebowski's user avatar
0 votes
2 answers

Systemwide File Access and System Call Monitoring on Linux?

In Windows land, you can run Procmon (Process Monitor) from Sysinternals, which will show you every File access, Registry Query etc Systemwide (screenshot attached). You can then backtrack to find ...
Patrick Rynhart's user avatar
1 vote
1 answer

How does Windows Resource Monitor report the disk I/O related to virtual memory reads/writes?

In Resource Monitor, under Disk > Disk Activity, a list of files is shown along with the disk read/write B/sec being performed on each. When memory is paged to disk (ie. virtual memory is written), ...
tQuarella's user avatar
  • 140
0 votes
1 answer

What's Azure equivalent of EventBridge working with CloudWatch to consume all alerts?

We're trying to find a way to be notified and consume (using logic apps) all alerts generated via Azure Monitor. It seems that AWS allows that via EventBridge, so: "Amazon EventBridge now integrates ...
mrtworo's user avatar
0 votes
0 answers

How to configure k8s nginx external auth and exclude health check path?

We started using oauth2-proxy as external authentication for some of our cluster infrastructure components. Our cluster is using the ingress-nginx controller and the Ingress resources are configured ...
Moritz Schmitz v. Hülst's user avatar
0 votes
1 answer

Remote monitoring by Nagios Core

I am working on a project using Nagios to monitor a controller that monitors gas leaks, temperature,... remotely. How can a Nagios Core in one city communicate and receive supervision information ...
Vatoch Mr's user avatar
0 votes
1 answer

get agent nodes to show on master node in icingaweb2

I installed icinga2 and icingaweb2 on the master node, and I installed icinga2 on 3 more servers as agent nodes. I used the icinga2 node wizard, configured them as agents, and allowed them to connect to the master node....
ufk's user avatar
  • 333
6 votes
1 answer

Create a CloudWatch Alarm when an ECS service is unable to consistently start tasks successfully

If I release a new Docker image with a bug to my ECS Service, then the service will attempt to start new Tasks but will keep the old version around if the new tasks fail to start. In that scenario, it ...
Rich's user avatar
  • 744
5 votes
1 answer

GCP VM Disk space alert

How can I configure GCP's monitoring suite to look at % disk utilization (in total space used, not IOPS)? The only "disk used" metric I see in metrics explorer seems to chart some kind of units per ...
atxdba's user avatar
  • 337
1 vote
1 answer

Finding wasteful or over-provisioned pods on a "full" but underutilized Kubernetes cluster

I work on a Kubernetes cluster where, right now, about 95% of the CPUs and 90% of the memory have been allocated to pods. However, according to the Kubernetes Dashboard, the overall instantaneous CPU ...
interfect's user avatar
  • 343
1 vote
1 answer

Which source of sensor readings is most preferred? IPMI, ACPI, or from sensor chip itself?

When monitoring systems for their temperature, and fan speeds, what source of sensor readings is most preferred? I can get all the motherboard readings from both, IPMI and directly from the Winbond ...
J. M. Becker's user avatar
  • 2,461
1 vote
1 answer

smartctl harddisk check doesn't show Attribute

Today I installed smartmontools on my Linux server. After testing my hard drive (RAID 1), it doesn't show the attributes. After running the command smartctl -a /dev/nvme0n1, I get the result without ...
beard black's user avatar
1 vote
0 answers

Monitoring SLA/SLO/SLI using Prometheus

I have done much research about monitoring SLI metrics with Prometheus. I have found only how to monitor a cluster using Kubernetes. I'm hoping to find a response here for simple monitoring. I also ...
Hasagiii's user avatar
  • 111
0 votes
1 answer

AWS solution to monitor events from external machines, reported by SNS?

We have a number of robots installed at various locations, servicing customers. All robots get their instructions from a central cloud database with customer data, and each has an SQS queue which ...
Esben von Buchwald's user avatar
-1 votes
1 answer

Securing System Monitoring wall display PCs

I have several windows machines which drive dashboards on wall mounted displays for system and network monitoring. I would like to be able to secure them from unauthorized access or modification. ...
SlyOne's user avatar
  • 363
0 votes
0 answers

Zabbix active agent can't connect to server - interrupted system call

I'm running the active Zabbix agents on all the servers in my production environment, however two of them aren't able to connect to the Zabbix server. All I've got to go on in the Zabbix logs is ...
MorayM's user avatar
  • 159
0 votes
1 answer

Two nagios instances in the same machine

After days of searching the net and trying by myself, I urgently need your help. I have Nagios Core 4.4.3 installed on my CentOS machine, which I use to monitor the PROD and TEST environments of one ...
nonely's user avatar
  • 3
0 votes
1 answer

free server monitoring tool for Java based application

I have tried a couple of options, such as Nagios, which led to problems after installation: Apache became unresponsive with lots of segmentation faults: child pid 32507 exit signal Segmentation fault (11)...
kah's user avatar
  • 21
0 votes
1 answer

How to filter by status information column in Thruk

I am using Thruk as a monitoring interface. At the top left corner of the page there is a button which opens a stack with filters of which hosts/services etc. you want to apply. You can easily add ...
Ashark's user avatar
  • 346
2 votes
1 answer

Prometheus, group_left, and "on" vs "ignoring"

In Issue #2204, one of the Prometheus developers says: In principle you should be favouring ignoring over on to produce generic shareable rules... I'm confused how the use of ignoring would ...
larsks's user avatar
  • 45.6k
0 votes
1 answer

Two Factor Monitoring for ETLs

This name, similar to the 2FA security scheme, comes from a scenario in which I want to periodically be sure that certain ETL triggers are in place. Not only do I want to monitor whether certain ...
dinigo's user avatar
  • 101
11 votes
3 answers

Which tool to use when monitoring machines (linux+windows) with one way communication? [closed]

I have 100+ machines which need to be monitored, mostly Linux, but there are some Windows servers too. I want to be informed when the disks are getting full, when the load is high, or a service is ...
jsaak's user avatar
  • 221
0 votes
0 answers

openshift metrics per namespace

Hello, how can I provide usage metrics such as CPU usage per container, PersistentVolume fill level, and network usage per container/pod of an OpenShift cluster, but individually per namespace/project ...
GreenRover's user avatar
0 votes
4 answers

Simple host monitoring solution

I am looking for a host monitoring solution for an infrastructure I have to manage. Since this infrastructure is on-premise, I would like to have a client-server architecture, where a client reports ...
Filipe Rodrigues's user avatar
0 votes
1 answer

Error 404 when trying to access Kubernetes dashboard from remote laptop using SSH proxy

I have a remote cluster on a remote private Cloud to which I have only SSH access (no GUI). I started the proxy server with: kubectl proxy --address= --accept-hosts=.* And started a local SSH ...
Karim Manaouil's user avatar
0 votes
1 answer

Nagios - check procs and --metric=elapsed on the same service

After many days of working and searching on the net, I'm coming back to you as a last chance for help. I'm currently working on monitoring Unix processes with Nagios Core 4.4.3 and NRPE. My goal is ...
nonely's user avatar
  • 3
0 votes
0 answers

Easily monitor the CPU % use of a process

I have a small ISP, and we are currently running a customer's application that is creating high CPU spikes. Basically, we have a Chrome tab running and automate a process to stress-test a web ...
user558413's user avatar
1 vote
1 answer

Tomcat RequestProcessor errorCount - what counts as an error?

We have a Zabbix server that reports on Tomcat's errorCount, from the GlobalRequestProcessor. I'm trying to figure out exactly what gets counted in this errorCount. Is it any request to Tomcat the ...
FrustratedWithFormsDesigner's user avatar
0 votes
1 answer

is there something like the stem-and-leaf plot for timeseries?

When wanting to quickly take a look at the distribution of a sequence of values, the stem-and-leaf plot is an incredibly simple, yet powerful tool. It takes a few minutes to teach the computer to draw ...
kqr's user avatar
  • 91
1 vote
2 answers

Best practice for alerts if webpage returns white page?

We're trying to set up monitoring (Zabbix) for web apps that return the white page of death. The apps are PHP based. From what I know, the white screen of death can be caused by a number of issues: memory issues, ...
fugitive's user avatar
  • 125
0 votes
1 answer

Enforcing monitoring on AWS resources

We have a couple huge AWS accounts and I've been tasked with implementing guidelines for monitoring resources and ensuring that monitoring is set up for all existing and future resources. Is there ...
blizz's user avatar
  • 1,154
1 vote
1 answer

Asserting on extended information from Nagios's check_mysql

I'm running the check_mysql plugin using NRPE on a remote DB-server, and while I can get satisfactory data on whether or not the server process is working as needed, I see that the plugin outputs a ...
Zalán Meggyesi's user avatar
2 votes
2 answers

EC2 instance running nginx crashes, "connection refused" - how do I monitor for this?

Say nginx on an EC2 instance crashes. The instance is healthy and CloudWatch Metrics are great, but all the domains hosted on the server are now "Connection refused". This seems like a very basic ...
Neal's user avatar
  • 23

