Questions tagged [monitoring]
Applications or appliances that observe machines, systems and networks to find problems and notify administrators.
436
questions with no upvoted or accepted answers
8
votes
1
answer
2k
views
How do I get an aggregate of all the stats from all uWSGI vassals using uwsgitop?
TL;DR
Can anyone tell me how I may get uwsgitop to monitor all of my vassals in my emperor-vassal setup in one shot?
I have an emperor-vassal setup for my uWSGI server, and I need to monitor all my ...
8
votes
1
answer
19k
views
God Process Monitoring - CentOS - Event System Not Found
I have god installed on at least a dozen (or more) servers running CentOS 5.5 in both i386 and x86_64 flavors that work perfectly. I just set up two new CentOS 5.5 x86_64 servers and installed God, but ...
7
votes
1
answer
868
views
Intermittent munin-cron error “There is nothing to do here, since there are no nodes with any plugins”
We've installed munin monitoring on one of our servers. Generally it seems to be working well but occasionally, 4 times in 2 months to be exact, munin-cron has generated the following error:
[...
6
votes
1
answer
925
views
OpenNMS check if service is running on remote server using SSH
I have an OpenNMS system configured and up and running.
I have a few Linux (debian) servers and I need to monitor if a specific service is running on them. This must be done using only ssh access. No ...
4
votes
0
answers
1k
views
Cacti gets data but does not show a graph
I use Cacti to monitor the call load of a Squire MG1000.
I have created a graph using the SNMP Generic OID template.
In the graph overview, I also see data for current, average and maximum calls but ...
4
votes
1
answer
973
views
Running a custom script/exe through the nsclient++ web server
I have nsclient++ installed on my Windows 8 machine and I have included a custom .NET exe to do some application monitoring for me. nsclient++ has no issues running the exe and sending the result to ...
4
votes
0
answers
750
views
What does "C:0" mean in file path?
I'm trying to troubleshoot high disk I/O that I can see through SQL Server resource monitor. I opened resource monitor to understand what files were causing high disk I/O but the path in the File column made ...
4
votes
0
answers
487
views
How to test that an exported resource in puppet exists after collecting?
I want to create a nagios_hostgroup if some hosts exist. For creating hosts in nagios I am using exported resources which I collect on the monitoring server.
How can I test that an exported ...
4
votes
0
answers
655
views
monit reporting “connection failure” when actually a “content failure”?
I have a monit rule which looks like this:
check host example.com with address example.com
if failed url http://example.com/status
content == "ok"
then alert
group example.com
But when ...
3
votes
0
answers
5k
views
Explanation of OOM killer logs
I have a question about the OOM killer logs. We are expecting a lot of OOM kills.
The ecosystem
My ecosystem looks like below:
I have a server with 4 cores and 8 GB of RAM.
I am running there the ...
3
votes
0
answers
236
views
Prometheus not monitoring all EC2 instances of a region
I have set up Prometheus to monitor my AWS EC2 instances, but the issue is that Prometheus is showing only 1 instance, although there are 2 instances running in my AWS account. ...
3
votes
3
answers
4k
views
NRPE not working after NRPE plugin upgrade from v2.15 to v3.2.1
I'm migrating our Icinga 2 from Debian 8 to Ubuntu 18.04. The old server had NRPE plugin 2.15. The new server has NRPE plugin 3.2.1.
If I try to connect with the new plugin to old NRPE servers (v2.15 ...
3
votes
2
answers
116
views
Incident reporting and logging
I am looking for a tool (or advice) that would allow me to track and log all incidents that happen on my infrastructure.
We have a few servers (50+) and that number is going to increase in the future,...
3
votes
0
answers
57
views
Monitor latency by IP on a bridge-enabled host
Is there a tool for Windows which captures packets, measures latency in real time, and displays it with some IP filtering options?
My particular setup:
Internet -> Router -> Host Master -> Host Slave
...
3
votes
0
answers
1k
views
Nagios, nginx and external commands - not authorized
So we're going from Nagios 3 to Nagios 4, and as we've been a bit behind on our hosts, we wanted to do a fresh start.
The setup I went for was:
Debian Jessie
Nagios 4.1.1
Lilac for configuration
...
3
votes
1
answer
1k
views
Munin: some plugins stopped working after moving to different *virtual* host
I used munin as a monitoring service and I was happy with it. It is located on the localhost machine with the CNAME being server.domain.com. Trying to access server.domain.com from a remote machine showed up ...
3
votes
0
answers
1k
views
How to check what process or application is deleting a file without using Process Monitor? (Windows Server)
Currently I'm having an issue with a piece of software that makes use of specific files (which are basically xml), sometimes stored on a file share and sometimes stored locally.
Every so often one of ...
3
votes
1
answer
1k
views
Remote PowerShell script scheduled task
I am having trouble running a PowerShell script as a scheduled task. The script remotely logs into two Hyper-V hosts, queries the replication status and emails the result back to me.
The script works ...
3
votes
1
answer
960
views
What are some good ways to identify NFS clients that cause high load on an NFS server?
Sometimes my nfs4 server is under high load. Are there any good tools to identify which client causes it? Running iotop and nfsstat on the NFS server shows only general load information, which does not help to ...
3
votes
0
answers
1k
views
Collecting amavis stats with SNMP
I'm trying to get values from amavis on a CentOS 6.5 server.
These values, such as spam mail (total, total/h, percent), are supposed to be accessible via SNMP.
But when I run snmpwalk, I always get 0 ...
3
votes
1
answer
295
views
Why do I get different network traffic values from dom0 and from domU?
I'm using Xen 4.0.1 with Linux 2.6.32-5-xen-amd64 (standard packages on a Debian Squeeze system).
From Xen Networking:
For each new domU, Xen creates a new pair of "connected virtual ethernet
...
3
votes
4
answers
2k
views
collectd: Monitoring server not showing clients
I have set up a monitoring server with the following configuration.
<Plugin network>
Listen "0.0.0.0" "25826"
</Plugin>
Now my clients are sending data to the monitoring server (verified through
...
3
votes
1
answer
564
views
archive and visualize amazon ec2 cloudwatch metrics
How do you back up your EC2 CloudWatch metrics? How do you visualize different measurements with different scales at once, like cpu% and i/o? How do you combine your application server metrics (like '...
3
votes
2
answers
215
views
Automated Syslog Error Solution Finder
Any automated syslog solution finding frameworks? I want my central syslog server to email a list of problems, their severity and suggested solutions.
There have been several questions about ...
3
votes
0
answers
470
views
How to monitor Flash applets?
We created a fancy Flash application for a customer and deployed it. The server itself is monitored by the hosting company, but is there any external monitoring service that works well with Flash ...
2
votes
0
answers
723
views
Clean old release files in Sentry
I'm self-hosting Sentry 8 and /var/lib/sentry/files grew to a significant size. I tried launching a script to go through each project's releases via the API, select those older than X days, and remove ...
2
votes
0
answers
1k
views
check_mk logwatch monitoring log with date in filename on windows
In Windows, is there a way to set up dynamic dates in monitored log filenames using check_mk_agent's [logfiles] section?
In Linux I know we can use $(date +%Y%m%d), but I don't know if it works on ...
2
votes
0
answers
84
views
How to find which processes caused the system slowdowns? To be used after reboot
For the last few days my Ubuntu 16.04 system has suffered random slowdowns that prevent apps from loading and don't even allow me to connect via SSH.
The only way I can resolve it is by rebooting the ...
2
votes
1
answer
779
views
How can I make monit poll more often during a state change?
I'm using Monit to monitor various processes that need to be up and running as a group for a web site to work properly. To bring up or bring down the site, there's a definite order by which the ...
2
votes
0
answers
431
views
How to discover SNMP devices
I'm looking to set up a monitoring system and was reading quite a bit about SNMP. I don't yet have practical experience with it.
I didn't quite understand how an NMS discovers SNMP capable devices.
...
2
votes
0
answers
448
views
Configure Munin to monitor multiple instances on one node
I need to configure Munin to monitor multiple instances of a server daemon that run on a single host.
I've tried creating a differently named symlink for each instance, but unless I also pass an env ...
2
votes
0
answers
56
views
Service Availability monitoring on AWS
Is it possible to display Service Availability (%) for a metric I have on AWS? I am able to get graphs, but can't get availability for the selected period.
2
votes
1
answer
2k
views
Centreon installation on Red Hat 7
If anyone has experience installing Centreon on Red Hat, your help is much needed.
I'm following the guide
https://documentation.centreon.com/docs/centreon/en/latest/installation/from_packages....
2
votes
1
answer
821
views
Nagios check_udp_ports returning critical: result to scheduled check, runs fine manually
We use a Nagios Core 4.3.2 solution on Ubuntu 14.04 to do simple host check monitoring on remote client equipment. One type of device we use is not available to ping, but as part of its proprietary ...
2
votes
1
answer
2k
views
Generate graph in Grafana from API
I'm looking for a way to generate an arbitrary graph from the Grafana API, ideally by just feeding it a query.
After looking in the doc I don't see anything to do it directly, so the only way I can ...
2
votes
0
answers
115
views
Make Nagios stack alarms when no connectivity to HP OVO
I'm implementing a small Nagios instance on a dedicated laptop to monitor some telecom devices.
The alarms have to be sent via our customer's reporting tool (like HP OpenView) via SNMP Traps. Nothing ...
2
votes
1
answer
2k
views
linux network monitoring, average MBps each 1hr
I want to monitor the average network usage for my Debian server.
I've tried dstat, ntop and a couple of other programs, but nothing seems to work the way I want.
Basically I want a program/...
2
votes
0
answers
60
views
Monitoring bit level storage changes
I'm having an issue with drastic differences in back-up sizes on specific days, which also prevents us from having a reliable off-site backup system over the internet. What normally is a +-10GB backup ...
2
votes
1
answer
2k
views
Check_MK: How do I create Notifications based on groups of services instead of just one service?
I'd like to be able to create a notification that alerts based on the availability of a group of services, instead of just one threshold. For example, say I have 10 AWS servers that all do the same ...
2
votes
0
answers
145
views
Can Ganglia's gmond be configured so that it doesn't need to be restarted if the Ganglia server is restarted?
I use ganglia to monitor my computer cluster:
When I restart the Ganglia server, the gmond daemons on the other servers in the cluster stop sending information to the Ganglia ...
2
votes
0
answers
660
views
monit: add NOALERT in an IF construct
On Debian Jessie, I have configured monit to check particular HAProxy port forwardings, and to restart HAProxy if a check fails, like this:
check process haproxy with pidfile /run/haproxy.pid
group www-...
2
votes
0
answers
438
views
Is it possible to set a default time period in Cloud Watch?
I'm using Amazon Web Service's Cloud Watch tool to monitor server performance. Whenever I reopen our various Dashboards the time period is always set to the last 3 hours. For some dashboards this is a ...
2
votes
1
answer
927
views
Monitoring /proc/net/udp
I am interested in tracking changes in /proc/net/udp, particularly in the "drops" column. I want to know the approximate time when a drop counter goes up. Also, if the socket gets destroyed, I ...
2
votes
1
answer
2k
views
Observium > Cannot retrieve processor/memory data from host
I have successfully set up an Observium monitoring server on an AWS EC2 instance to monitor other EC2 instances running Ubuntu.
Hosts have been added successfully.
I can run
snmpwalk -v1 -c public 171....
2
votes
0
answers
103
views
Icinga Performance issue when using many service groups
We have an Icinga installation with more than 3k active service checks. Performance is acceptable. We're already using the use_large_installation_tweaks option.
We have now started to build a ...
2
votes
0
answers
606
views
NagiosXI- Configuration verification failed
I am using check_ping command to check connection statistics of remote host. When I run the above command on command line, it gives me proper output, shown below:
Syntax:
/usr/local/nagios/libexec/...
2
votes
1
answer
845
views
How to efficiently monitor system stats using vmstat?
I am getting real-time memory stats from the vmstat command. I did this using the following steps:
$ nohup vmstat 60 > vmstatrecord.app &
The command executes in the background and writes the log to ...
2
votes
1
answer
1k
views
monitoring error rate with monit
Is there a way to tell monit to alert me if there are more than X errors (e.g. lines matching "ERROR") in a log file in a certain time?
My use case would be: errors sometimes appear in my log file (i....
2
votes
0
answers
742
views
How to monitor clustered tomcat 7 with javaMelody
I would like to monitor my server health. I have deployed my web app in Tomcat 7. I am using 4 instances of Tomcat 7 running on the same machine. I am unable to monitor each Tomcat individually ...
2
votes
0
answers
510
views
How to make Statsd talk to Ganglia on EC2 (localhost)
As per the topic, I'm just trying to make this simple setup work. The services are running fine, but as far as I can tell statsd doesn't send anything over to ganglia. Ganglia is working fine I guess since ...