Questions tagged [system-monitoring]

Questions regarding system monitoring - Nagios, Icinga, Spiceworks, Munin, Zabbix and more.

44 votes
4 answers
29k views

Find out which process is changing a file

I'm trying to find a reliable way of finding which process on my machine is changing a configuration file (/etc/hosts to be specific). I know I can use lsof /etc/hosts to find out what processes ...
robbles's user avatar
  • 543
28 votes
6 answers
59k views

How to find out the number of time series stored in Prometheus LevelDB

I'm responsible for maintaining the Prometheus servers in our company. The metrics, however, are provided by the teams. Is there a way to find out the number of time series stored in the Prometheus ...
Tobias Wiesenthal's user avatar
20 votes
9 answers
121k views

script to automatically test if a web site is available

I'm a lone web developer with my own CentOS VPS hosting a few small web sites for my clients. Today I discovered my httpd service had stopped (for no apparent reason - but that's another thread). I ...
Xoundboy's user avatar
  • 613
12 votes
3 answers
6k views

Alternative to etsy/statsd

Is there any alternative to etsy's statsd? Maybe even a complete dashboard-like solution? My research only found proprietary SaaS solutions. For those who do not know: statsd is a daemon which ...
d135-1r43's user avatar
  • 401
12 votes
2 answers
5k views

16TB Volumes and SNMP On Windows

As volumes larger than 16TB became more common, it was recognized that the 32 bit value used to report disk size and usage within the standard "HOST-RESOURCES" MIB in SNMP was not large enough to ...
Univ426's user avatar
  • 2,149
11 votes
4 answers
9k views

How can you distinguish between a crash and a reboot on RHEL7?

Is there a way to determine whether a RHEL7 server was rebooted via systemctl (or reboot / shutdown aliases), or whether the server crashed? Pre-systemd this was fairly easy to determine with last -x ...
kwb's user avatar
  • 173
10 votes
1 answer
177 views

Best way to monitor Windows server? [closed]

I'm working at a company that provides our small business clients with IT support. One of my tasks is to perform service checks which includes checking the event viewer for critical errors/warnings as ...
Jonathan Mayers's user avatar
10 votes
4 answers
4k views

Monitoring Dell/HP Servers Running ESXi (Free)

What are you all doing to monitor ESXi servers that run the free edition? With the lack of SNMP support, it seems fairly limited to me. What I'd like to be able to do is get some type of alert when ...
Untalented's user avatar
9 votes
2 answers
19k views

How to monitor power supply status using ipmitool on Linux/Solaris?

ipmitool differs a lot between Solaris and Linux. How can I use ipmitool on these servers (Sun, IBM and other hardware) to detect the power supply status?
vrnjain's user avatar
  • 91
8 votes
2 answers
6k views

Load average is greater than the number of EC2 Compute Units

On an EC2 m1.large, with an AVG CPU Utilization graph such as this: how is it possible that the load average is greater than the number of EC2 Compute Units (4)? cat /proc/loadavg 5.78 5.57 5.44 1/...
Drew's user avatar
  • 235
8 votes
4 answers
58k views

SNMP service security tab is missing - Windows Server 2012 R2 - DC

I have to configure the security settings for the SNMP-Service on a Windows Server. But they are missing! Here are the facts: OS: Windows Server 2012 R2 I installed the SNMP feature and I believe, ...
frupfrup's user avatar
  • 863
7 votes
2 answers
2k views

Agentless monitoring: how does it work? Advantages over traditional monitoring?

How does agentless monitoring work? From what I understood (or not), it seems this is accomplished by logging into the node-being-monitored from a central server and uploading-then-running scripts on ...
sysadmin04's user avatar
7 votes
1 answer
3k views

How to restart and alert if condition matches in Monit?

How can I do multiple things when condition is matched? For example if I want to restart a process and also send alert email. I know I can do it with two separate lines, but can I combine them? if ...
Firze's user avatar
  • 355
7 votes
2 answers
439 views

Green-IT: How do you deal with poweroff systems in your system monitoring?

Many of you have probably completed or are contemplating Green-IT projects with the goal of powering off idle or unneeded systems when demand for computer resources is low: How did you handle this ...
knweiss's user avatar
  • 4,075
7 votes
1 answer
868 views

Intermittent munin-cron error “There is nothing to do here, since there are no nodes with any plugins”

We've installed munin monitoring on one of our servers. Generally it seems to be working well but occasionally, 4 times in 2 months to be exact, munin-cron has generated the following error: [...
scarba05's user avatar
  • 333
6 votes
1 answer
8k views

Full status information in Nagios email notification?

I have set up Nagios to monitor my servers and I have written a few custom checks. When I get a notification email, I only get the first line of the status information and I have to use the web ...
Gene Vincent's user avatar
6 votes
1 answer
418 views

Nagios OK notification at the beginning of the availability period

I'm monitoring an application which starts just before business hours and shuts down at the end of the day using Nagios 4.3. I've configured the notification period for it to start 3 minutes after the ...
Isac Casapu's user avatar
5 votes
1 answer
8k views

GCP VM Disk space alert

How can I configure GCP's monitoring suite to look at % disk utilization (in total space used, not IOPS)? The only "disk used" metric I see in metrics explorer seems to chart some kind of units per ...
atxdba's user avatar
  • 337
5 votes
4 answers
2k views

IBM ServeRAID: how to use email alerts?

I just installed a brand old IBM server with a ServeRAID 4Lx card. I installed the driver, and the ServeRAID manager software v9.30. Everything works as expected. My problem is: Yesterday, when not-so-...
Gregory MOUSSAT's user avatar
5 votes
1 answer
5k views

In Icinga (Nagios), how do I configure hosts with multiple IPs?

I'm setting up Icinga (Nagios fork) and I have some machines with multiple interfaces. Some services are only listening on one of them and to check them correctly, I'd like to know if it's possible to ...
gertvdijk's user avatar
  • 3,624
5 votes
2 answers
4k views

iotop does not show writes

What could be writing on the disk that iotop does not show? # iotop -a Total DISK READ: 8.19 M/s | Total DISK WRITE: 3.34 M/s TID PRIO USER DISK READ DISK WRITE> SWAPIN IO ...
Marki's user avatar
  • 2,854
4 votes
7 answers
3k views

Remote Linux server monitoring

I'm looking for a solution to monitor multiple Linux servers remotely. I don't need a whole lot of granular data, just basic things like server load and critical error notifications. I'm no Linux guru,...
findzen's user avatar
  • 151
4 votes
2 answers
210 views

I am looking for a tool to measure or detect "unresponsiveness" of a desktop PC

I have a client that provides some server systems to a hospital, and a support ticket was raised that the desktop application was hanging waiting for the server. We did some extensive testing and its ...
Tom's user avatar
  • 11.4k
4 votes
1 answer
358 views

Find network percentage of NIC [closed]

So I have created a Linux resource monitoring tool that pulls various resource information. One of the fields I am trying to pull is the percent of network throughput on my NIC. So if I have a 1 Gb(...
IT_User's user avatar
  • 210
4 votes
2 answers
758 views

Is Collectd a good choice for gathering system metrics [closed]

I had some experience with collectd a year or so back. I remember being impressed by its speed and flexibility, however it was never adopted as the main source of collecting metrics, cron jobs running ...
Brent's user avatar
  • 65
4 votes
1 answer
1k views

How to monitor GlusterFS clients?

We are doing OK (we'd like to think) monitoring our GlusterFS servers via Icinga. We'd like to monitor the clients too. Other than making sure there is a glusterfs process running for each glusterfs-...
Mikhail T.'s user avatar
  • 2,411
4 votes
1 answer
216 views

logstash-forward equivalent for fluentd?

Is there something equivalent to logstash-forwarder that can ship logfiles to fluentd? I am trying to send log files from an application to a remote fluentd but have not seen whether this is ...
adamo's user avatar
  • 6,965
4 votes
3 answers
5k views

Hiding hosts in Nagios

I would like to monitor a few hundred hosts using Nagios, yet I only want the switch fabric to show up in the statusmap.cgi. Is there a way to prevent a host from showing up in the status map, yet ...
TheWellington's user avatar
3 votes
4 answers
2k views

Spawn phone call from EC2 alerts

I have a system set up on AWS/EC2; it currently uses their CloudWatch alert system. The problem is that this only sends email, when ideally I would like it to make a phone call and/or send ...
Matt's user avatar
  • 31
3 votes
1 answer
2k views

How to use monit to count the number of instances of a process

Is it possible to use monit to count the number of instances of a process (in my case Celery) and take an action accordingly? For example, if there are 4 instances of the celery daemon, then take action.
aqs's user avatar
  • 163
3 votes
2 answers
1k views

Reporting historical system activity in FreeBSD

I'd like to record data about system activity under FreeBSD for future analysis. If I were running a SysV system, I'd just use sar and its related utilities, but that doesn't exist in the BSDs. (And ...
wfaulk's user avatar
  • 6,968
3 votes
2 answers
7k views

How to watch a service with multiple processes with Monit?

I'm trying to watch the mailing list manager sympa with monit. A running sympa instance consists of multiple processes for the different tasks of list management (e.g. a separate process for archiving ...
morxa's user avatar
  • 193
3 votes
1 answer
9k views

Can't get Monit to check status of apache2

I am trying to configure Monit on my local machine to get a taste of how it works, but I have some issues. What I am trying to do is to get any evidence that Monit is up and running correctly and is ...
Andrea's user avatar
  • 133
3 votes
2 answers
8k views

Cannot read status of the monit daemon, even with allowed group

I cannot seem to get monit status or other CLI commands to work. I've built monit v5.8 to run on a Raspberry Pi. I'm able to add services to be monitored, and the web interface can be accessed just ...
jefflunt's user avatar
  • 300
3 votes
5 answers
9k views

Windows Services: How to schedule and monitor them?

We have about a dozen Windows Services, both in-house developed and third-party products, and have the following requirements for managing them: Start/Stop/Bounce a given service at scheduled times ...
Daniel Fortunov's user avatar
3 votes
1 answer
2k views

Icinga2 dependencies of devices on HA

I would like to configure a Host-to-Host dependency on Icinga2, however, one of the Hosts has an HA configuration, so I need it to trigger only when both HA devices are down. Suppose this scenario:...
lithiium's user avatar
  • 205
3 votes
0 answers
236 views

Prometheus not monitoring all EC2 instances of a region

I have set up Prometheus for the monitoring of my AWS EC2 instances, but the issue is that Prometheus is showing only 1 instance, while in my AWS account there are 2 instances running. ...
huzaifa224's user avatar
3 votes
2 answers
116 views

Incident reporting and logging

I am looking for a tool (or advice) that would allow me to track and log all incidents that happen on my infrastructure. We have a few servers (50+) and that number is going to increase in the future,...
Igor Hrcek's user avatar
3 votes
1 answer
2k views

Dynamically setting check_interval parameter based on Service_State in Icinga2

I have a requirement where the check interval is 180 mins while the notification interval is 10 mins. This means the service owner wants, if he misses any alert that usually comes after 180 mins if the service is critical ...
Manii's user avatar
  • 101
3 votes
0 answers
1k views

How to check what process or application is deleting a file without using Process Monitor? (Windows Server)

Currently I'm having an issue with a piece of software that makes use of specific files (which are basically xml), sometimes stored on a file share and sometimes stored locally. Every so often one of ...
Dan's user avatar
  • 31
3 votes
0 answers
1k views

System Center 2012 Alternatives [closed]

Are there any good alternatives to System Center 2012? I'm looking for a system platform that gives us a replacement for SCCM, SCOM, Endpoint Protection, DPM and Global Service Monitor and can ...
Corneliu's user avatar
  • 131
3 votes
0 answers
497 views

Munin disable dynazoom.html

Doing a quick Google search for "Munin dynazoom.html doesn't work" yields many results. There doesn't seem to be a solution that works -- at least not that I have seen. I have munin installed on a ...
Patrick's user avatar
  • 31
2 votes
2 answers
848 views

Delaying a Nagios/Icinga check

When monitoring the health of a server, some faults or warnings are immediately urgent but others only matter if they persist. I'm thinking of things like: Some software needs to be updated Time ...
Marcus Downing's user avatar
2 votes
2 answers
15k views

How to reset the admin password for Observium

How can I reset the password for the user admin with MySQL or an Observium script? MariaDB [observium]> select * from users; +---------+----------+------------------------------------+----------+---...
FaxMax's user avatar
  • 165
2 votes
2 answers
8k views

Nagios check_ssh returns usage information instead of status

I installed Nagios on an Ubuntu Desktop (Nagios server) and I want to monitor an Ubuntu server instance (monitored client). I can connect via SSH between both machines and SSH is not blocked. The nagios ...
Stefan's user avatar
  • 123
2 votes
1 answer
3k views

M/Monit how to see current disk space?

In the admin interface of M/Monit under Reports -> Analytics I can choose to show Space %. How can I make the Monit clients submit this info? Is there a way to display Disk Space percentage on the ...
kev's user avatar
  • 261
2 votes
2 answers
3k views

IBM x3500 Server management/monitoring tool

I took over monitoring an older IBM x3500 7977 server and I don't have much knowledge of IBM servers. I'm looking for the equivalent of Dell Server Administrator from IBM, just to monitor and alert on ...
Zero Subnet's user avatar
2 votes
2 answers
2k views

Nagios Basic Configuration (for quick addition of new machines)

I recently started to use Nagios to monitor about 25 servers (mainly virtual, with some standalone). The majority of the servers (including the Nagios host itself) are running Ubuntu 14.04 LTS, with ...
Harsha K's user avatar
  • 123
2 votes
2 answers
8k views

Nagios: turn off service checks/display on down hosts

I want to tweak Nagios in such a way that all checking stops (with services not displayed, or displayed as unknown) for any down node. Said differently, I only want to see one alert for a down host ...
Alien Life Form's user avatar
2 votes
2 answers
2k views

Record SSH / Terminal into video?

If someone accesses the server via PuTTY (SSH) or a terminal, I want to record everything they can see on the screen and everything they have typed into a video. What is the solution to this, is there ...
I'll-Be-Back's user avatar