Re: Remote host up/down monitoring tool?



In article <spdti4-ad8.ln1@xxxxxxxxxxxxxxxxxx>,
Martin Gregorie <martin@xxxxxxxxxxxxxxxxxxx> wrote:
Tony Mountifield wrote:
I have a small number of boxes in different locations, and currently have
a fairly crude cron job running on each, which does a ping of one or more
of the other boxes, and if the ping fails, it emails me to say the other
box might be down. It then emails me again the next time the other box
appears to be up.
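
A minimal sketch of the kind of cron job described above, assuming a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds). The state-file location, the `notify` callback and the function names are hypothetical placeholders, not details of the original setup. The point is only the "mail once when it goes down, once when it comes back" logic:

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional, Tuple


def is_reachable(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if ping exits with status 0."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def transition(previous: Optional[str], reachable: bool) -> Tuple[str, Optional[str]]:
    """Return (new_state, message); message is None when nothing needs reporting."""
    current = "up" if reachable else "down"
    if previous == current:
        return current, None
    if previous is None and reachable:
        return current, None  # first run and the host is up: stay quiet
    return current, f"host appears to be {current}"


def check(host: str, state_file: Path, notify: Callable[[str], None],
          probe: Callable[[str], bool] = is_reachable) -> str:
    """One cron run: probe the host, remember the result, notify on a change."""
    previous = state_file.read_text().strip() if state_file.exists() else None
    current, message = transition(previous, probe(host))
    state_file.write_text(current)
    if message:
        notify(f"{host}: {message}")
    return current
```

In a real job `notify` would send the email; passing `probe` as a parameter just makes the logic testable without a network.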

Of course, this can't distinguish between the remote box really being down
and there being a network problem somewhere between the local and remote
boxes.

I've been mulling over the idea of a more sophisticated scheme, where
a number of boxes send each other messages, indicating not only their
presence, but which other boxes they believe to be up. Then if a box
goes down, the other boxes all see it has gone and agree that it really
is down. However, if there is instead a network outage or routing flap
so that a box is reachable from some places but not all, it might be
possible to distinguish this case.
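
The idea above can be sketched roughly as follows. This is only an illustration of the voting step, not an existing tool: the `reports` structure and the `classify` function are hypothetical, and it assumes each box has already collected the "who do you think is up" lists from whichever peers it can hear.

```python
def classify(target, reports):
    """Combine peer reports into a verdict about one host.

    reports maps an observer's name to the set of hosts that observer
    currently believes are up. Only observers whose reports actually
    arrived should be included. The target's own report is ignored.
    """
    votes = [target in seen
             for observer, seen in reports.items()
             if observer != target]
    if not votes:
        return "unknown"              # nobody else reported anything
    if all(votes):
        return "up"                   # every observer can reach it
    if not any(votes):
        return "down"                 # no observer can reach it
    return "partially reachable"      # observers disagree: likely a network fault


# Example: d has lost sight of c, but a and b still see it.
reports = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "d": {"a", "b"},
}
print(classify("c", reports))
```

Note that "down" here only means that none of the reporting boxes can reach the target; a failure of the target's own uplink would look the same from outside.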

So my question is: does anyone know of an existing tool that does this
sort of thing?

Take a look at SNMP (Simple Network Management Protocol). It may be
considerable overkill for what you want. It can keep tabs on hardware
and software status, free disk space, etc. It's been ported to Linux.
I've never used it - just know that it exists and has been ported - and
that it should be easy to set up to meet your requirements.

SNMP is a service that runs on individual devices (from switches through
routers to servers and fridges). SNMP doesn't monitor stuff as such, but
makes information available as a set of named (or numbered) variables for
external programs to read (such as Nagios with the right plugin, or one
of the multitude of free and non-free SNMP monitoring programs available).

Most SNMP-aware devices can additionally send a "trap" to a configured
receiver when certain conditions are met (temperature exceeded, etc.). Some
variables can be written to - e.g. there might be a "reboot in X seconds"
variable which you can write to and cause the device to reboot...

So you still need some sort of external software to probe each device (or
listen for traps) and query for various variables via the SNMP interface.

SNMP has been around for a long time, so there are several such monitoring
programs - some commercial with bells & whistles, some not so, some
GUI or web based, and others command-line based.

E.g. try this on a Linux box:

snmpget -v1 -c public localhost 1.3.6.1.2.1.2.2.1.11.1

You might need to be root. And this might not work if you've not enabled
SNMP. (If it does work, you'll get the number of unicast packets that
network interface 1 has received.)
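
If you want to call that from a script rather than by hand, here is a rough sketch that shells out to the same `snmpget` command and splits up its reply. It assumes the usual Net-SNMP output shape of `NAME = TYPE: VALUE` (for example `IF-MIB::ifInUcastPkts.1 = Counter32: 12345`); other output options or other SNMP tools will print something different. The helper names `parse_snmpget_line` and `snmp_get` are made up for this example.

```python
import subprocess


def parse_snmpget_line(line):
    """Split a 'NAME = TYPE: VALUE' reply into (name, type, value) strings."""
    name, _, rest = line.strip().partition(" = ")
    value_type, _, value = rest.partition(": ")
    return name, value_type, value


def snmp_get(host, oid, community="public"):
    """Run snmpget (SNMP v1) against host and return the parsed reply.

    Raises subprocess.CalledProcessError if snmpget exits non-zero,
    e.g. because the agent is not running or the community is wrong.
    """
    completed = subprocess.run(
        ["snmpget", "-v1", "-c", community, host, oid],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_snmpget_line(completed.stdout)


if __name__ == "__main__":
    # Same query as the command line above.
    print(snmp_get("localhost", "1.3.6.1.2.1.2.2.1.11.1"))
```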

And while the protocol might be simple, sometimes the actual management
and implementation of it isn't )-: (See all those dotted numbers
above?) The most trivial of applications to use it is probably MRTG -
for graphing network/router/switch interfaces, but different devices
can return different values for many parameters.
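
For graphing, what you usually want is a rate rather than the raw counter. A hedged sketch of that arithmetic (not MRTG's actual code; `counter_rate` is a name invented here): take two readings of a 32-bit counter a known number of seconds apart and allow for the counter wrapping past 2^32.

```python
COUNTER32_MODULUS = 2 ** 32  # an SNMP Counter32 wraps back to 0 after 2^32 - 1


def counter_rate(previous, current, interval_s):
    """Average per-second rate between two Counter32 readings.

    The modulo handles a single wrap between the two samples. It cannot
    tell a wrap from a device reboot (which resets the counter), nor
    detect more than one wrap in the interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (current - previous) % COUNTER32_MODULUS
    return delta / interval_s


# 3000 packets in 300 seconds
print(counter_rate(1000, 4000, 300))
# Counter wrapped between the samples: 296 up to the wrap, then 704 more
print(counter_rate(4294967000, 704, 100))
```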

E.g. a Cisco 2900 switch may hold over 4000 different "variables"
which you can pick through to your heart's content :)

Linux has had SNMP support for many years, enabling remote clients to
probe a Linux server for various parameters, but you do need to "tune"
it to your own needs - the configuration files are "interesting" :)

Gordon