4. Monitoring

SANET aims to provide the best possible support for network management and troubleshooting, together with flexible and customizable configuration mechanisms. In pursuit of these goals, SANET offers advanced features that make it a valuable tool for network management:

  • Maximum flexibility in check management. Every check can be customized through many parameters. In particular:

    • Frequency: usually the most important checks are executed at intervals of tens of seconds, while deeper (and more invasive) checks can be executed less frequently.
    • Tolerance: a check can be configured either to raise an alarm immediately, or to be logged at first and raise an alarm only if the failure persists for longer than a given time interval.
    • Notifications: each check failure can be notified via email and/or SMS. The subject and body of the emails, as well as the SMS body, can be customized separately for each check.
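
    The tolerance behaviour can be sketched as below. This is only an illustration under stated assumptions, not SANET code: the class name `TolerantCheck`, the parameter `tolerance_s` and the use of plain seconds for timestamps are all hypothetical.

```python
from typing import Optional


class TolerantCheck:
    """Hypothetical helper: alarm only when a failure has lasted long enough."""

    def __init__(self, tolerance_s: float):
        self.tolerance_s = tolerance_s              # 0 means "alarm immediately"
        self.failing_since: Optional[float] = None  # time of the first failure

    def update(self, ok: bool, now: float) -> bool:
        """Record one check result; return True when an alarm should be raised."""
        if ok:
            self.failing_since = None  # a success resets the failure window
            return False
        if self.failing_since is None:
            self.failing_since = now   # first failure: log it, start the clock
        # Assumption: the alarm fires once the failure has lasted tolerance_s.
        return now - self.failing_since >= self.tolerance_s
```

    With `tolerance_s=60`, failures seen at 0 s and 30 s would only be logged, while a failure still present at 60 s would raise the alarm.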
  • IPv4 and IPv6 reachability checks. SANET can check the reachability of a given host over both IPv4 and IPv6, with customizable packet size (usually set so as to produce a 1500-byte IP datagram). It can also measure RTT statistics (minimum, maximum, average) and packet loss, and plot these data in dedicated charts. If several IP addresses are bound to the same name, SANET checks the reachability of all of them by automatically reshuffling the address set.
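
    As a worked example of the figures mentioned above, the sketch below computes the ICMP echo payload that yields a datagram of a given size, and summarises a series of RTT samples. The function names are hypothetical, and the header sizes assume an IPv4 header without options and an IPv6 header without extension headers.

```python
from typing import Dict, Optional, Sequence

IPV4_HEADER = 20       # bytes, assuming no IP options
IPV6_HEADER = 40       # bytes, fixed header only
ICMP_ECHO_HEADER = 8   # bytes, same for ICMP and ICMPv6 echo


def echo_payload_size(datagram_size: int, ipv6: bool = False) -> int:
    """Payload bytes needed so that the whole IP datagram is datagram_size."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return datagram_size - ip_header - ICMP_ECHO_HEADER


def rtt_summary(samples: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarise ping results; a None sample marks a lost packet."""
    received = [s for s in samples if s is not None]
    loss = 1.0 - len(received) / len(samples) if samples else 0.0
    if not received:
        return {"min": None, "max": None, "avg": None, "loss": loss}
    return {
        "min": min(received),
        "max": max(received),
        "avg": sum(received) / len(received),
        "loss": loss,
    }
```

    For a 1500-byte datagram this gives a payload of 1472 bytes over IPv4 and 1452 bytes over IPv6.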

  • Dependencies among checks. The execution of a check can be bound to the successful state of other checks it depends on. In this case the check will not be performed if the checks it depends on are not successful. For example, if the router of a given site is unreachable, it makes no sense to check the switches and other appliances at that site, since they will clearly be unreachable too. Mechanisms like this improve performance, reduce the number of redundant notifications (especially useful for SMS alerts), and immediately show where the problem originates.
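
    The dependency rule can be illustrated with the following minimal sketch. It is not how SANET is implemented: `run_checks`, the `depends_on` mapping and the three result strings are hypothetical, and the dependency graph is assumed to contain no cycles.

```python
from typing import Callable, Dict, List


def run_checks(
    checks: Dict[str, Callable[[], bool]],
    depends_on: Dict[str, List[str]],
) -> Dict[str, str]:
    """Run every check once, skipping those whose dependencies are not "ok"."""
    results: Dict[str, str] = {}

    def run(name: str) -> str:
        if name in results:              # already evaluated in this round
            return results[name]
        for parent in depends_on.get(name, []):
            if run(parent) != "ok":      # parent failed or was itself skipped
                results[name] = "skipped"
                return "skipped"
        results[name] = "ok" if checks[name]() else "failed"
        return results[name]

    for name in checks:
        run(name)
    return results
```

    If the router check fails, a switch that depends on it and a printer that depends on the switch are both reported as skipped, so only one failure is left to notify.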

  • Flexible interface detection. People dealing with network management know how many problems can arise when trying to identify a node’s interfaces through the so-called ifIndex (that is, the instance number in the MIB-2 interfaces table, 1.3.6.1.2.1.2.2.1), since this number is not strictly bound to a physical interface but might change after a reboot, after a hardware change (insertion or removal of modules), or with the firmware version.

    Similar problems arise with other MIB branches: for example with the interfaces in the bridge MIB (1.3.6.1.2.1.17.2.15.1), with servers’ filesystems (hrStorage, 1.3.6.1.2.1.25.2.3.1), with the RAM on Cisco IOS appliances (1.3.6.1.4.1.9.9.48.1.1.1), with the running processes on a server (1.3.6.1.2.1.25.4.2.1), and in many other cases.

    SANET defines a flexible mechanism to detect instances in generic tables according to many different criteria, so that the resulting instance numbers can be used in checks and in quantitative measurements. It is therefore possible to monitor an interface by its name, its IP or MAC address, a substring of the IOS description, etc.

    The poller process performs the walk automatically to determine the correct instance and saves the walk results in a cache. This means that, in stationary conditions, instance numbers are kept up to date automatically and immediately without any significant increase in SNMP traffic.
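
    One way to picture the cached lookup is the sketch below. It is an assumption-laden illustration, not the poller's actual logic: `InstanceResolver`, the `walk` and `get` callables (standing in for an SNMP walk of one table column and an SNMP GET of one instance) and the idea of validating the cache with a single GET are all hypothetical.

```python
from typing import Callable, Dict, Optional


class InstanceResolver:
    """Hypothetical name-to-instance lookup that walks only when needed."""

    def __init__(
        self,
        walk: Callable[[], Dict[int, str]],      # returns {instance: value}
        get: Callable[[int], Optional[str]],     # returns value of one instance
    ):
        self._walk = walk
        self._get = get
        self._cache: Dict[str, int] = {}

    def resolve(self, name: str) -> Optional[int]:
        cached = self._cache.get(name)
        if cached is not None and self._get(cached) == name:
            return cached                # stationary case: one GET, no walk
        self._cache.pop(name, None)      # stale or missing entry
        for instance, value in self._walk().items():
            if value == name:
                self._cache[name] = instance
                return instance
        return None                      # no row matches the requested name
```

    In this sketch a full walk happens only on the first lookup or after the instance number has changed, for example following a reboot.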

  • Ping flap dampening. Some checks may oscillate continuously between different states. This can be caused by “almost working” links, partial hardware damage, etc.

    In such cases, traditional monitoring systems produce a long and annoying sequence of alerts and notifications (especially annoying in the case of SMS).

    SANET can suppress such notifications using an algorithm inspired by BGP route flap dampening: each check is given a score that increases at every state change and decreases exponentially over time (with a customizable half-life) when no state change happens. Two penalty thresholds can then be defined: a higher one to turn notifications off and a lower one to turn them back on. While a check’s notifications are turned off, the check continues to be performed and logged periodically; only email and SMS notifications are suspended.
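
    The scoring scheme described above can be sketched as follows. The class and parameter names (`FlapDampener`, `half_life_s`, `suppress_at`, `reuse_at`, `penalty_per_flap`) are hypothetical, and the penalty of 1.0 per state change is an assumption; SANET's actual values and code are not shown here.

```python
class FlapDampener:
    """Hypothetical sketch of penalty-based notification suppression."""

    def __init__(self, half_life_s: float, suppress_at: float,
                 reuse_at: float, penalty_per_flap: float = 1.0):
        assert reuse_at < suppress_at, "the reuse threshold must be the lower one"
        self.half_life_s = half_life_s
        self.suppress_at = suppress_at
        self.reuse_at = reuse_at
        self.penalty_per_flap = penalty_per_flap
        self.penalty = 0.0
        self.last_update = None
        self.suppressed = False

    def update(self, state_changed: bool, now: float) -> bool:
        """Record one check run; return True if notifications are allowed."""
        if self.last_update is not None:
            elapsed = now - self.last_update
            # Exponential decay: the penalty halves every half_life_s seconds.
            self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update = now
        if state_changed:
            self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_at:
            self.suppressed = True       # too many flaps: mute notifications
        elif self.penalty <= self.reuse_at:
            self.suppressed = False      # calm again: unmute
        return not self.suppressed
```

    Between the two thresholds the previous decision is kept, which is what prevents notifications from toggling on and off around a single limit.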

  • Check function utilities. Through the poller process, SANET provides many general-purpose utilities that can be combined with existing checks and can use present or past values of SNMP variables. For example, there is a function that checks whether an Ethernet interface is in full-duplex mode by trying the various standard and proprietary MIBs where this information might be found. Another function checks whether NTP servers are actually synchronized.

    Other functions can perform operations and evaluate aggregates on SNMP tables (e.g. to check the average CPU usage without specifying how many CPUs there are), check that a given TCP port is open, check that a given URL is available and matches (or does not match) a given pattern, build logical conditions from the results of other checks, etc.
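
    An aggregate over a table column might look like the sketch below. It is only an illustration: `column_average` is a hypothetical name, the walk result is modelled as a plain dictionary from OID string to value, and the column used in the usage note (hrProcessorLoad from the HOST-RESOURCES-MIB) is an example chosen here, not one named by SANET.

```python
from typing import Dict, Optional


def column_average(walk_result: Dict[str, float],
                   column_oid: str) -> Optional[float]:
    """Average all instances of one column, however many rows there are."""
    prefix = column_oid.rstrip(".") + "."
    values = [value for oid, value in walk_result.items()
              if oid.startswith(prefix)]
    return sum(values) / len(values) if values else None
```

    Given a walk containing `1.3.6.1.2.1.25.3.3.1.2.768 = 10` and `1.3.6.1.2.1.25.3.3.1.2.769 = 30`, averaging the column `1.3.6.1.2.1.25.3.3.1.2` yields 20.0 without the caller knowing that the host has two CPUs.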

  • Timetable management for checks and notifications. It is possible to define time slices for checks according to the local time, date, and weekday. This makes it possible to run checks that need to be performed only at specific times, for example to monitor offices that switch off their power supply when closing. It is also possible to choose notification recipients according to timetables, for example to notify the on-site staff when present, or the on-call technician otherwise.
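
    A time slice and the recipient choice based on it could be modelled as in this sketch. The names `TimeSlice` and `recipients_for` are hypothetical, weekdays follow Python's `datetime.weekday()` numbering (0 is Monday), and slices that cross midnight are deliberately not handled.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet


@dataclass(frozen=True)
class TimeSlice:
    """Hypothetical time slice: a set of weekdays plus a daily time window."""
    weekdays: FrozenSet[int]   # 0 = Monday ... 6 = Sunday
    start: time                # inclusive
    end: time                  # exclusive; must be later than start

    def contains(self, moment: datetime) -> bool:
        return (moment.weekday() in self.weekdays
                and self.start <= moment.time() < self.end)


def recipients_for(moment: datetime, office_hours: TimeSlice,
                   on_site: str, on_call: str) -> str:
    """Notify on-site staff during office hours, the on-call person otherwise."""
    return on_site if office_hours.contains(moment) else on_call
```

    The same `contains` test could decide whether a check should run at all, for instance skipping checks on an office that is powered off outside working hours.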
