What's New in Version 0.0.6
Note: This is a beta release of NetSaint. Any bugs should be reported to the netsaint-devel mailing list or to me at netsaint@netsaint.org.
Here are some of the things that have been changed or added since the 0.0.5 release...
New Features
- Multiple Parent Hosts. You may now specify multiple parent hosts for each host definition. The order in which you specify parent hosts has no effect on how things are monitored. However, the statusmap and statuswrl CGIs will use the first parent host that you specify as the primary parent for purposes of drawing only.
- Passive Service Checks. Prior to version 0.0.6, the only way NetSaint could check the status of any service was to actively check it (i.e. perform the check itself). As of 0.0.6, NetSaint can accept service check results from external apps. External apps can submit service check results to NetSaint via the newly added PROCESS_SERVICE_CHECK_RESULT external command. NetSaint will treat and act upon passive service checks in the same way it does "normal" active checks. More information on how passive service checks work can be found here.
- Volatile Services. Service definitions have been extended to distinguish between normal services and newly added "volatile" services. Volatile services differ from normal services in that they get logged, generate a notification, and have an event handler run every time they are in a hard, non-OK state and the result of a service check shows the service to be in the same non-OK state. Volatile services are especially useful for monitoring asynchronous events like SNMP traps and security alerts. More information on how volatile services work can be found here.
- Notification Escalations. Two new types of definitions have been added to the host config file to support optional escalation of service and host notifications. The two new definitions are service escalations and hostgroup escalations. More information on how notification escalations work can be found here.
- Distributed Monitoring. NetSaint can now be configured to do distributed monitoring of your network. More gory details on how distributed monitoring works can be found here.
- Network Outages CGI. A new network outages CGI has been added to help pinpoint the cause of network outages (from the view of NetSaint). More information on how the new CGI works can be found here.
- Trends CGI. A new trends CGI has been added to allow you to view a graph of historical state data for any given host or service over an arbitrary period of time. In order to produce useful results, this CGI expects that you have enabled log rotation and are storing historical log files in the directory specified by the log_archive_path variable.
- Sorting In The Status CGI. This has been requested for some time now, and I finally got around to doing something about it. Service result entries in the status CGI (detail view) can be sorted by host name, service description, state, attempt number, and last check time. Sort orders can also be reversed. In order to sort the entries in the status CGI, click on the arrows located in the table headers.
- Audio Alerts In The Status CGI. If you want to get an audible notification of network problems in the status CGI, you can use the audio_alerts options in the CGI configuration file. You're able to specify different sounds to play for services that are in critical, warning, and unknown states, as well as hosts that are in unreachable or down states. If you configure audio files for multiple alert states, NetSaint will only play the sound that corresponds to the most critical problem.
- CGIs Now Use Stylesheets. Everyone has their own idea of how the CGIs should look. I've moved most of the formatting code in the CGIs out to stylesheets. Each CGI has its own stylesheet that you can modify as you like. You'll need to have at least a 3.0 browser to actually be able to use the stylesheets - the output looks fairly dull without any style (duh!). BTW, Netscape and IE look like they both have rather horrid support of stylesheets when it comes to tables. Netscape is probably worse than IE, but they both have their problems...
- State Retention During Restarts. Service and host status information can be preserved between program restarts. This is useful if there are pre-existing problems on your network (at the time NetSaint is restarted) and you don't want to receive initial notifications right away. This option will preserve state information, plugin output, last notification time, and state statistics for both hosts and services. In order to save state information between restarts you must enable the retain_state_information variable and specify a file in which to save the information by using the state_retention_file variable.
- Logging Of Initial States. Initial host and service states can be logged if you find the need to do so. This is useful if you are using an application that scans the log file to determine long-term state statistics for services and hosts. Normally, states are only logged when there is a problem or recovery. You can enable initial service and host state logging by using the log_initial_states option in the main config file.
- Acknowledgement of Problems. Users can now acknowledge host and service problems via the extinfo CGI. Acknowledgements can only be made after a host or service experiences a problem and at least one notification has been sent out. Upon making an acknowledgement of a problem, a comment will be added to the appropriate host or service, an acknowledgement notification is sent out, and future problem notifications will be temporarily disabled until the host or service changes state.
- Command Timeouts. Command timeouts can now be specified globally for service checks, host checks, event handlers, and notifications. Timeout values are controlled by the service_check_timeout, host_check_timeout, event_handler_timeout, and command_timeout options in the main config file.
- Macro Changes. This one is important. I've changed the $SERVICESTATE$ and $HOSTSTATE$ macros to reflect the actual state of the service or host during recoveries, instead of setting the macro equal to "RECOVERY". For service recoveries the $SERVICESTATE$ macro is set to "OK" and for host recoveries the $HOSTSTATE$ macro is set to "UP". This was an inconsistency which had been annoying me for a long time, so I decided to change it and be done with it. Make sure to modify any event handlers you have that use the state macros! Also, a new macro ($NOTIFICATIONTYPE$) has been introduced, which can be used to identify what type of notification is being sent out. Values for the macro include "PROBLEM", "RECOVERY", and "ACKNOWLEDGEMENT". The $SUMMARY$ macro has been removed - at some point it stopped working and I just decided to kill it off. The $OUTPUT$ macro can now be used in host notifications as well as service notifications. When the $OUTPUT$ macro is used in host notifications, it will contain the text returned from the host check command. Lastly, user macros ($USERn$) can now be defined in an optional resource file and can be used to hide information like usernames and passwords, or to store information commonly used in command definitions (like directory paths). More information on macros can be found here.
- Change In Location of CGI Config File. The CGIs now expect that the CGI config file (nscgi.cfg) resides in the same directory as your main and host config files (usually /usr/local/netsaint/etc). This was done to make things a bit more consistent and make it easier for creating RPMs.
- Developer Documentation. I've added a new section to the documentation for developers who want to interface third-party apps with NetSaint or exploit some of its internal capabilities (which are not yet available through the config files). Documentation is provided on the format of the various files that NetSaint uses, as well as internal functions which can be used to extend NetSaint's ability to read/save configuration information. I'll keep this information updated throughout the various releases of NetSaint. The developer documentation can be found here.
- Internal Overhauls. A lot of the internal code in the core program and CGIs has been overhauled. End users won't see a difference, but it makes the code easier to work with. Some of the changes that have been made include changing static buffers in the data structures to use dynamically allocated memory, an overhaul of the internal logging code, and shared data structures and functions between the core and CGIs.
- The Usual Bug Fixes. Would any new release ever be complete without bug fixes from previous versions?
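
As a rough illustration of the passive service checks described above, the following Python sketch builds a PROCESS_SERVICE_CHECK_RESULT line and appends it to the external command file. The line layout, the numeric return codes, the helper names, and the host and service names are all assumptions made for illustration and are not taken from these notes, so check the external command documentation for the exact format before relying on it.

```python
import time

# Assumed numeric return codes, following the usual plugin convention.
# Only the three states whose codes are not in doubt are listed here.
STATE_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def format_passive_result(host, service, state, output, timestamp=None):
    """Build one external command line for a passive service check result.

    The layout "[time] COMMAND;host;service;code;output" is an assumption.
    """
    if timestamp is None:
        timestamp = int(time.time())
    # Semicolons and newlines would break a delimited, line-based format.
    clean = output.replace(";", ",").replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, STATE_CODES[state], clean)


def submit(command_file, line):
    """Append the line to the external command file (path is site-specific)."""
    with open(command_file, "a") as handle:
        handle.write(line)
```

An external app (say, a trap receiver) would call format_passive_result() with the host name and service description exactly as they appear in the host config file, then pass the result to submit() along with the path of the external command file configured for your installation.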
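
Since the macro changes above can break existing event handlers, here is a minimal Python sketch of how a handler script might interpret the state macros after upgrading. It assumes a hypothetical command definition that passes $SERVICESTATE$ (or $HOSTSTATE$) and, optionally, $NOTIFICATIONTYPE$ to the script as plain strings; the function names are made up for illustration.

```python
# Values the state macros can carry on recovery. "OK" and "UP" are the 0.0.6
# values; "RECOVERY" is the pre-0.0.6 value, accepted here only to ease the
# transition for handlers shared between old and new installations.
RECOVERY_VALUES = {"OK", "UP", "RECOVERY"}


def is_recovery(state):
    """Return True if a $SERVICESTATE$ or $HOSTSTATE$ value means recovery."""
    return state.strip().upper() in RECOVERY_VALUES


def describe(state, notification_type=""):
    """Summarize an event for a hypothetical handler or notification script."""
    # $NOTIFICATIONTYPE$ is new in 0.0.6; prefer it when it was passed in.
    if notification_type == "ACKNOWLEDGEMENT":
        return "acknowledged (state %s)" % state
    if notification_type == "RECOVERY" or is_recovery(state):
        return "recovered (state %s)" % state
    return "problem (state %s)" % state
```

A handler that previously compared the state macro against "RECOVERY" alone would never see a recovery under 0.0.6, which is why the check above also accepts "OK" and "UP".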