What’s New in NSClient++ 0.5.0

The main goals of the 0.5.0 is t make NSClient++ easier to use and more future proof.

  • New and improved WebUI
  • Metrics support
  • Improved clients and better support for graphite et.al.
  • Fixed count in filters which was not working before
  • Added floating point number support to filters
  • Numerous bugfixes and minor enhancements

PLEASE NOTE

NRPE has been changed in 0.4.3, for details please see the 0.4.3 changelog.

New and improved WebUI

While I really loved the new WebUI in 0.4.3 it is much a proof of concept. I have in 0.5.0 reworked this to be much more feature complete and usable but more importantly (for me) better architected. The WebUI is now I think ready for prime time and will probably make configuring NSClient++ much simpler. A side effect of the new WebUI is the REST support which NSClient++ now has meaning you can now check/configure/* NSClient++ using a REST API.

Metrics

While metrics might seem like a dumbed down check command (as it lacks the checks) it is actually the foundation of there next generation of NSClient++. But for end-users the main goal right now is that Graphite and CollectD becomes much simpler. In the future it will also allow for checking compound metrics and other advanced things which is difficult to do today.

Improved clients

The client for instance check_nrpe was a bit buggy in 0.4.4 and 0.4.3. They have been rewritten from scratch and work much more smoothly now. They also work in many instance where before they did not for instance graphite now works.

Floating point support

Another small but important change is allowing floating point support in filters the only real benefit of this currently is that you can specify 2.3G as well as check_pdh where you can now check floating point counters. But as this required a major overhaul of the filter framework it has been long in the making and another important milestone.

Filtering on count

count has been a pain for many people as it accidentally broke in 0.4.3. The problem was that the filter evaluates count on every item this means count < 5 will always be true as the first time count is always 1. This has been fixed and in many cases it will just work. There is however a case when you need to change your filter and that is when you check BOTH count and a value. For instance the following will not work: “count < 5 and load > 20%”. The problem is that count is not available when we check load. Thus “the first core” ends up again in an impossible state count is unknown and once count is known the various cpu values are no longer available. To work around this you can split them up into multiple filters like so: “filter=count<5” “filter=load>20%”.

Breaking changes

  • check_eventlog: Event-log rendering failures are now returned as messages instead of NSClient++ log messages
  • check_eventlog: Change the default event-log command to set warnings for warning in event-log and crit for errors (instead of using count)
  • check_drivesize: default check only checks mounted drives
  • External Scripts: #238 Added killer so processes are killed when NSClient++ exits
  • check_tasksched: Renamed status to task_status as it clashes with regular status (Fixed #170)
  • check_service: removed trigger detection from default service state Fixes #249 I think this causes more confusion then good so for people who want this please do it manually instead… i.e. add “and is_trigger = 0”

Bugfixes enhancements

The full list of changes (sorted by context) can be found here:

Additions

  • New command: check_and_forward to run a check forward check results as passive data check_and_forward command=check_cpu target=NSCA
  • New command check_nscp_version to check the NSClient++ version
  • New experimental command: check_network
  • New command check_ping (CheckNet module)
  • New concept metrics
  • New Module and check: check_nscp to check nscp via REST

Check commands

filters and checks (global)
  • Added support for both g and G in size filters
  • Added total=all to have total object match all items regardless of if they match the filter or not.
  • Added escape-html flag to all check commands and escape html option to all real time filters.
  • Added multiple filter/warn/crit expressions to allow using count separate from items thresholds
  • Fixed #297 like not ignoring case when comparing strings in filters
  • filters: Fixed #292 Unicode support for filters
  • Moved filter object handling inside filter engine
  • Fixed #294: count not working properly when used together with a regular check
  • Fixed #280 parsing filters with large numbers
  • Improved help and documentation
  • Fixed #117 invalid perf syntax when specifying none
  • Fixed #201 negative perf data in checks
  • Fixed potential issue with time expressions which are empty
  • Fixed #114 issue with empty-state which was ignored

check_drivesize:

  • Added ignore-unreadable which was apparently removed at some point, new behavior as it uses the drive flags to determine status
  • Added support for check_drivesize exclude=c and exclude=d: to make backwards compatibility better
  • Added support for parsing floating point numbers in size units (i.e. “warn=free > 2.5G”)
  • Added ignore-unreadable which was apparently removed at some point, new behavior as it uses the drive flags to determine status
  • Possible fix for #265 Added drive flags to filter and distinguish between drive characteristics New flags for filters and syntax are: hotplug, removable, readable, writable and erasable
  • Fixed #123 CheckAllOthers
  • Improved errors and logging
  • Possible fix for #265 Added drive flags to filter and distinguish between drive characteristics

check_pdh:

  • Added float support to check_pdh
  • Fixed indexes expansion for check_pdh
  • Fixed reload of counters
  • Fixed some PDH issues and improved error reporting
  • Fixed #296 and #310 compatibility issue with checkCounter and type=double
  • Fixed #272 floating point performance data

check_eventlog:

  • Added task, keywords GUID support for eventlog filters
  • Fixed #204 Fixed message rendering for eventlog messages
  • Fixed #204 Added support for modern API to real time eventlogs
  • Fixed #204 Fixed int to string conversion filter issue
  • Improved eventlog cli to support listing tasks and keywords

check_log_files:

  • Added support for space and strings in column split/line split
  • Fixed #198 CheckLogFIle not working if files does not exist on startup
  • Fixed CheckLogFIle realtime

check_files:

  • Added support for showing file dates in local time (not just UTC)
  • Fixed #231 added filter on folder and file: check_files path=c:/source/py “filter=type = ‘file’” show-all “detail-syntax=%(path)/%(file): %(type)”
  • Fixed check_files empty message to say files not drives.

check_tasksched:

  • Fixed #274 Added has_run to check_tasksched
  • Fix for #215 added new option to check_tasksched to force using the old api check_tasksched force-old

check_uptime:

  • Fixed missing return data in check_uptime

check_service:

  • Added support for classifying service and filtering services based on classification
  • Fixed #131 Added support for service= to check_service
  • Removed trigger detection from default service state Fixes #249 I think this causes more confusion then good so foe people who want this please do it manually instead… i.e. add “and is_trigger = 0”
  • Fixed #130 missing perf data on check_service (intentionally no perf for “all services” as that is a long list)

check_process:

  • Added total to check_process

check_wmi:

  • improved error handling for wmi queries

check_always_ok:

  • Fixed missing error for check_always_ok when command failed

check_multi:

  • Fixed #301 check_multi does not parse qouted commands properly (possibly breaking change)

check_nscp:

  • Added SSl support and general fixes to check_nscp (check_remote_nscp)
  • Added preliminary REST client

check_ping:

  • Fixed #140 check_ping syntax string

scheduler:

  • Added support for cron statements: schedule = 47 * *
  • Fixed scheduler reload not resetting old schedules
  • Change so scheduler logs actual file sinstead of wrapper.

Scripts

External Scripts:

  • Added option capture output to external script to disable output capture and handled inheritance (Fixes #232)
  • Added -noprofile to powershell script wrapper (#207)
  • Added ability to run visual processes in the UI session. Two new keywords: display controls if the process is showed and session controls which process the session is run in.
  • Added support for %ARGS% as well as $ARGS$
  • Fixed #223 broken %ARGS% with the $ARGS$ fix
  • Fixed #207 Return error when powershell script not found
  • Fixed #196 missing version
  • Fixed #196 uppercase aliases not working
  • Improved security when external scripts fails command lines which may contain password are no longer returned

PythonScript:

  • Added target_mode flag to exec to diverge between targeted commands and generic ones.
  • Fixed sample python script cli
  • Fixed exec alias in python script cli
  • Fixed python to string (will log better when it cant convert to string)
  • Fixed so python script will unregister its commands on reload

bundled scripts:

  • Fixed sample timeout script (check_60s.bat)

Protocols

NRPE:

  • Added support for -a to check_nrpe command (Fixed #158 )
  • Added –port to nscp nrpe install
  • Fixed #261 Invalid return from check_nrpe
  • Fixed #258 issue with check_nrpe and ssl=false option

NSCA:

  • Fixed #315 Fixed invalid host_check for NSCA
  • Fixed #300 alias not used when sending NSCP results

NRDP:

  • Fixed #285 missing hostname for NRDP

check_nt:

  • Fixed #157 CDROM included in check_nt DISKUSAGE
  • Fixed some potential check_nt issues
  • Fixed #202 check_nt not working in 0.5.x
  • Fixed a few potential crashes with check_nt

graphite:

  • Added error when there is no data to send in graphite sender
  • Added (optional) sending status to graphite
  • Added flag to disable sending perf data to graphite
  • Fixed (back) formatting in graphite
  • Fixed default values in graphite
  • Fixed graphite paths for metrics
  • Fixed sending graphite data so it works :)
  • Fixed so source host names is set in NSCA and Graphite client

collectd:

  • Added brand new collectd client

Core

Various/Core:

  • Added exec from clients (i.e. web and test)
  • Added support for unregistring commands
  • Added restart to nscp service
  • Fixed #259 parsing extra space on command line
  • Fixed services showing twice in service list
  • Fixed pressing ctrl+c on command line
  • Fixed missing default section in settings
  • Improved error handling for channel failures
  • removed reading one line schedules
  • Fixed issue with reading invalid config values
  • Fixed inheritance and path issue with settings objects
  • Service wont restart if it is not started when crashed (ie. running in test mode for instance)
  • Fixed locale setting (and error logging for service related errors)
  • Fixed erratic segv in clients when socket is closed twice
  • Improved handling around connection failures
  • Improved the crash reporter syntax: reporter.exe send b1438ab2-20a3-4b2d-bc30-7c3033c084e1.dmp
  • Improved #275 so next settings system is tried if the first one failes
  • Fixed settings key issue
  • Fixed #132 round issue with performance data and %
  • Fixed compatibility issue when using both MaxWarn and MinWarn

Metrics:

  • Added network metrics
  • Added metrics submission and fetching to python scripts
  • Added test client command metrics to display all metrics
  • Added metrics to internal scheduler
  • Added thread count to scheduler metrics
  • Added metrics support to clients and added metrics sending to graphite client
  • Added metrics to scheduler

WebUI:

  • Added ugly but working filter list for metrics
  • Added filtering to query and module
  • Added feedback to loading modules as well as proper save
  • Added support for allowed hosts to WEBServer
  • Fixed 220: WebPage not loading in IE
  • Fixed option bug in WEBServer command line
  • Fixed metrics in WebUI
  • Fixed collection strategy value in web ui
  • Fixed metrics (with new py)
  • Fixed logout/login issue in webui
  • Fixed disk graphs in WebUI
  • Fixed some issues in setting dialog
  • WebUI added help to settings dialog (about tabs)
  • WEBUI: Changed to save menu is always shown and added auto save support
  • Fixed password command line for the web client (i.e. nscp web password)
  • Fixed command lines for password in WEB UI
  • Updated mongoose Fixing: Busy wait when socket failes

MSI installer:

  • (re)added CONF_SET installer key
  • Added support for setting arbitrary keys on command line

Linux:

  • Brand new much improved linux packages

API:

  • Started to cleanuup “return codes” in the API which has been all over the place before. This will most likely fix all NSCA/Scheduler issues
  • Removed encrypt from the API has it has not been implemented for some time

Whats new in 0.4.4