Service Checks

Explanation of service checks on the Health tab

Services are checked by the monitoring daemon:

At regular intervals, as defined by the check_interval and retry_interval options in the service definitions.
On-demand as needed for predictive service dependency checks.
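
The way the two interval options could drive scheduling is sketched below. This is a simplified illustration, not the daemon's actual scheduler. It assumes that both intervals are expressed in minutes, that check_interval applies after an OK result, and that retry_interval applies after any non-OK result. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def next_check_time(last_check, last_state, check_interval_min, retry_interval_min):
    """Return when the next scheduled check is due.

    Assumption: the normal interval follows an OK result and the
    retry interval follows any non-OK result.
    """
    minutes = check_interval_min if last_state == "OK" else retry_interval_min
    return last_check + timedelta(minutes=minutes)


last = datetime(2024, 1, 1, 12, 0)
print(next_check_time(last, "OK", 5, 1))        # 2024-01-01 12:05:00
print(next_check_time(last, "CRITICAL", 5, 1))  # 2024-01-01 12:01:00
```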

On-demand checks are performed as part of the predictive service dependency check logic. These checks help ensure that the dependency logic is as accurate as possible. If the system does not use service dependencies, monitoring will not perform any on-demand service checks. For more information on service dependencies, refer to the Icinga documentation.

Parallelization of Service Checks

Scheduled service checks are run in parallel. When monitoring must run a scheduled service check, it initiates the service check and then returns to doing other work (running host checks, and so on). The service check runs in a child process that was forked from the main daemon. When the service check has completed, the child process will inform the main monitoring process (its parent) of the check results. The main monitoring process then handles the check results and takes appropriate action (running event handlers, sending notifications, and so on).
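
The start-then-collect pattern described above can be illustrated with a short Python sketch. It is not the daemon's implementation: the check commands are hypothetical stand-ins for real plug-ins, and the function names are invented for this example. Each check is started in its own child process, the parent continues without waiting, and the results are gathered afterward.

```python
import subprocess
import sys

# Hypothetical checks: each command is a stand-in for a real check plug-in.
CHECKS = {
    "web": [sys.executable, "-c", "print('HTTP OK'); raise SystemExit(0)"],
    "disk": [sys.executable, "-c", "print('DISK WARNING'); raise SystemExit(1)"],
}


def start_checks(checks):
    """Launch every check in its own child process without waiting."""
    return {
        name: subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for name, cmd in checks.items()
    }


def collect_results(children):
    """Wait for each child and return {name: (exit_code, output)}."""
    results = {}
    for name, child in children.items():
        output, _ = child.communicate()
        results[name] = (child.returncode, output.strip())
    return results


children = start_checks(CHECKS)
# The parent is free to do other work here while the children run.
results = collect_results(children)
for name, (code, output) in results.items():
    print(name, code, output)
```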

On-demand service checks are also run in parallel if needed. Monitoring can forgo running an on-demand service check if it can use the cached result of a relatively recent service check.
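
The caching behavior can be sketched as follows. This is a minimal illustration under stated assumptions: the text above does not say how recent a result must be, so the maximum age is a hypothetical parameter, and the class and function names are invented for this example.

```python
import time


class CheckCache:
    """Remember the most recent result for each service."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds  # hypothetical freshness limit
        self._results = {}  # service name -> (timestamp, state)

    def store(self, service, state, now=None):
        self._results[service] = (time.time() if now is None else now, state)

    def fresh_state(self, service, now=None):
        """Return the cached state if it is recent enough, else None."""
        entry = self._results.get(service)
        if entry is None:
            return None
        timestamp, state = entry
        now = time.time() if now is None else now
        return state if now - timestamp <= self.max_age_seconds else None


def on_demand_check(service, cache, run_check, now=None):
    """Use a fresh cached result if one exists; otherwise run the check."""
    cached = cache.fresh_state(service, now)
    if cached is not None:
        return cached
    state = run_check(service)
    cache.store(service, state, now)
    return state
```

With a 30-second limit, a result stored at time 100 would be reused at time 120 and replaced by a new check at time 140.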

Service States

A checked service can be in one of four states:

Table 1. Service States
State	Criteria
OK	The round-trip average (RTA) is less than 200 ms and the packet loss is less than 20%.
WARNING	The round-trip average (RTA) is greater than 200 ms or the packet loss is 20% or more.
CRITICAL	The round-trip average (RTA) is greater than 600 ms or the packet loss is 60% or more.
UNKNOWN	The state cannot be determined.
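
The criteria in Table 1 can be expressed as a small function. This sketch makes three assumptions that the table does not state: CRITICAL is tested first because its thresholds also satisfy the WARNING criteria, an RTA of exactly 200 ms is treated as WARNING, and a missing measurement is treated as UNKNOWN. The function name is hypothetical.

```python
def ping_state(rta_ms, packet_loss_pct):
    """Classify a ping result using the thresholds from Table 1."""
    # Assumption: None means the measurement could not be taken.
    if rta_ms is None or packet_loss_pct is None:
        return "UNKNOWN"
    # Test the most severe state first, because these values also
    # satisfy the WARNING criteria.
    if rta_ms > 600 or packet_loss_pct >= 60:
        return "CRITICAL"
    if rta_ms < 200 and packet_loss_pct < 20:
        return "OK"
    # Everything else, including an RTA of exactly 200 ms (assumption).
    return "WARNING"


print(ping_state(50, 0))     # OK
print(ping_state(250, 0))    # WARNING
print(ping_state(50, 60))    # CRITICAL
print(ping_state(None, 0))   # UNKNOWN
```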

Service State Determination

Service checks are performed by plug-ins, which can return a state of OK, WARNING, UNKNOWN, or CRITICAL. These states directly translate to service states.
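
Nagios-compatible plug-ins, which Icinga also uses, conventionally report their state through the process exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The sketch below assumes that the plug-ins used here follow that convention, which the text above does not confirm. The command is a hypothetical stand-in for a real plug-in, and treating any other exit code as UNKNOWN is a choice made for this example.

```python
import subprocess
import sys

# Conventional exit codes for Nagios-compatible plug-ins.
EXIT_CODE_TO_STATE = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(command):
    """Run a plug-in command and return (state, first-line output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Assumption: any exit code outside 0-3 is reported as UNKNOWN.
    state = EXIT_CODE_TO_STATE.get(completed.returncode, "UNKNOWN")
    return state, completed.stdout.strip()


# Hypothetical plug-in that reports a WARNING.
fake_plugin = [
    sys.executable,
    "-c",
    "print('PING WARNING - RTA = 250 ms'); raise SystemExit(1)",
]
print(run_plugin(fake_plugin))  # ('WARNING', 'PING WARNING - RTA = 250 ms')
```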

Service State Changes

When monitoring checks the status of services, it detects when a service changes between OK, WARNING, UNKNOWN, and CRITICAL states, and then takes appropriate action. These state changes result in different state types (HARD or SOFT), which can trigger event handlers to be run and notifications to be sent out. Service state changes can also trigger on-demand host checks. Detecting and dealing with state changes is what health monitoring is all about.
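
The text above does not define how a state becomes HARD or SOFT. The sketch below assumes a retry-count model based on the max_check_attempts option found in Icinga-style service definitions: a non-OK state stays SOFT until it has been seen on that many consecutive checks, and an OK result is always HARD. This is a deliberate simplification, and the class name is hypothetical.

```python
class ServiceStateTracker:
    """Track the state and state type of one service (simplified model)."""

    def __init__(self, max_check_attempts=3):
        self.max_check_attempts = max_check_attempts
        self.state = "OK"
        self.state_type = "HARD"
        self.non_ok_count = 0  # consecutive non-OK results

    def record(self, new_state):
        """Record a check result and return (state, state_type, changed)."""
        changed = new_state != self.state
        if new_state == "OK":
            # Assumption: a recovery is always a HARD state.
            self.non_ok_count = 0
            self.state_type = "HARD"
        else:
            self.non_ok_count += 1
            confirmed = self.non_ok_count >= self.max_check_attempts
            self.state_type = "HARD" if confirmed else "SOFT"
        self.state = new_state
        return new_state, self.state_type, changed


tracker = ServiceStateTracker(max_check_attempts=3)
for result in ["OK", "CRITICAL", "CRITICAL", "CRITICAL", "OK"]:
    print(tracker.record(result))
# ('OK', 'HARD', False)
# ('CRITICAL', 'SOFT', True)
# ('CRITICAL', 'SOFT', False)
# ('CRITICAL', 'HARD', False)
# ('OK', 'HARD', True)
```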

When services change state too frequently, they are considered to be flapping. Monitoring can detect when services start flapping and can suppress notifications until flapping stops and the service state stabilizes.
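
One way to detect flapping is to measure how often the state changed across recent checks and compare that figure with a pair of thresholds. The sketch below is a simplified, unweighted illustration rather than the detection logic monitoring actually uses. The threshold values and function names are hypothetical.

```python
def percent_state_change(history):
    """Return the share of consecutive checks whose state differed, in percent."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return 100.0 * changes / (len(history) - 1)


def update_flapping(is_flapping, history, high=50.0, low=25.0):
    """Return the new flapping flag for a service.

    Flapping starts at or above the high threshold and stops only once
    the change rate drops below the low threshold, so the flag itself
    does not flap. Both thresholds are hypothetical values.
    """
    pct = percent_state_change(history)
    if not is_flapping and pct >= high:
        return True
    if is_flapping and pct < low:
        return False
    return is_flapping


noisy = ["OK", "CRITICAL"] * 5
stable = ["OK"] * 10
print(percent_state_change(noisy))     # 100.0
print(update_flapping(False, noisy))   # True
print(update_flapping(True, stable))   # False
```

While the flag is set, notifications for the service would be held back until the flag clears.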