Redgate Monitor Native High Availability

Redgate Monitor can also be configured to have native High Availability for Base Monitors.

Overview

In a native High Availability setup, multiple Base Monitors are configured to use the same repository.

At any time, at most one Base Monitor is active. All others are passive: they do not write to the repository and do not respond to TCP health checks from load balancers. The active Base Monitor writes a heartbeat to the repository. If it fails to update its heartbeat within the configured election delay, another Base Monitor is elected active and takes over.
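The heartbeat and election behaviour described above can be illustrated with a small simulation. This is a minimal sketch, not Redgate Monitor's implementation: the `Repository` class, the `tick` function, and the monitor names "A" and "B" are all hypothetical, and the shared repository is replaced by an in-memory object with explicit timestamps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repository:
    """In-memory stand-in for the shared repository (hypothetical)."""
    active_id: Optional[str] = None
    last_heartbeat: float = 0.0


def tick(repo: Repository, monitor_id: str, now: float, election_delay: float) -> bool:
    """Run one heartbeat check; return True if monitor_id is active afterwards."""
    if repo.active_id == monitor_id:
        # The active Base Monitor refreshes its heartbeat.
        repo.last_heartbeat = now
        return True
    if repo.active_id is None or now - repo.last_heartbeat >= election_delay:
        # No active monitor, or its heartbeat is stale: take over.
        repo.active_id = monitor_id
        repo.last_heartbeat = now
        return True
    # A healthy active monitor exists, so stay passive.
    return False


repo = Repository()
first = tick(repo, "A", now=0.0, election_delay=30.0)   # A becomes active
second = tick(repo, "B", now=2.0, election_delay=30.0)  # B stays passive
# A stops heartbeating; B checks again once the election delay has elapsed.
third = tick(repo, "B", now=31.0, election_delay=30.0)  # B takes over
print(first, second, third, repo.active_id)  # True False True B
```

In a real deployment the check-and-take-over step would have to be atomic in the repository so that two passive Base Monitors cannot both elect themselves; the sketch ignores that concern.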

Installation

We currently recommend one of two configurations:

  1. Two or more Base Monitors configured to use the same repository behind a load balancer, with one Website.
  2. Two or more Base Monitor & Website pairs: the Websites sit behind a load balancer, each Website has its own Base Monitor, and all Base Monitors use the same repository.

This page covers the configuration steps for the Base Monitor installation. For Website High Availability configuration, refer to Installing Redgate Monitor with High Availability.

When installing two or more Base Monitors that point to the same repository, do not run the installers simultaneously, because doing so can corrupt the repository.

Website Configuration

  1. Ensure the environment variable SQLMONITOR_LicensingIdentity is set to a GUID value unique to each installation.
  2. Ensure the permit file redgate.permit generated for the above licensing identity is placed in the %programdata%\Red Gate\SQL Monitor\Permits\ folder. Follow Offline activation for details on how to generate this file.
  3. If your environment works in an offline mode, ensure that the environment variable SQLMONITOR_ForceOfflineLicensing is set to true.
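As a quick sanity check of the Website licensing variables, something like the following could be run on each Website host. It is only a sketch under stated assumptions: the `check_website_env` helper is hypothetical, the example GUID is made up, and treating the value `true` case-insensitively is an assumption rather than documented behaviour. It cannot verify that the GUID is unique across installations.

```python
import uuid
from typing import List, Mapping


def check_website_env(env: Mapping[str, str], offline: bool) -> List[str]:
    """Return a list of problems found in the Website licensing variables."""
    problems = []
    identity = env.get("SQLMONITOR_LicensingIdentity", "")
    try:
        uuid.UUID(identity)  # raises ValueError if the value is not a GUID
    except ValueError:
        problems.append("SQLMONITOR_LicensingIdentity is missing or not a GUID")
    # Assumption: the value is compared case-insensitively.
    if offline and env.get("SQLMONITOR_ForceOfflineLicensing", "").lower() != "true":
        problems.append("SQLMONITOR_ForceOfflineLicensing is not set to true")
    return problems


# Made-up example values for illustration only.
good = {
    "SQLMONITOR_LicensingIdentity": "3f2b8c1e-5d4a-4e6f-9a7b-0c1d2e3f4a5b",
    "SQLMONITOR_ForceOfflineLicensing": "true",
}
print(check_website_env(good, offline=True))  # []
print(check_website_env({}, offline=True))
```

To check the current machine, pass `os.environ` as the `env` argument (after `import os`).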

Base Monitor Configuration

  1. For each Base Monitor installation, ensure that the environment variable SQLMONITOR_HighAvailability is set to true.

  2. Set up a shared location for Encryption Keys and Top Query Search.
  3. Native high availability utilizes Windows Services restart functionality. In order for this to function correctly, you will need to configure your Base Monitor service recovery options. This can be done with the following PowerShell command run with administrator privileges:
# Run this on each Base Monitor machine. It configures the local service to restart after a failure,
# waiting 60 seconds before restarting (the actions delay is in milliseconds),
# and resets the failure count after one day (the reset parameter is in seconds).
# Call sc.exe explicitly: in Windows PowerShell, "sc" is an alias for Set-Content.
sc.exe failure MonitorBaseDeploymentService reset= 86400 actions= restart/60000

The parameters in the example command are the recommended values, but you can change them to suit your environment. Because the command configures only the local service, run it on each machine that hosts a Base Monitor.

Keep in mind that any Base Monitor configuration change that involves an environment variable, or a file in either the Redgate Monitor installation directory or under %ProgramData%\Red Gate\SQL Monitor, should be applied to all Base Monitor installations.

Redgate Monitor native High Availability does not support Active Directory via LDAP. If you need Active Directory, please configure it via OIDC instead. 

If you have multiple Websites in your setup, please restart all Website services after changing the configuration for the authentication type.

Secondary Base Monitors

If you are adding secondary Base Monitors behind a load balancer to a high availability setup with two or more Base Monitor & Website pairs, add the thumbprints of all Websites in the setup to the secondary Base Monitors' configuration files. You can find the instructions here. This allows the secondary Base Monitors to recognize every Website in the setup as an authorized client.

High Availability Configuration

The following attributes are configurable from the RedGate.SqlMonitor.Engine.Alerting.Base.Service.exe.settings.config file under %ProgramData%\Red Gate\SQL Monitor. Values shown in the example usage below are the default values. 

  • Heartbeat rate: The interval, in seconds, at which the Base Monitor performs heartbeat checks.
  • Election delay: The time, in seconds, a Base Monitor will wait before it elects itself as active. The election delay exists to prevent failing over during minor connectivity issues. If it is set too small, failovers may happen too often, and monitoring stops during each failover. If it is set too large, each failover leaves a longer monitoring gap.
  • Heartbeat SQL connection timeout: The time, in seconds, that the heartbeat SQL command is allowed to run before being terminated. 

Example usage:

<highAvailabilitySettings heartbeatRate="2" electionDelay="30" heartbeatSqlConnectionTimeout="5" />
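To see how the default values relate to each other, the element above can be parsed and inspected. This is a minimal sketch: it parses only the example element as a standalone string (how the element sits inside the full settings file is not shown here), the `read_ha_settings` helper is hypothetical, and it assumes all three attributes hold whole numbers.

```python
import xml.etree.ElementTree as ET

# The example element from this page, using the default values.
snippet = (
    '<highAvailabilitySettings heartbeatRate="2" '
    'electionDelay="30" heartbeatSqlConnectionTimeout="5" />'
)


def read_ha_settings(xml_text: str) -> dict:
    """Parse the element and return its attributes as integers."""
    element = ET.fromstring(xml_text)
    return {name: int(value) for name, value in element.attrib.items()}


settings = read_ha_settings(snippet)
# Number of heartbeat intervals that fit within one election delay.
intervals = settings["electionDelay"] // settings["heartbeatRate"]
print(settings)
print(intervals)  # 15
```

With the defaults, 15 heartbeat intervals fit within the election delay, which gives a rough sense of how long a connectivity problem can last before a failover is triggered.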
