Redgate Monitor 14

Improving performance of SQL Agent job history sampling

Redgate Monitor samples agent job history once a minute by querying msdb.dbo.sysjobhistory. On instances where this table has grown very large, that query can run for a long time and put significant pressure on tempdb.

Maintenance of msdb.dbo.sysjobhistory is the DBA's responsibility

msdb.dbo.sysjobhistory is a system table managed by SQL Server Agent (Microsoft Learn: dbo.sysjobhistory). Keeping it at a reasonable size is part of standard SQL Server administration:

  • Set a sensible row cap on the SQL Agent's history retention. See Microsoft's guidance: Resize the Job History Log (right-click SQL Server Agent → Properties → History, or use sp_set_sqlagent_properties).
  • Run sp_purge_jobhistory on a schedule for instances where the row cap alone isn't enough.

If sysjobhistory already contains millions of rows, applying retention/purge is the most effective fix. The indexes below are a complementary measure for instances where DBAs must keep large amounts of history.

Recommended indexes (for instances with large sysjobhistory)

The two indexes below have been validated by customers running with very large job-history tables. They are non-filtered, so they don't interfere with SQL Agent writing new history rows.

SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO
CREATE NONCLUSTERED INDEX [rg_sysjobhistory_jobid_stepid_instanceid]
ON [msdb].[dbo].[sysjobhistory] (
                                    [job_id] ASC,
                                    [step_id] ASC,
                                    [instance_id] DESC
                                );
GO

SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO
CREATE NONCLUSTERED INDEX [rg_sysjobhistory_jobid_instanceid_status]
ON [msdb].[dbo].[sysjobhistory] (
                                    [job_id] ASC,
                                    [instance_id] ASC
                                )
INCLUDE (
            [step_id],
            [step_name],
            [message],
            [run_status]
        );
GO



Redgate Monitor does not use index hints, so you may rename these indexes if your shop has a naming convention.

Older single-index recommendation

Earlier versions of this page recommended a single filtered index:

SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO
CREATE NONCLUSTERED INDEX [rg_job_history]
ON [msdb].[dbo].[sysjobhistory] (
                                    [job_id] ASC,
                                    [instance_id] ASC,
                                    [step_id] ASC
                                )
WHERE ([step_id] = (0));
GO



This index helps in many environments but, on instances with very large sysjobhistory, it doesn't fully address the second nested-loop scan in the sampler's query. We recommend the two non-filtered indexes above instead. Filtered indexes on sysjobhistory have, in some environments, been associated with SQL Agent failing to write new history rows; if you see agent-logging issues after deploying any filtered index here, drop it.

Why these indexes help

The job-history sampler runs a query that, for each recent overall job step, looks up the failed steps belonging to that run. On large tables, the optimizer can choose a scan against sysjobhistory for that lookup, which is what causes the multi-hour runtime and tempdb pressure. The two indexes above give the optimizer the access paths it needs to seek instead of scan.

Warning in Redgate Monitor

If msdb.dbo.sysjobhistory exceeds approximately 1 million rows, Redgate Monitor logs a warning on the monitored server's log page linking back to this page. The warning is informational, it doesn't disable any sampling  and clears once you bring the row count down.

Customizing the threshold

The default warning threshold is 1,000,000 rows. To override it, set a key-value pair in the Redgate Monitor repository database. The setting applies to every monitored SQL Server on this Base Monitor — there is no per-server override.

SQL Server repository:

-- First time: insert
INSERT INTO settings.KeyValuePairs (KeyName, KeyValue, KeyValueType)
VALUES (N'JobHistory-RowCountWarningThreshold', N'5000000', N'NUMERIC');

-- Already set: update
UPDATE settings.KeyValuePairs
SET    KeyValue = N'5000000'
WHERE  KeyName  = N'JobHistory-RowCountWarningThreshold';

PostgreSQL/ TimescaleDB repository: 

INSERT INTO settings.key_value_pairs (key_name, key_value, key_value_type)
VALUES ('JobHistory-RowCountWarningThreshold', '5000000', 'NUMERIC')
ON CONFLICT (key_name) DO UPDATE SET key_value = EXCLUDED.key_value;

Restart the Base Monitor service for the change to take effect. If you've already implemented the recommended indexes and are happy with the current size, you can silence the warning by setting the threshold above your current sysjobhistory row count. Use SELECT COUNT_BIG(*) FROM msdb.dbo.sysjobhistory; to find that. 


Didn't find what you were looking for?