Improving performance of SQL Agent job history sampling
Published 28 March 2024
Redgate Monitor samples agent job history once a minute by querying msdb.dbo.sysjobhistory. On instances where this table has grown very large, that query can run for a long time and put significant pressure on tempdb.
Maintenance of msdb.dbo.sysjobhistory is the DBA's responsibility
msdb.dbo.sysjobhistory is a system table managed by SQL Server Agent (Microsoft Learn: dbo.sysjobhistory). Keeping it at a reasonable size is part of standard SQL Server administration:
- Set a sensible row cap on the SQL Agent's history retention. See Microsoft's guidance: Resize the Job History Log (right-click SQL Server Agent → Properties → History, or use
sp_set_sqlagent_properties). - Run
sp_purge_jobhistoryon a schedule for instances where the row cap alone isn't enough.
If sysjobhistory already contains millions of rows, applying retention/purge is the most effective fix. The indexes below are a complementary measure for instances where DBAs must keep large amounts of history.
Recommended indexes (for instances with large sysjobhistory)
The two indexes below have been validated by customers running with very large job-history tables. They are non-filtered, so they don't interfere with SQL Agent writing new history rows.
SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO
CREATE NONCLUSTERED INDEX [rg_sysjobhistory_jobid_stepid_instanceid]
ON [msdb].[dbo].[sysjobhistory] (
[job_id] ASC,
[step_id] ASC,
[instance_id] DESC
);
GO
SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO
CREATE NONCLUSTERED INDEX [rg_sysjobhistory_jobid_instanceid_status]
ON [msdb].[dbo].[sysjobhistory] (
[job_id] ASC,
[instance_id] ASC
)
INCLUDE (
[step_id],
[step_name],
[message],
[run_status]
);
GO
Redgate Monitor does not use index hints, so you may rename these indexes if your shop has a naming convention.
Older single-index recommendation
Earlier versions of this page recommended a single filtered index:
SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO
CREATE NONCLUSTERED INDEX [rg_job_history]
ON [msdb].[dbo].[sysjobhistory] (
[job_id] ASC,
[instance_id] ASC,
[step_id] ASC
)
WHERE ([step_id] = (0));
GOThis index helps in many environments but, on instances with very large sysjobhistory, it doesn't fully address the second nested-loop scan in the sampler's query. We recommend the two non-filtered indexes above instead. Filtered indexes on sysjobhistory have, in some environments, been associated with SQL Agent failing to write new history rows; if you see agent-logging issues after deploying any filtered index here, drop it.
Why these indexes help
The job-history sampler runs a query that, for each recent overall job step, looks up the failed steps belonging to that run. On large tables, the optimizer can choose a scan against sysjobhistory for that lookup, which is what causes the multi-hour runtime and tempdb pressure. The two indexes above give the optimizer the access paths it needs to seek instead of scan.
Warning in Redgate Monitor
If msdb.dbo.sysjobhistory exceeds approximately 1 million rows, Redgate Monitor logs a warning on the monitored server's log page linking back to this page. The warning is informational, it doesn't disable any sampling and clears once you bring the row count down.
Customizing the threshold
The default warning threshold is 1,000,000 rows. To override it, set a key-value pair in the Redgate Monitor repository database. The setting applies to every monitored SQL Server on this Base Monitor — there is no per-server override.
SQL Server repository:
-- First time: insert INSERT INTO settings.KeyValuePairs (KeyName, KeyValue, KeyValueType) VALUES (N'JobHistory-RowCountWarningThreshold', N'5000000', N'NUMERIC'); -- Already set: update UPDATE settings.KeyValuePairs SET KeyValue = N'5000000' WHERE KeyName = N'JobHistory-RowCountWarningThreshold';
PostgreSQL/ TimescaleDB repository:
INSERT INTO settings.key_value_pairs (key_name, key_value, key_value_type)
VALUES ('JobHistory-RowCountWarningThreshold', '5000000', 'NUMERIC')
ON CONFLICT (key_name) DO UPDATE SET key_value = EXCLUDED.key_value;Restart the Base Monitor service for the change to take effect. If you've already implemented the recommended indexes and are happy with the current size, you can silence the warning by setting the threshold above your current sysjobhistory row count. Use SELECT COUNT_BIG(*) FROM msdb.dbo.sysjobhistory; to find that.