AI SQL Tuner Recommendations

Tuning goal: Index Tuning

Online index ops available Compression available SQL Server 2022 IQP & PSP Auto stats create/update: ON

Executive summary

  1. Add a targeted nonclustered index on dbo.PostHistory(PostId) including PostHistoryTypeId and (optionally) CreationDate Missing index: very high cost/impact
    • Current usage shows dbo.PostHistory is scan-heavy (234 scans on clustered) with no supporting NCIs listed.
    • Directly supports multiple cached queries joining/filtering on PostHistory.PostId + PostHistoryTypeId and “latest by CreationDate”.
  2. Add a composite index on dbo.Votes(UserId, VoteTypeId) INCLUDE BountyAmount Missing index + scan-heavy table
    • dbo.Votes clustered index shows 558 scans and no seeks in the sample; multiple queries aggregate by UserId and filter by VoteTypeId.
    • Reduces CPU/reads on vote rollups and “recent votes” patterns.
  3. Add two high-value dbo.Posts indexes for common joins and date filters:
    • (OwnerUserId, CreationDate) INCLUDE common select/aggregate columns
    • (PostTypeId, CreationDate) INCLUDE Score, ViewCount, Title
    Plan cache: very high reads
    • dbo.Posts has 1095 scans on clustered; plan cache shows multi-million logical reads dominated by joins to Posts and date predicates.
  4. Fix “unindexable” tag searches: replace Posts.Tags LIKE '%'+T.TagName+'%' with a normalized tag mapping table (or persisted parsed tags) Major CPU/reads reduction
    • Leading-wildcard LIKE prevents efficient index usage and forces scans; several top queries do this.
  5. Query correctness fixes that also improve plans (several joins are logically wrong) High impact
    • Examples: JOIN PostDetails pd ON tu.UserId = pd.PostId; cp.AcceptedAnswerId = ups.UserId; PostLinksSummary pls ON ups.TotalPosts > 0. These produce huge, unnecessary row explosions.
Columnstore note: no “Large Tables (>1M rows)” were reported, so columnstore recommendations are intentionally omitted.

Environment & signals

Key signals from provided data

  • Missing indexes:
    • dbo.PostHistory on PostId INCLUDE PostHistoryTypeId (very high avg cost 53396, impact 99.52).
    • dbo.Votes on UserId INCLUDE VoteTypeId, BountyAmount (high cost/impact).
    • dbo.PostHistory on (CreationDate, UserId) INCLUDE Comment (moderate).
  • Index usage (read vs write): user_updates shown are 0 for all listed indexes (likely a read-only workload snapshot). This reduces concern about write amplification, but recommendations still minimize redundant overlap.
  • Plan cache: multiple queries average millions of logical reads; the worst shows ~11.5M reads and ~0.69s CPU per execution (avg), indicating scan-heavy joins and row explosions.

Current indexing gaps implied by usage

  • dbo.PostHistory and dbo.Votes are scan-dominant on their clustered PKs.
  • dbo.Posts clustered index is heavily scanned; supporting nonclustered indexes for common access patterns are not shown.
  • Users.DisplayName index shows far more scans than seeks; this typically indicates non-SARGable predicates or low selectivity / mismatched leading key usage (code changes recommended rather than dropping).

Detailed prioritized recommendations

1) Indexes: dbo.PostHistory

  1. Create NCI to support joins on PostId and filters on PostHistoryTypeId
    • Why: Matches missing index with extremely high cost/impact and aligns with cached query pattern: PostHistory joined by PostId with PostHistoryTypeId IN (...), plus “latest by CreationDate”.
    • Overhead: Adds storage + maintenance on PostHistory writes; justified given scan-heavy access and high estimated benefit. Keep keys minimal.
    • Design: Key (PostId, PostHistoryTypeId, CreationDate DESC); INCLUDE (UserId, Comment) to cover “latest edit comment” query without lookups.
    CREATE NONCLUSTERED INDEX IX_PostHistory_PostId_Type_CreationDate
    ON dbo.PostHistory (PostId, PostHistoryTypeId, CreationDate DESC)
    INCLUDE (UserId, Comment)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  2. Create NCI for “latest activity by User” lookups
    • Why: Plan cache includes: LEFT JOIN PostHistory PH ON T.UserId = PH.UserId AND PH.CreationDate = (SELECT MAX(... ) WHERE UserId = ...).
    • Design: Key (UserId, CreationDate DESC) INCLUDE (Comment, PostId, PostHistoryTypeId).
    • Overhead: Second index on same table; justified because this pattern is different from PostId-based access and otherwise forces scans/sorts.
    CREATE NONCLUSTERED INDEX IX_PostHistory_UserId_CreationDate
    ON dbo.PostHistory (UserId, CreationDate DESC)
    INCLUDE (PostId, PostHistoryTypeId, Comment)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  3. Do not implement the missing index “(CreationDate, UserId) INCLUDE (Comment)” as-is
    • Why: It overlaps poorly with the actual “MAX(CreationDate) per UserId/PostId” usage; the two indexes above more directly target the patterns and avoid a third partially redundant structure.

2) Indexes: dbo.Votes

  1. Create NCI for aggregations by UserId and filters by VoteTypeId
    • Why: Missing index recommendation + clustered index shows 558 scans and 0 seeks. Cached queries frequently do:
      • WHERE V.CreationDate >= ... GROUP BY V.UserId
      • SUM(CASE WHEN VoteTypeId = 2/3 THEN ...) grouped by user
    • Design: Key (UserId, VoteTypeId) INCLUDE (BountyAmount, PostId, CreationDate) to help both “recent votes” and joins from Votes to Posts.
    CREATE NONCLUSTERED INDEX IX_Votes_UserId_VoteTypeId
    ON dbo.Votes (UserId, VoteTypeId)
    INCLUDE (BountyAmount, PostId, CreationDate)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  2. Create NCI for vote rollups by PostId (common in plan cache)
    • Why: Multiple queries aggregate votes per post and filter by VoteTypeId (2,3,8) and join on V.PostId.
    • Design: Key (PostId, VoteTypeId) INCLUDE (UserId, BountyAmount).
    • Overhead: Adds a second Votes index; justified because it supports a different leading key than (UserId,...). If write-heavy in reality, evaluate keeping only one based on workload; given provided scans, both are likely beneficial.
    CREATE NONCLUSTERED INDEX IX_Votes_PostId_VoteTypeId
    ON dbo.Votes (PostId, VoteTypeId)
    INCLUDE (UserId, BountyAmount)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  3. Optional filtered index for bounty-specific access
    • Why: Several queries compute average/total bounty with VoteTypeId = 8. A filtered index can be much smaller and faster if VoteTypeId=8 is rare.
    • Tradeoff: Only helps VoteTypeId=8 queries; keep only if those are important.
    CREATE NONCLUSTERED INDEX IX_Votes_Bounties_PostId
    ON dbo.Votes (PostId)
    INCLUDE (BountyAmount, UserId, CreationDate)
    WHERE VoteTypeId = 8
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

3) Indexes: dbo.Posts

  1. Create NCI for user-owned posts with date filters and ranking
    • Why: Many cached queries group/filter by OwnerUserId and CreationDate (last 30 days / last year), and then sort by Score or aggregate ViewCount.
    • Design: Key (OwnerUserId, CreationDate); INCLUDE (Id, Score, ViewCount, PostTypeId, AcceptedAnswerId, Title) to reduce lookups.
    CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId_CreationDate
    ON dbo.Posts (OwnerUserId, CreationDate)
    INCLUDE (Id, PostTypeId, Score, ViewCount, AcceptedAnswerId, Title, LastActivityDate)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  2. Create NCI for questions/answers by type and date (top posts windows)
    • Why: Queries filter by PostTypeId and CreationDate and order/rank by Score/ViewCount.
    • Design: Key (PostTypeId, CreationDate); INCLUDE (Id, Score, ViewCount, Title, OwnerUserId).
    CREATE NONCLUSTERED INDEX IX_Posts_PostTypeId_CreationDate
    ON dbo.Posts (PostTypeId, CreationDate)
    INCLUDE (Id, Score, ViewCount, Title, OwnerUserId)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  3. Create NCI to support self-references and joins on AcceptedAnswerId / ParentId
    • Why: Cached queries reference AcceptedAnswerId and there are FKs on ParentId and AcceptedAnswerId. Without supporting indexes, SQL Server often scans Posts for these joins.
    • Design: Two narrow indexes.
    CREATE NONCLUSTERED INDEX IX_Posts_AcceptedAnswerId
    ON dbo.Posts (AcceptedAnswerId)
    INCLUDE (Id, OwnerUserId, PostTypeId, CreationDate)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
    
    CREATE NONCLUSTERED INDEX IX_Posts_ParentId
    ON dbo.Posts (ParentId)
    INCLUDE (Id, OwnerUserId, PostTypeId, CreationDate)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

4) Indexes: dbo.Users

  1. Add a covering index for “high reputation” pages
    • Why: Cached queries filter Users.Reputation > 1000 and order by Reputation DESC with OFFSET/FETCH. You already have IX_Users_Reputation</code with strong seek usage (135 seeks), so this is only recommended if that index does not include needed columns and plans show key lookups.
    • Design: If lookups occur, replace/extend with INCLUDE to cover common projections (DisplayName, CreationDate).
    /* If IX_Users_Reputation exists but causes key lookups, consider:
       (Use DROP/CREATE only if you confirm it's non-unique and safe to replace) */
    CREATE NONCLUSTERED INDEX IX_Users_Reputation_Cover
    ON dbo.Users (Reputation DESC)
    INCLUDE (Id, DisplayName, CreationDate)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  2. Keep IX_Users_DisplayName for now (do not drop based on this snapshot)
    • Why: 516 scans vs 9 seeks could indicate non-SARGable search patterns (e.g., LIKE '%...') rather than “useless index”. Dropping it may hurt other workloads.

5) Indexing strategy: Tags / Posts.Tags LIKE '%'+T.TagName+'%'

  1. Stop using leading-wildcard LIKE against Posts.Tags; normalize to a mapping table
    • Why: This pattern is inherently non-SARGable; indexes on Posts.Tags won’t be used efficiently. Several expensive cached queries do this, contributing to large scans/CPU.
    • Recommended design: Create dbo.PostTags(PostId, TagId) (or TagName) and populate from Posts.Tags. Then index (TagId, PostId) and (PostId, TagId).
    /* One-time schema addition */
    CREATE TABLE dbo.PostTags
    (
        PostId  int NOT NULL,
        TagId   int NOT NULL,
        CONSTRAINT PK_PostTags PRIMARY KEY CLUSTERED (PostId, TagId),
        CONSTRAINT FK_PostTags_Posts FOREIGN KEY (PostId) REFERENCES dbo.Posts(Id),
        CONSTRAINT FK_PostTags_Tags  FOREIGN KEY (TagId)  REFERENCES dbo.Tags(Id)
    );
    
    /* Supporting indexes for both access directions */
    CREATE NONCLUSTERED INDEX IX_PostTags_TagId_PostId
    ON dbo.PostTags (TagId, PostId)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);
  2. Do not add an index on Posts.Tags to “fix” LIKE
    • Why: Leading wildcard (%) defeats b-tree seeks; the index would add write/storage cost without solving the core problem.

6) Statistics

  1. Update statistics (fullscan where practical) after creating the new indexes
    • Why: Ensures the optimizer immediately benefits; also helps with large memory grants and spill behavior seen in the cache.
    /* Run in SQLStorm (off-peak). FULLSCAN can be heavy; adjust as needed. */
    UPDATE STATISTICS dbo.PostHistory WITH FULLSCAN;
    UPDATE STATISTICS dbo.Votes      WITH FULLSCAN;
    UPDATE STATISTICS dbo.Posts      WITH FULLSCAN;
    UPDATE STATISTICS dbo.Users      WITH FULLSCAN;
  2. Enable async stats updates only if compile stalls are observed
    • Why: Currently OFF. This is not an index change; leave as-is unless stats update blocking becomes a problem.

7) Query code changes (high ROI, reduces need for extra indexes)

  1. Fix logically incorrect joins that cause row explosions
    • Examples from plan cache:
      • JOIN PostDetails pd ON tu.UserId = pd.PostId (UserId matched to PostId) → should likely join on OwnerUserId or similar.
      • LEFT JOIN PostLinksSummary pls ON ups.TotalPosts > 0 → cartesian expansion; should join pls.PostId = p.Id (or remove and compute differently).
      • ClosedPosts cp ON cp.AcceptedAnswerId = ups.UserId where AcceptedAnswerId references Posts.Id, not Users → likely intended cp.OwnerUserId = ups.UserId (or join via Posts).
    • Impact: Can dwarf index gains; often turns multi-million reads into thousands.
  2. Replace “latest row” correlated subqueries with APPLY + ordered seek
    • Pattern: ... AND ph.CreationDate = (SELECT MAX(CreationDate) ...)
    • Rewrite: Use OUTER APPLY (SELECT TOP (1) ... ORDER BY CreationDate DESC) which pairs perfectly with the recommended descending indexes.
    /* Example rewrite for "latest edit comment per post" */
    OUTER APPLY (
        SELECT TOP (1) ph.Comment, ph.CreationDate
        FROM dbo.PostHistory ph
        WHERE ph.PostId = pd.PostId
          AND ph.PostHistoryTypeId IN (4,6,24)
        ORDER BY ph.CreationDate DESC
    ) AS phLast
  3. Avoid COUNT(DISTINCT ...) + multi-join fanout where possible
    • Why: Several queries join Users→Posts→Votes and then COUNT(DISTINCT P.Id). This often forces big hashes/sorts and large memory grants.
    • Approach: Pre-aggregate per table first (Posts per user, Votes per post/user) in separate CTEs/temp tables, then join the aggregated results.
  4. Turn tag logic into relational joins (after PostTags is created)
    • Rewrite: JOIN Posts P ON P.Tags LIKE ... becomes JOIN PostTags PT ON PT.TagId = T.Id JOIN Posts P ON P.Id = PT.PostId.

8) Drop/keep notes (overhead-aware)

  • No index drops recommended from the provided snapshot:
    • User_updates are 0 (not showing write pressure), and existing indexes generally have some seeks/scans.
    • Dropping based on limited usage samples is risky; better to validate over a representative window.
  • No PK rebuild/drop recommended; therefore no FK drop/recreate scripts are needed.

All scripts (copy/paste)

Indexes: dbo.PostHistory

/* 1) PostHistory: PostId + Type + latest CreationDate */
CREATE NONCLUSTERED INDEX IX_PostHistory_PostId_Type_CreationDate
ON dbo.PostHistory (PostId, PostHistoryTypeId, CreationDate DESC)
INCLUDE (UserId, Comment)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

/* 2) PostHistory: latest activity per UserId */
CREATE NONCLUSTERED INDEX IX_PostHistory_UserId_CreationDate
ON dbo.PostHistory (UserId, CreationDate DESC)
INCLUDE (PostId, PostHistoryTypeId, Comment)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

Indexes: dbo.Votes

/* 3) Votes: rollups by user + type */
CREATE NONCLUSTERED INDEX IX_Votes_UserId_VoteTypeId
ON dbo.Votes (UserId, VoteTypeId)
INCLUDE (BountyAmount, PostId, CreationDate)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

/* 4) Votes: rollups by post + type */
CREATE NONCLUSTERED INDEX IX_Votes_PostId_VoteTypeId
ON dbo.Votes (PostId, VoteTypeId)
INCLUDE (UserId, BountyAmount)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

/* 5) Optional: filtered bounty index (VoteTypeId=8) */
CREATE NONCLUSTERED INDEX IX_Votes_Bounties_PostId
ON dbo.Votes (PostId)
INCLUDE (BountyAmount, UserId, CreationDate)
WHERE VoteTypeId = 8
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

Indexes: dbo.Posts

/* 6) Posts: owner + date window queries */
CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId_CreationDate
ON dbo.Posts (OwnerUserId, CreationDate)
INCLUDE (Id, PostTypeId, Score, ViewCount, AcceptedAnswerId, Title, LastActivityDate)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

/* 7) Posts: type + date window queries */
CREATE NONCLUSTERED INDEX IX_Posts_PostTypeId_CreationDate
ON dbo.Posts (PostTypeId, CreationDate)
INCLUDE (Id, Score, ViewCount, Title, OwnerUserId)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

/* 8) Posts: self-reference joins */
CREATE NONCLUSTERED INDEX IX_Posts_AcceptedAnswerId
ON dbo.Posts (AcceptedAnswerId)
INCLUDE (Id, OwnerUserId, PostTypeId, CreationDate)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

CREATE NONCLUSTERED INDEX IX_Posts_ParentId
ON dbo.Posts (ParentId)
INCLUDE (Id, OwnerUserId, PostTypeId, CreationDate)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

Indexes: dbo.Users (optional cover)

/* Optional: only if you observe key lookups on Users.Reputation queries */
CREATE NONCLUSTERED INDEX IX_Users_Reputation_Cover
ON dbo.Users (Reputation DESC)
INCLUDE (Id, DisplayName, CreationDate)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

Schema change for tag normalization (high ROI)

/* Recommended long-term fix for Posts.Tags LIKE '%...%' */
CREATE TABLE dbo.PostTags
(
    PostId  int NOT NULL,
    TagId   int NOT NULL,
    CONSTRAINT PK_PostTags PRIMARY KEY CLUSTERED (PostId, TagId),
    CONSTRAINT FK_PostTags_Posts FOREIGN KEY (PostId) REFERENCES dbo.Posts(Id),
    CONSTRAINT FK_PostTags_Tags  FOREIGN KEY (TagId)  REFERENCES dbo.Tags(Id)
);

CREATE NONCLUSTERED INDEX IX_PostTags_TagId_PostId
ON dbo.PostTags (TagId, PostId)
WITH (ONLINE = ON, DATA_COMPRESSION = PAGE, FILLFACTOR = 95);

Statistics updates (post-deploy)

UPDATE STATISTICS dbo.PostHistory WITH FULLSCAN;
UPDATE STATISTICS dbo.Votes      WITH FULLSCAN;
UPDATE STATISTICS dbo.Posts      WITH FULLSCAN;
UPDATE STATISTICS dbo.Users      WITH FULLSCAN;

Code pattern rewrite example (latest row)

/* Replace correlated MAX() with APPLY + TOP(1) to use IX_PostHistory_PostId_Type_CreationDate */
OUTER APPLY (
    SELECT TOP (1) ph.Comment, ph.CreationDate
    FROM dbo.PostHistory ph
    WHERE ph.PostId = pd.PostId
      AND ph.PostHistoryTypeId IN (4,6,24)
    ORDER BY ph.CreationDate DESC
) AS phLast;

Confidence

Recommendation area Confidence Rationale (brief)
PostHistory indexes (PostId/Type/CreationDate + UserId/CreationDate) High Direct match to missing index + multiple plan-cache “latest history” patterns; current access is scan-heavy.
Votes indexes (UserId/VoteTypeId and PostId/VoteTypeId) High Missing index + strong evidence of scans and frequent aggregation/join predicates.
Posts indexes (OwnerUserId/CreationDate, PostTypeId/CreationDate, AcceptedAnswerId, ParentId) Medium-High Plan cache shows Posts-centric workloads with date filters; usage shows clustered scans; exact column coverage may vary by query.
Normalize Tags (PostTags) High Leading-wildcard LIKE is fundamentally unindexable; normalization is the standard durable fix.
Users reputation cover index (optional) Medium Depends on whether current IX_Users_Reputation already covers projections; no lookup stats provided.