cache: drop search cache TTL from 30s to 2s for create-then-search freshness

Integration tests on the postgres-os-redis profile caught a real correctness
regression: tests that create an entity and Awaitility-poll for it to appear
in search timed out at 30s because our 30s search TTL pinned the
pre-create empty result for the entire test window. The same issue surfaces
in production: a user who creates a domain / table / dashboard and
immediately searches for it would see "no results" for up to 30s.

2s caps the staleness while still catching the dominant UI access pattern:
multiple components in the same render frame fire identical search queries.
Those happen within milliseconds, well inside any reasonable TTL.
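
A minimal sketch of the trade-off, not the project's actual cache code: the
class name TtlSearchCache, the injectable clock, and the backend function are
all hypothetical and exist only to make the staleness window concrete.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration only; not the cache implementation in this repo.
final class TtlSearchCache {
  private static final class Entry {
    final String value;
    final long expiresAtMillis;

    Entry(String value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<String, Entry> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock; // injectable so elapsed time can be simulated

  TtlSearchCache(int ttlSeconds, LongSupplier clock) {
    this.ttlMillis = ttlSeconds * 1000L;
    this.clock = clock;
  }

  // Returns the cached response while it is still fresh; otherwise queries the
  // backend and caches the new response. A TTL of 0 disables caching entirely.
  String get(String query, Function<String, String> backend) {
    long now = clock.getAsLong();
    Entry cached = entries.get(query);
    if (ttlMillis > 0 && cached != null && now < cached.expiresAtMillis) {
      return cached.value;
    }
    String fresh = backend.apply(query);
    if (ttlMillis > 0) {
      entries.put(query, new Entry(fresh, now + ttlMillis));
    }
    return fresh;
  }
}
```

In this sketch, a query repeated a few milliseconds later is served from the
cache under either TTL. If an entity is written and the clock advances 3s, an
instance built with a 30s TTL still returns the pre-write response, while one
built with a 2s TTL goes back to the backend.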

The longer-term fix is search-cache invalidation on entity writes (a
generation counter per entity-type, search keys include the generation,
writes bump the generation). That design is tracked in
.context/cache-improvements-design.md but deferred: the 2s TTL is good
enough for now, and the more complete invalidation strategy can be a
follow-up PR with its own dedicated tests.
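
A rough sketch of the generation-counter idea, assuming an in-process counter
per entity type. The class name SearchGenerations, its methods, and the key
layout are hypothetical; the design doc may specify something different (for
example, counters stored in Redis so that all servers share them).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of generation-based invalidation; not the deferred design itself.
final class SearchGenerations {
  private final Map<String, AtomicLong> generations = new ConcurrentHashMap<>();

  // Current generation for an entity type; starts at 0 the first time it is seen.
  long current(String entityType) {
    return generations.computeIfAbsent(entityType, t -> new AtomicLong()).get();
  }

  // Called on every write to an entity of this type. Keys built afterwards
  // carry the new generation, so older cached responses are never read again.
  void bump(String entityType) {
    generations.computeIfAbsent(entityType, t -> new AtomicLong()).incrementAndGet();
  }

  // Search cache key that embeds the generation (key layout is an assumption).
  String cacheKey(String entityType, String query) {
    return "search:" + entityType + ":g" + current(entityType) + ":" + query;
  }
}
```

With this shape, a write to one entity type changes only that type's keys.
Entries stored under the old generation are not deleted; they would simply go
unread until the TTL expires them.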

Tests that failed under the 30s TTL and that this change fixes:
  - DomainAssetsColumnExclusionIT (domain create-then-search)
  - LineageImpactAnalysisIT (owner removal reflected in search)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sriharsha Chintalapani 2026-05-09 23:21:38 -07:00
parent a6a6cb9ffd
commit 41489056ff

@@ -19,11 +19,15 @@ public class CacheConfig {
 public int relationshipTtlSeconds = 3600; // 1 hour
 public int tagTtlSeconds = 3600; // 1 hour
-// /api/v1/search/query response cache. Short TTL because search hits ES which usually
-// has its own request cache; 30s catches the typical "user types and re-searches the
-// same thing within a minute" pattern without serving badly stale results after writes.
-// Set to 0 to disable.
-public int searchTtlSeconds = 30;
+// /api/v1/search/query response cache. Very short TTL because search results must
+// reflect entity writes promptly: user creates X, searches for X, expects to find X.
+// The integration suite caught this with a 30s TTL: tests that create an entity and
+// wait for it in search timed out because the cache served the pre-create empty
+// result for the full 30s. 2s caps that staleness while still catching the typical
+// UI pattern where multiple components in the same render frame fire identical search
+// queries (those happen within milliseconds, well inside any reasonable TTL). Set to
+// 0 to disable.
+public int searchTtlSeconds = 2;
 // /api/v1/lineage/* response cache. Hybrid TTL + direct-invalidation strategy: a 60s TTL
 // backstops cases where a transitive change (an entity deep in the cached graph) wasn't