cache: drop search cache TTL from 30s to 2s for create-then-search freshness

Integration tests on the postgres-os-redis profile caught a real correctness regression: tests that create an entity and Awaitility-poll for it to appear in search timed out at 30s because our 30s search TTL pinned the pre-create empty result for the entire test window. The same issue surfaces in production: a user who creates a domain / table / dashboard and immediately searches for it would see "no results" for up to 30s.

2s caps the staleness while still catching the dominant UI access pattern: multiple components in the same render frame fire identical search queries. Those happen within milliseconds, well inside any reasonable TTL.

The longer-term fix is search-cache invalidation on entity writes (a generation counter per entity-type, search keys include the generation, writes bump the generation). That's design-doc-tracked in .context/cache-improvements-design.md but deferred: the 2s TTL is good enough for now, and the more complete invalidation strategy can be a follow-up PR with its own dedicated tests.

Failing tests under 30s TTL that this fixes:
- DomainAssetsColumnExclusionIT (domain create-then-search)
- LineageImpactAnalysisIT (owner removal reflected in search)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
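
For reference, the deferred generation-counter idea could look roughly like the sketch below. This is not the design in .context/cache-improvements-design.md and not existing code: the class name, method names, and the string-based key format are all hypothetical, chosen only to illustrate "search keys include the generation, writes bump the generation".

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names exist in the codebase.
public class GenerationKeyedSearchCache {
  // One monotonically increasing generation per entity type.
  private final Map<String, AtomicLong> generations = new ConcurrentHashMap<>();
  // Cached search responses, keyed by entityType + generation + query.
  private final Map<String, String> responses = new ConcurrentHashMap<>();

  private AtomicLong counterFor(String entityType) {
    return generations.computeIfAbsent(entityType, t -> new AtomicLong());
  }

  // The current generation is part of the key, so entries written under an
  // older generation can no longer be looked up once the counter moves.
  private String key(String entityType, String query) {
    return entityType + ":" + counterFor(entityType).get() + ":" + query;
  }

  public Optional<String> get(String entityType, String query) {
    return Optional.ofNullable(responses.get(key(entityType, query)));
  }

  public void put(String entityType, String query, String response) {
    responses.put(key(entityType, query), response);
  }

  // Called on every create/update/delete of an entity of this type.
  public void onEntityWrite(String entityType) {
    counterFor(entityType).incrementAndGet();
  }
}
```

Two simplifications to keep in mind: entries from old generations are only orphaned here, not evicted, so a real version would still want a TTL or size bound to reclaim them; and `put` reads the generation at store time, so a write that lands between running the query and storing its response could file a stale result under the new generation. Capturing the generation before the query runs would close that gap.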
2026-05-24 09:39:11 +00:00 · 2026-05-09 23:21:38 -07:00 · 2026-05-09 23:21:38 -07:00 · 41489056ff
commit 41489056ff
parent a6a6cb9ffd
1 changed files with 9 additions and 5 deletions
--- a/openmetadata-service/src/main/java/org/openmetadata/service/cache/CacheConfig.java
+++ b/openmetadata-service/src/main/java/org/openmetadata/service/cache/CacheConfig.java
@@ -19,11 +19,15 @@ public class CacheConfig {
  public int relationshipTtlSeconds = 3600; // 1 hour
  public int tagTtlSeconds = 3600; // 1 hour

-  // /api/v1/search/query response cache. Short TTL because search hits ES which usually
-  // has its own request cache; 30s catches the typical "user types and re-searches the
-  // same thing within a minute" pattern without serving badly stale results after writes.
-  // Set to 0 to disable.
-  public int searchTtlSeconds = 30;
+  // /api/v1/search/query response cache. Very short TTL because search results must
+  // reflect entity writes promptly: user creates X → searches for X → expects to find X.
+  // The integration suite caught this with a 30s TTL: tests that create an entity and
+  // wait for it in search timed out because the cache served the pre-create empty
+  // result for the full 30s. 2s caps that staleness while still catching the typical
+  // UI pattern where multiple components in the same render frame fire identical search
+  // queries (those happen within milliseconds, well inside any reasonable TTL). Set to
+  // 0 to disable.
+  public int searchTtlSeconds = 2;

  // /api/v1/lineage/* response cache. Hybrid TTL + direct-invalidation strategy: a 60s TTL
  // backstops cases where a transitive change (an entity deep in the cached graph) wasn't