OpenMetadata/bootstrap/sql/migrations/native
sonika-shah 17c3b8b9b2
fix(profiler): N+1 / missing-index regression on /tables/.../columns?fields=profile (#3488) (#27746)
* fix(profiler): N+1 / missing-index regression on /tables/.../columns?fields=profile (#3488)

Root cause
----------
The 1.9.9 migration introduced two separate index regressions on
`profiler_data_time_series`:

1. **PostgreSQL**: `schemaChanges.sql` explicitly dropped the unique
   constraint `profiler_data_time_series_unique_hash_extension_ts`
   (entityFQNHash, extension, operation, timestamp) to allow altering the
   generated `operation` column expression, but never recreated it.  After
   the migration the table kept only the `(extension, timestamp)` index,
   which is useless for queries filtering by `entityFQNHash`.

2. **MySQL and PostgreSQL**: `postDataMigrationSQLScript.sql` created temporary indexes
   (idx_pdts_entityFQNHash, idx_pdts_composite, etc.) for its bulk UPDATE
   pass and then dropped **all** of them, including the only index covering
   `entityFQNHash`.

The batch query issued by `getLatestExtensionsBatch()` when
`fields=profile` is requested:

  SELECT entityFQNHash, MAX(timestamp) FROM profiler_data_time_series
  WHERE entityFQNHash IN (...N hashes...) AND extension = 'table.columnProfile'
  GROUP BY entityFQNHash

requires an `(entityFQNHash, extension, timestamp)` index.  Without it the
database performs a full table scan.  On production deployments with
millions of profiler rows this caused 100+ second response times (Grafana:
106 770 ms; 99 % in DB; 93 dbOps).  Without `profile` in the fields param
the same endpoint returned in ~150-220 ms.

A secondary N+1 bug existed independently of the index: when `customMetrics`
was included in the fields param, the code called
`getCustomMetrics(table, column)` once per paginated column, issuing up to N
identical queries against `entity_extension` and then filtering in Java.

Fix
---
* **migration 2.0.2** (MySQL + PostgreSQL): `CREATE INDEX IF NOT EXISTS
  idx_pdts_fqnhash_ext_ts ON profiler_data_time_series(entityFQNHash,
  extension, timestamp)`.  The `IF NOT EXISTS` guard makes the migration
  safe to re-run and handles both upgrade and fresh-install paths.

* **`getTableColumnsInternal`** — `customMetrics` block: fetch all column
  custom metrics for the table in one query, group by column name in Java,
  then distribute.  Reduces N queries to 1.

* **`getTableColumnsInternal`** — `profile` block: skip the duplicate
  `populateEntityFieldTags` call when `tags` was already fetched earlier in
  the same request, saving one prefix-scan on `tag_usage` per request.
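
The batch-then-group pattern described for the `customMetrics` block can be
sketched as below. This is a minimal illustration, not the code in
`TableRepository`: the `MetricRow` type, the `groupByColumn` helper, and the
sample metric names are hypothetical stand-ins, and the list passed in plays
the role of the rows returned by the single table-wide query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchGroupSketch {
  // Hypothetical stand-in for one custom-metric row; not the OpenMetadata type.
  static final class MetricRow {
    final String columnName;
    final String metricName;

    MetricRow(String columnName, String metricName) {
      this.columnName = columnName;
      this.metricName = metricName;
    }
  }

  // Groups rows that were fetched in ONE query by the column they belong to.
  static Map<String, List<MetricRow>> groupByColumn(List<MetricRow> rows) {
    Map<String, List<MetricRow>> byColumn = new LinkedHashMap<>();
    for (MetricRow row : rows) {
      byColumn.computeIfAbsent(row.columnName, k -> new ArrayList<>()).add(row);
    }
    return byColumn;
  }

  public static void main(String[] args) {
    // Pretend this list is the result of the single batch query for the table.
    List<MetricRow> rows = List.of(
        new MetricRow("id", "nullRatio"),
        new MetricRow("name", "distinctCount"),
        new MetricRow("id", "maxLength"));
    Map<String, List<MetricRow>> byColumn = groupByColumn(rows);

    // Distribute to each paginated column without issuing further queries.
    for (String column : List.of("id", "name", "email")) {
      int count = byColumn.getOrDefault(column, List.of()).size();
      System.out.println(column + " -> " + count); // id -> 2, name -> 1, email -> 0
    }
  }
}
```

The query count stays at one regardless of page size; only the in-memory
grouping grows with the number of rows returned.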

Related: PR #26855 (fixed N+1 tag queries on the list-tables path but left
the profiler-index and customMetrics N+1 untouched on the columns sub-path).

* fix(profiler): restore unique constraint on profiler_data_time_series + batch column extension/customMetrics fetch

Move the migration from 2.0.2/ to 1.12.8/ and switch from a non-unique
covering index to restoring the original unique constraint dropped in
1.9.9. The two-phase CREATE UNIQUE INDEX CONCURRENTLY + ADD CONSTRAINT
USING INDEX pattern avoids holding an ACCESS EXCLUSIVE lock on the hot
profiler_data_time_series table while the index is built. This closes the
1.9.9 regression and brings Postgres back in line with MySQL (which never
lost the constraint). The leading (entityFQNHash, extension) prefix serves
the column-profile batch query, the same shape MySQL has been running
without 504s. MySQL needs no migration.
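
As a rough illustration of the two-phase pattern, the sketch below holds the
two PostgreSQL statements as Java strings. The constraint name and column list
are the ones quoted in the root-cause notes; the index name
`pdts_unique_hash_extension_ts_idx` and the helper class are assumptions made
for this example, and the statements in the actual 1.12.x migration file may
differ.

```java
import java.util.List;

class UniqueConstraintMigrationSketch {
  // Hypothetical index name, chosen only for this illustration.
  static final String INDEX = "pdts_unique_hash_extension_ts_idx";
  // Constraint name as quoted in the 1.9.9 root-cause description.
  static final String CONSTRAINT = "profiler_data_time_series_unique_hash_extension_ts";

  static List<String> statements() {
    return List.of(
        // Phase 1: build the unique index without blocking writes.
        // PostgreSQL does not allow this statement inside a transaction block.
        "CREATE UNIQUE INDEX CONCURRENTLY " + INDEX
            + " ON profiler_data_time_series"
            + " (entityFQNHash, extension, operation, timestamp)",
        // Phase 2: attach the already-built index as the constraint.
        // This step is short because no index build happens here.
        "ALTER TABLE profiler_data_time_series ADD CONSTRAINT " + CONSTRAINT
            + " UNIQUE USING INDEX " + INDEX);
  }

  public static void main(String[] args) {
    for (String sql : statements()) {
      System.out.println(sql + ";");
    }
  }
}
```

When the constraint name differs from the index name, PostgreSQL renames the
index to match the constraint, so the temporary index name does not survive
the second step.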

On the Java side, this eliminates two more N+1 patterns that compound the
latency at customer scale:

* getTableColumnsInternal extension block: replaced per-column
  getColumnExtension() loop with a single getExtensionsByJsonSchema()
  call, grouped by column FQN-hash in Java.
* searchTableColumnsInternal customMetrics block: applied the same
  batch-fetch pattern already used in getTableColumnsInternal, replacing
  per-column getCustomMetrics() with one getExtensions() call.

New DAO method on EntityExtensionDAO:
  getExtensionsByJsonSchema(id, jsonSchema) — selects extensions for a
  table id filtered by the jsonschema discriminator. Required because
  column extensions are stored with MD5-hashed extension keys and have
  no shared prefix the existing getExtensions(id, prefix) could use.

* chore(profiler): address review feedback — empty-list literal + accurate test comments

* Replace `new ArrayList<>()` default in `metricsByColumn.getOrDefault(...)`
  with `List.of()` at both call sites in `TableRepository` (getTableColumnsInternal
  and searchTableColumnsInternal). `getOrDefault` evaluates its default eagerly,
  so the new ArrayList allocates per-column even when the key is found —
  unnecessary work on a hot path.

* Reword two stale test comments in `test_getColumnsWithProfileField_correctnessAndNoBatchRegression`:
  - "all four field combinations" → "the three field combinations exercised below"
  - "(c) duplicate populateEntityFieldTags must not run twice" → describe the
    observable contract the assertions actually verify (tags + profile both
    present), not the internal call count.
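
The eager-evaluation point behind the `List.of()` change can be demonstrated
with a small standalone program. The class, the counter, and the sample keys
are hypothetical; only `Map.getOrDefault` and `List.of` are real JDK calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GetOrDefaultSketch {
  // Counts how many default lists were built.
  static int allocations = 0;

  static List<String> freshList() {
    allocations++;
    return new ArrayList<>();
  }

  // Old shape: the default argument is evaluated before getOrDefault runs.
  static List<String> lookupWithFreshDefault(Map<String, List<String>> map, String key) {
    return map.getOrDefault(key, freshList());
  }

  // New shape: List.of() yields an immutable empty list and needs no ArrayList.
  static List<String> lookupWithEmptyLiteral(Map<String, List<String>> map, String key) {
    return map.getOrDefault(key, List.of());
  }

  public static void main(String[] args) {
    Map<String, List<String>> metricsByColumn = new HashMap<>();
    metricsByColumn.put("id", List.of("nullRatio"));

    // The key is present, yet a default list is still allocated.
    System.out.println(lookupWithFreshDefault(metricsByColumn, "id")
        + " allocations=" + allocations); // [nullRatio] allocations=1

    // No ArrayList is created on either the hit or the miss path.
    System.out.println(lookupWithEmptyLiteral(metricsByColumn, "email")
        + " allocations=" + allocations); // [] allocations=1
  }
}
```

One trade-off to keep in mind: the list returned by `List.of()` is immutable,
so this substitution is only safe where callers read the default and never
add to it.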

* fix(profiler): force outer index scan in getLatestExtensionsBatch by pushing IN list to the join

The getLatestExtensionsBatch query was the right shape for correctness but
the planner — on Postgres at customer scale, with the new unique constraint
in place — was still choosing a parallel sequential scan over the full
profiler_data_time_series table for the outer side of the JOIN, rather than
a merge join with index scan on both sides.

Inner subquery: filtered by `entityFQNHash IN (...)`, used the index.
Outer: only filtered by `p.extension = :extension`, no IN list, planner
couldn't infer the transitive constraint that p.entityFQNHash must equal
one of the inner hashes (because it's enforced through the JOIN ON clause,
not a WHERE predicate). Result: full table scan reading 6.7M+ rows even
when the actual answer is 23 rows.

Adding the redundant `AND p.entityFQNHash IN (<entityFQNHashes>)` to the
outer WHERE makes the constraint explicit. The result set is unchanged
(implied by the join condition), but the planner can now use the unique
index for the outer access too.

Verified on the AUT dump (6.94M-row pdts):
  EXPLAIN of the batch query: 7,234ms → 79ms (Hash Join + Parallel Seq
  Scan → Merge Join + Index Only Scan).
  Live API /columns?fields=profile&include=all: 6-36 seconds → 22-28ms
  (warm) / 1.9s (very first call). 250-1000x improvement, depending on
  cache state.

Same SQL works on both engines; no @ConnectionAwareSqlQuery split needed.
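
To make the change concrete, the sketch below assembles a query of the shape
described above, with and without the redundant outer predicate. The inner
subquery follows the batch query quoted in the root-cause notes; the outer
SELECT list, the `latest` alias, the `ts` column alias, and the builder class
are assumptions for illustration and are not copied from the DAO.

```java
class LatestBatchQuerySketch {
  // Builds illustrative SQL text only; nothing is executed here.
  static String buildQuery(boolean pushInListToOuter) {
    // Inner side: already filtered by the IN list, so it can use the index.
    String inner =
        "SELECT entityFQNHash, MAX(timestamp) AS ts FROM profiler_data_time_series"
            + " WHERE entityFQNHash IN (<entityFQNHashes>) AND extension = :extension"
            + " GROUP BY entityFQNHash";
    // Outer side (assumed shape): joined back to pick the latest row per hash.
    String sql =
        "SELECT p.entityFQNHash, p.json FROM profiler_data_time_series p"
            + " JOIN (" + inner + ") latest"
            + " ON p.entityFQNHash = latest.entityFQNHash AND p.timestamp = latest.ts"
            + " WHERE p.extension = :extension";
    if (pushInListToOuter) {
      // Redundant for correctness, but states the constraint as a WHERE predicate.
      sql += " AND p.entityFQNHash IN (<entityFQNHashes>)";
    }
    return sql;
  }

  public static void main(String[] args) {
    System.out.println("before: " + buildQuery(false));
    System.out.println("after:  " + buildQuery(true));
  }
}
```

Both variants return the same rows; the only difference is whether the outer
scan has an explicit predicate on `entityFQNHash` that the planner can match
against the leading column of the unique index.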

* test(profiler): shorten classification/tag fixture names in IT to fit varchar(256)

The IT fixture for test_getColumnsWithProfileField_correctnessAndNoBatchRegression
was building a tagFQN of `<classification>.<tag>` where each part went through
TestNamespace.prefix(). With the descriptive method name (62 chars) + class
name (15 chars) + namespace UUID (32 chars) plus the `profile_test_cls` /
`profile_test_tag` base names (16 chars each), the resulting tagFQN was 263
characters — over the tag_usage.tagFQN VARCHAR(256) limit:

  ERROR: value too long for type character varying(256)

Shorten the fixture base names from `profile_test_cls`/`profile_test_tag` to
`cls`/`tag`. The namespace prefix already encodes test isolation (class +
method + UUID), so the base name doesn't need to repeat that context.

New tagFQN length: 237 chars (cls__<32>__TableResourceIT__<62>.tag__<32>__TableResourceIT__<62>),
comfortably under 256.
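
The length arithmetic can be checked with a few lines of Java. The component
lengths are the ones stated above, and the `__` separators are taken from the
layout shown in the new-length line; the class and method names are
hypothetical.

```java
class TagFqnLengthSketch {
  static final int SEPARATOR = 2;    // "__" between the parts of a prefixed name
  static final int UUID = 32;        // namespace UUID
  static final int CLASS_NAME = 15;  // "TableResourceIT"
  static final int METHOD_NAME = 62; // test method name length as stated above
  static final int LIMIT = 256;      // tag_usage.tagFQN VARCHAR(256)

  // Length of one prefixed name: <base>__<uuid>__<class>__<method>
  static int partLength(int baseNameLength) {
    return baseNameLength + SEPARATOR + UUID + SEPARATOR + CLASS_NAME + SEPARATOR + METHOD_NAME;
  }

  // tagFQN is <classification>.<tag>, so add one character for the dot.
  static int tagFqnLength(int classificationBase, int tagBase) {
    return partLength(classificationBase) + 1 + partLength(tagBase);
  }

  public static void main(String[] args) {
    int before = tagFqnLength(16, 16); // profile_test_cls / profile_test_tag
    int after = tagFqnLength(3, 3);    // cls / tag
    System.out.println("before=" + before + " fits=" + (before <= LIMIT)); // before=263 fits=false
    System.out.println("after=" + after + " fits=" + (after <= LIMIT));    // after=237 fits=true
  }
}
```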

* fix(table): include extensionKey in column-extension deserialize warn log

Addresses gitar-bot review on PR #27746: the warning log on failed column-
extension deserialization only had table.getId(), so operators could not
pinpoint which row was bad. Add record.extensionName() (the entity_extension
row key) to the log. No extra iteration - record is already in scope inside
the catch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(migration): move profiler unique-constraint migration to 1.12.9

1.12.8 was already published with the PII classification fix from #27910.
Move the profiler_data_time_series unique-constraint restore (this PR's
postgres migration) to 1.12.9 so customers upgrading past the published
1.12.8 still pick it up.

Add a MySQL placeholder schemaChanges.sql for 1.12.9 consistent with the
1.12.7 convention — MySQL was unaffected by the 1.9.9 regression
(MODIFY COLUMN re-evaluates generated expressions in place without
touching the constraint, so MySQL still has the constraint from 1.1.5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(table): extract batchFetchCustomMetricsByColumn helper

Addresses PR #27746 Copilot review:
- Dedupe custom-metric batch logic between getTableColumnsInternal and
  searchTableColumnsInternal.
- Reword IT inline comment to reflect what the test actually validates
  (completes within timeout + correct profiles) instead of claiming it
  inspects query plans.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-05-14 06:56:39 -07:00
1.1.0 Fix postgres migration files (#12923) 2023-08-18 14:54:43 +02:00
1.1.1 Fix postgres migration files (#12923) 2023-08-18 14:54:43 +02:00
1.1.2 Issue 8930 - Update profiler timestamp from seconds to milliseconds (#12948) 2023-08-25 08:47:16 +02:00
1.1.5 only add collation to hash columns (#13201) 2023-09-15 12:49:11 +05:30
1.1.6 Add 1.1.6 migrations dir (#13305) 2023-09-22 09:45:00 +02:00
1.1.7 Prep v1.1.7 migrations to address test cases & suites (#13345) 2023-09-27 11:49:21 +02:00
1.2.0 Migration Fixes (#16131) 2024-05-07 22:07:25 +05:30
1.2.1 fix: comment in sql migration (#13979) 2023-11-15 10:32:11 +01:00
1.2.3 Minor: Fix migration location for unity catalog (#14339) 2024-01-03 18:26:11 +05:30
1.2.4 Fix #13982: Fix userFQN encoding while creating mentions (#14496) 2023-12-25 17:28:13 -08:00
1.3.0 Migration Fixes (#16131) 2024-05-07 22:07:25 +05:30
1.3.1 fix: move migration to 1.3.1 (#15463) 2024-03-05 15:30:43 +01:00
1.3.2 Remove SQls from 1.3.2 (#15917) 2024-04-16 18:51:03 +05:30
1.3.3 Move migration for apps to 1.3.3 all together (#15944) 2024-04-18 14:26:05 +05:30
1.4.0 ISSUE #2681 - Add Missing test parameters in PSQL (#25323) 2026-01-16 12:09:15 +01:00
1.4.2 Fix Test Suite Filter (#16615) 2024-06-12 10:40:05 +05:30
1.4.4 Fix #16788: Optimize feed query performance issues introduced in 1.4.2 (#16862) 2024-07-01 19:58:47 -07:00
1.4.5 MINOR - Clean automations_workflow in 1.4.5 (#17006) 2024-07-12 13:54:46 +02:00
1.4.6 Move Migration to 1.4.6 (#17095) 2024-07-19 12:16:53 +05:30
1.4.7 Migrate NameHash (#17317) 2024-08-06 18:41:37 +05:30
1.5.0 Improve count/feed api performance for 1.5 (#17576) 2024-08-23 11:20:34 -07:00
1.5.6 [Search] Indexing Fixes (#18048) 2024-09-30 23:39:27 +05:30
1.5.7 migration: fix duplicate param key insertion (#20802) 2025-04-15 14:10:51 +02:00
1.5.9 MINOR - Move appName migration to 1.5.9 (#18435) 2024-10-28 16:29:56 +01:00
1.5.11 Fix Search Index Contention (#18605) 2024-11-12 20:36:23 +05:30
1.5.15 Domain Policy Update to be non-system (#19060) 2024-12-15 01:18:12 +05:30
1.6.0 Feat# Implementation of Custom Workflows (#23023) 2025-10-08 18:57:44 +05:30
1.6.2 Improvement #19065 : Support removing existing enumKeys (for enum type custom property) (#19054) 2025-01-07 19:25:59 -08:00
1.6.3 Cleanup App data (#19571) 2025-01-28 19:22:33 +05:30
1.6.7 MINOR: chore: added missing timestamp indexes for time series tables (#20373) 2025-03-24 07:43:07 +01:00
1.7.0 Add cleanup apps_extension_time_series (#20857) 2025-04-16 14:54:11 +05:30
1.7.1 Escape ? to causing issues in jdbi binding (#21381) 2025-05-23 17:13:45 +05:30
1.7.2 FIX - Automation Workflows should not be updated by the SM & cleanup migration (#21435) 2025-06-03 12:17:14 +02:00
1.7.4 Disabled bot creating activity feeds (#21773) 2025-06-14 19:21:00 +05:30
1.8.0 Add Data Contracts Specification and APIs (#21164) 2025-06-04 06:36:28 +02:00
1.8.1 Fix #20621: User Status Tracking in the System (#21911) 2025-07-02 14:37:36 -07:00
1.8.2 Fix #20145: Implemented Prefix For Dashboard Service (#21585) 2025-07-08 18:54:35 +02:00
1.8.4 MINOR - Add columns.description in search settings (#22299) 2025-07-15 09:21:57 +02:00
1.8.5 Added missing migration sql files [1.8.5 and 1.10.2] (#24399) 2025-11-18 08:02:35 +01:00
1.8.7 Feature: Security Service (#22450) 2025-07-31 06:38:21 +02:00
1.8.8 Feature: Security Service (#22450) 2025-07-31 06:38:21 +02:00
1.8.9 Feature: Security Service (#22450) 2025-07-31 06:38:21 +02:00
1.9.0 MINOR - Add Tests & fix migrations (#22714) 2025-08-03 15:19:54 +02:00
1.9.2 Add missing domain migrations for entity version history (#23032) 2025-08-21 14:33:37 +05:30
1.9.5 MINOR - Move migrations to 1.9.5 (#23095) 2025-08-28 09:23:21 +02:00
1.9.6 ISSUE #1534 - Profiler Refactor for Metadata Extraction Application (#23200) 2025-09-05 13:07:04 +02:00
1.9.9 Minor fix broken 1.9.8 migrations (#23487) 2025-09-22 13:13:25 +00:00
1.9.10 Fixes #23356: Databricks & UnityCatalog OAuth and Azure AD Auth (#23561) 2025-10-03 19:53:19 +05:30
1.9.11 add entityType.keyword aggregation in searchSettings.json (#23559) 2025-09-25 17:04:49 +05:30
1.10.0 Move migrations to 1.11.x (#24074) 2025-10-30 01:02:45 +05:30
1.10.2 Added missing migration sql files [1.8.5 and 1.10.2] (#24399) 2025-11-18 08:02:35 +01:00
1.10.3 MINOR: dbt migration fix (#23980) 2025-10-23 12:54:34 +02:00
1.10.4 chore: move dbt migration to 1.11 (#24076) 2025-11-03 08:46:47 +01:00
1.10.5 TRUNCATE Flowable History Tables in both 1.10.5 and 1.10.7 Migration (#24323) 2025-11-13 21:05:31 +00:00
1.10.6 Fixes #24132: Airbyte Cloud Support (#24261) 2025-11-11 16:24:09 +05:30
1.10.7 TRUNCATE Flowable History Tables in both 1.10.5 and 1.10.7 Migration (#24323) 2025-11-13 21:05:31 +00:00
1.10.8 Fix email configuration templates default value from 'collate' to 'openmetadata' (#24352) 2025-11-17 08:39:41 +01:00
1.11.0 Moved AI Application and LLM Model entities migrations to 1.12.0 (#25659) 2026-02-02 08:50:37 +01:00
1.11.1 chore: realign main migration with 1.11.1 branch (#24938) 2025-12-22 09:03:28 +01:00
1.11.2 Fix #24578: Datamodels not visible if . in service name (#24779) 2025-12-27 10:00:26 -08:00
1.11.4 Fix search percentile rank scoring (#24859) 2025-12-23 18:06:27 +00:00
1.11.5 Tagging explanation (#24817) 2026-01-08 17:02:40 +01:00
1.11.6 Fix: remove overrideLineage config from database service metadata pipeline (#25379) 2026-01-20 09:08:26 +05:30
1.11.8 Fixes #24546: Add sobjectNames field for multi-object selection in Salesforce connector (#24547) 2026-02-02 16:05:59 +01:00
1.11.9 Add bulk apis for pipeline status (#25731) 2026-02-10 18:14:06 +05:30
1.11.11 Fix-20713: Add support for metadata ingestion using local file in REST connector (#26036) 2026-02-23 21:50:26 +05:30
1.11.12 Fix #26178: Add support for IAM auth for redshift (#26179) 2026-03-02 21:57:28 +05:30
1.12.0 fix(lineage): service nodes appearing in entity lineage view and empty By Service view (#27258) 2026-04-17 00:55:16 -07:00
1.12.1 Continuous indexing to handle failures (#26111) 2026-03-18 16:23:04 +05:30
1.12.2 Fixes #26225: Add index and FORCE INDEX for listLastTestCaseResultsForTestSuite (MySQL) (#26235) 2026-03-06 07:55:41 -08:00
1.12.4 Move Migration to 1.12.4 from 1.12.3 (#26629) 2026-03-20 09:41:15 +00:00
1.12.5 Update indexing schedule (#27204) 2026-04-10 19:15:08 +05:30
1.12.6 fix(lineage): service nodes appearing in entity lineage view and empty By Service view (#27258) 2026-04-17 00:55:16 -07:00
1.12.7 Fixes #27158: ingestion slowdown from tag_usage seq-scan on Postgres (#27745) 2026-05-05 10:30:11 +05:30
1.12.8 Add migrations to ensure PII are really enabled (#27921) 2026-05-08 15:39:29 +00:00
1.12.9 fix(profiler): N+1 / missing-index regression on /tables/.../columns?fields=profile (#3488) (#27746) 2026-05-14 06:56:39 -07:00
1.13.0 Add migrations to ensure PII are really enabled (#27921) 2026-05-08 15:39:29 +00:00
2.0.0 Context center (#27558) 2026-05-08 10:56:04 -07:00
2.0.1 Task redesign (#25894) 2026-04-23 15:52:30 +02:00