Commit graph

16575 commits

Author SHA1 Message Date
Karan Hotchandani
43da034b6b
Merge 988d3acfa0 into 65149e1e34
2026-05-24 09:39:00 +00:00
Anujkumar Yadav
65149e1e34
Add id patch for data mode entity update (#28393)
2026-05-24 10:55:12 +05:30
Rohit Jain
82430c773b
Redesign quick link modal (#28390)
* Redesign quick link modal

* lint fix

* addressed gitar comment

* lint fix

* fixed unit test

* lint fix
2026-05-24 10:16:26 +05:30
Rohit Jain
42881cc0fc
Context center feedback (#28386)
* Addressed Context center feedback

* lint fix

* addressed gitar comment

* lint fix

* fixed playwright and unit test
2026-05-23 22:02:24 +05:30
Anujkumar Yadav
f6c1c76a0b
test: fix flaky test for ontology behaviour (#28385)
* test: fix flaky test for ontology behaviour

* nit

* nit

* fix lint issue

* fix flaky relation graph test
2026-05-23 08:26:53 +00:00
Satender K
b38b386d64
Fixes 26694 (#28330)
* added fix for 26694

* updated code as per Gitar comment

* added E2E test cases

---------

Co-authored-by: Satender <sommy@Satenders-MacBook-Pro.local>
2026-05-22 19:20:33 +00:00
Pere Miquel Brull
c9a5bdb1f0
fix(search): enable vector embeddings on context_memory_search_index (#28374)
Add the embedding fields (fingerprint, textToEmbed, chunkIndex,
chunkCount, parentId) to all locale variants of the context_memory
mapping and include dataAssetEmbeddings in its parentAliases.
Without fingerprint, OsUtils.addKnnVectorSettings() returned early
and never injected the knn_vector embedding field; without the
alias, vector search at /dataAssetEmbeddings/_search never fanned
out to the context memory index. Both gates are now satisfied so
ContextMemory participates in semantic search alongside tables,
glossary terms, and knowledge pages.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 17:35:24 +02:00
Chirag Madlani
dbb0737ec9
fix(tsconfig): remove baseUrl option from compilerOptions (#28359)
* fix(tsconfig): remove baseUrl option from compilerOptions

* update imports to relative

* update missing import

* apply checkstyle
2026-05-22 19:37:47 +05:30
Siddhi Gupta
6f83626298
Fixes #26970: fix bot name search on the Bots page (#27365)
* Fix bot name matching on the Bots page search

* Fixes #26970: migrate Bots search to API-based name/email lookup with robust partial email matching

* Fix flaky Playwright setup by making Table/User creation idempotent and hardening glossary/tag and cleanup flows

* Fixes #26970: avoid full bot scan by switching search result resolution to direct getBotByName lookups

* Fixes #26970: align bot user search with deleted toggle and tighten tour retry timeout handling

* Fixes #26970: keep bot search API-driven, align wildcard matching with name/email expectations, and revert unrelated Playwright changes

* fix: stabilize bot search behavior and flaky Playwright flows across bots, glossary, lineage, and announcements

* fix: limit PR scope to bot search by reverting unrelated Tour Playwright changes

* fix: remove unrelated Playwright changes and keep bot search scope focused

* fix: optimize bot search scalability with paginated user-index retrieval and bounded-concurrency bot resolution

* fix: harden Bots API search with bounded pagination/concurrency and consistent active-search refresh behavior

* fix: prevent stale bot search state and strengthen Bots Playwright coverage with deterministic positive/negative assertions

* test: scope Playwright fixes to bot flow and remove unrelated test changes

* Add local bot search and stabilize tests

* Fixes: keep Bot search API-driven for complete results and stabilize bot cleanup assertions in Playwright

* chore: revert out-of-scope bot Playwright test changes

* test: add bot search e2e coverage and tighten bot API response assertions

* test: stabilize bot search no-match assertions using filter placeholder testid

* fix: add search API wait in bot Playwright flow and refactor bot-user mapping helper

* fix: extract reusable searchbar helper with search API wait and deduplicate bot user mapping logic

* fix(playwright): make bot search test stable by removing brittle API wait

* Refactor bot search integration and Playwright synchronization for reliable name/email query behavior

* Fix bot search regressions by preserving getBotByName compatibility export, stabilizing BotListV1 memoized enrichment, and skipping Playwright search API wait for empty terms

* fix: stabilize bot search Playwright API wait to prevent encoded-query timeout flakes

* fix: eliminate Playwright search response race by pre-registering waiter and matching query-specific GET /search/query responses

* fix: normalize bot search queryFilter format and harden bot-user resolution flow

* refactor code

* fix bot search

* fix checkstyle

* fix display name search

* address gitar

* remove unwanted code

* address gitar and improve performance

* address gitar

* fix bot spec

---------

Co-authored-by: Harsh Vador <58542468+harsh-vador@users.noreply.github.com>
Co-authored-by: Harsh Vador <harsh.vador@somaiya.edu>
2026-05-22 19:09:54 +05:30
Anujkumar Yadav
46e0bcea30
test: Update data consumer policies role (#28366)
2026-05-22 11:26:50 +00:00
Sid
b008f9638a
fix(playwright): await PATCH + modal detach in deleteCreatedProperty (#28361)
The util clicked the Delete Property confirm save-button and returned
immediately. Custom property deletion is implemented in CustomPropertiesPageV1
as a JSON-patch on the Type entity (rest/metadataTypeAPI.ts updateType ->
PATCH /api/v1/metadata/types/{id}). While that PATCH was in flight the
ConfirmationModal stayed mounted with its Confirm button in a `loading`
state, leaving an <ant-modal-wrap ant-modal-centered> in the DOM that
intercepted pointer events. Callers like GlossaryImportExport.spec.ts loop
through deleteCreatedProperty + settingClick(GLOSSARY_TERM); on slow AUT
runs the leftover modal mask intercepted the next iteration's click on
[data-testid="app-bar-item-settings"] and the 180s test budget was burned
waiting for the sidebar item to become actionable. PR #27952 addressed the
wrong modal; the trace in nightly run aut/26261444729 shows the visible
dialog is the Delete Property confirm, not the version-history drawer.

Await the PATCH 200 on /metadata/types/ and assert the modal's body-text
has unmounted. ConfirmationModal uses destroyOnClose, so body-text detach
is the cleanest signal that the mask is gone. save-button cannot be used
for the detach assertion because its testid briefly swaps to loading-button
while the PATCH is in flight.

Co-authored-by: Siddhant <siddhant@MacBook-Pro-751.local>
2026-05-22 09:06:15 +00:00
Mohit Yadav
69b4ab57ab
fix(search): keep usageSummary in search across reindex + live updates (#28350)
* fix(search): index usageSummary so reindex preserves Explore weekly-usage sort

TableIndex.getRequiredReindexFields() declared "columns" but not
"usageSummary". usageSummary is fields-gated in TableRepository
(clearFields nulls it unless requested), so the reindex path — which
fetches only the declared required fields — dropped it from the table
search document's _source. Explore's "Sort by Weekly Usage" reads
_source.usageSummary.weeklyStats.count, so it silently broke after any
full reindex even though live-served docs looked fine.

Add "usageSummary" to the required reindex field set so the reindexed
document carries it, matching the live entity payload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search): update search index on usage report (live path)

Usage is recorded via direct DAO writes in UsageRepository, bypassing
EntityRepository.update — so the entity-lifecycle SearchIndexHandler
never fires, and a reported usage never reached the search document.
The search doc kept a stale/absent usageSummary until the next full
reindex, so Explore "Sort by Weekly Usage" didn't reflect freshly
reported usage live.

After recording usage, push the refreshed entity into the search index
(updateEntity re-fetches with all fields, so usageSummary is included).
Table usage rolls up to its schema + database, so refresh those docs
too. Search failures are logged, not propagated — the usage write is
already committed.

Together with the reindex-fields change this keeps usageSummary present
in _source on both the live and reindex paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search): use index required fields for usage reindex, not "*"

The live-path usage search update called SearchRepository.updateEntity,
which re-fetches the entity with getFields("*"). For the rolled-up
database/schema that hydrates every child table — tens of thousands on
large catalogs — risking OOM, the exact over-fetch the reindex-fields
work exists to avoid.

Fetch each affected entity (table, schema, database) with only its
index's required reindex fields via ReindexingUtil.getSearchIndexFields,
then updateEntityIndex(entity) directly. Mirrors the reindex pipeline's
field selection and keeps the rollup cheap and bounded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(search): bound updateEntity field fetch to index required fields

SearchRepository.updateEntity(EntityReference) re-fetched the entity with
getFields("*") before re-indexing. This runs on every live entity update
(SearchIndexHandler.onEntityUpdated) and tag-propagation refresh, for all
entity types. For container entities (database/schema) "*" hydrates every
child — tens of thousands of tables on large catalogs — so a single live
update could OOM the server. That's a far bigger blast radius than the
usage path alone.

Fetch only the fields the entity's search index declares as required
(searchIndexFactory.getReindexFieldsFor) — the same set the reindex
pipeline uses — so live updates and reindex stay consistent and bounded.

Reverts the per-call workaround in UsageRepository.updateUsageInSearch
back to plain updateEntity calls, now that updateEntity itself is bounded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(usage): switch expression + drop schema/database live refresh

Address PR #28350 review:
- addUsage: convert the break/mutable-variable switch to a Java 21
  switch expression (no mutable response, no break boilerplate).
- updateUsageInSearch: only refresh the reported entity's search doc.
  Dropped the cascade to the rolled-up schema + database — usage
  reporting can be high-volume and the table doc is the surface that
  matters ("Sort by Weekly Usage"); schema/database usageSummary
  reconciles on the next reindex. This also removes the redundant
  second table fetch and the unguarded schema/database refs the bots
  flagged. The single updateEntity call is bounded (required fields).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Use safe list

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 10:15:56 +02:00
dependabot[bot]
c59c85c354
build(deps): bump js-cookie in /openmetadata-ui/src/main/resources/ui (#28356)
Bumps [js-cookie](https://github.com/js-cookie/js-cookie) from 3.0.5 to 3.0.7.
- [Release notes](https://github.com/js-cookie/js-cookie/releases)
- [Commits](https://github.com/js-cookie/js-cookie/compare/v3.0.5...v3.0.7)

---
updated-dependencies:
- dependency-name: js-cookie
  dependency-version: 3.0.7
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Harsh Vador <58542468+harsh-vador@users.noreply.github.com>
2026-05-22 08:10:20 +00:00
Anujkumar Yadav
c7ad6b52a1
test: Added missing playwright test for ontology and related terms (#28358)
* test: Added missing playwright test for ontology and related terms

* fix lint checks
2026-05-22 08:05:40 +00:00
Adrià Manero
0968c8f026
Fix #27918: notification links plural alerts path and Query href fallback (#28335)
* Fix #27918: notification links — plural alerts path and Query href

* Update NotificationTemplateHelperAdvancedTest Query assertion for new /query-view URL
2026-05-22 08:47:05 +02:00
Sid
28ffe76785
fix(alerts): guard isBot() against deleted users to avoid aborting change-event batches (#28304)
* fix(alerts): guard isBot() against deleted users to avoid aborting batches

AlertsRuleEvaluator.isBot() called Entity.getEntityByName with
Include.NON_DELETED without catching EntityNotFoundException. When a
change event's userName referenced a user that had been deleted (common
for short-lived test fixtures torn down by afterAll), the exception
escaped the SpEL filter and was caught in AbstractEventConsumer.execute
as "Error in polling events for alert : Entity not found: user ...".

That outer catch logs the error and lets the finally block advance the
offset by batchSize, silently dropping every other event in the batch.
The ActivityFeedAlert subscription was hit hardest because every event
flows through its isBot() filter rule.

Catch EntityNotFoundException and treat an unresolvable actor as not-a-bot.

* fix(alerts): null-safe getIsBot() + IT + unblock ActivityAPI spec

- AlertsRuleEvaluator.isBot(): use Boolean.TRUE.equals(user.getIsBot())
  so a null isBot field on the resolved user doesn't NPE on auto-unbox.
- Add AlertsRuleEvaluatorResourceIT#test_isBot_returnsFalseWhenActorUserDeleted
  covering the original bug — create a user, evaluate isBot() (false),
  delete the user, evaluate isBot() again and assert it still returns
  false instead of throwing.
- Remove test.describe.fixme from ActivityAPI.spec.ts so the spec runs
  again now that the underlying batch-abort is fixed.
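
The guard described in this commit can be sketched as follows. This is a minimal Python illustration, not the Java code in AlertsRuleEvaluator: `EntityNotFound`, `lookup_user`, and the in-memory `USERS` store are hypothetical stand-ins for the service's exception and entity lookup.

```python
from typing import Dict, Optional


class EntityNotFound(Exception):
    """Hypothetical stand-in for the service's EntityNotFoundException."""


# Hypothetical user store; is_bot may be None, like a nullable Boolean field.
USERS: Dict[str, Dict[str, Optional[bool]]] = {
    "ingestion-bot": {"is_bot": True},
    "alice": {"is_bot": None},
}


def lookup_user(name: str) -> Dict[str, Optional[bool]]:
    """Raise EntityNotFound for users that were deleted or never existed."""
    try:
        return USERS[name]
    except KeyError:
        raise EntityNotFound(name) from None


def is_bot(user_name: str) -> bool:
    """Treat an unresolvable actor as not-a-bot instead of propagating the error."""
    try:
        user = lookup_user(user_name)
    except EntityNotFound:
        return False
    # `is True` plays the role of Boolean.TRUE.equals(...): None is never a bot.
    return user.get("is_bot") is True
```

Because the filter returns a plain boolean in every case, one deleted actor can no longer abort evaluation of the remaining events in the batch.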

---------

Co-authored-by: Siddhant <siddhant@MacBook-Pro-751.local>
2026-05-22 12:08:22 +05:30
sonika-shah
dbc0d997bb
fix(table,dataModel): persist inline column.extension from POST/PUT-create (#28328)
POST/PUT-create with column.extension inlined in the request body never
wrote a row to entity_extension — the data only landed in the table JSON,
while every reader (and the PATCH original reload) looks in entity_extension
via getColumnExtension(). The override clobbered the inline data with null
on every read, leaving the Custom Properties tab blank and causing any
JsonPatch op walking /columns/N/extension/... to fail with "no mapping for
the name 'extension'" on the reloaded original.

One shared single-column writer for both create and update paths:

- Add EntityRepository.storeColumnExtension(UUID, Column) — the upsert that
  used to live inside EntityUpdater.updateColumnExtension, lifted out so the
  create paths can reach it.
- Add EntityRepository.storeColumnExtensions(UUID, List<Column>) — single
  entity walker that flattens via EntityUtil.getFlattenedEntityField and
  calls the primitive per column.
- Add EntityRepository.getColumnsForExtensionPersistence(T) hook overridden
  in TableRepository and DashboardDataModelRepository to return getColumns().
- Wire into createNewEntityFlush alongside storeExtension.

Update path simplifications:

- updateColumns now uses the shared storeColumnExtension primitive.
- updateColumnExtension (the EntityUpdater-private duplicate writer) is
  deleted.
- Existing-column branch skips the upsert when stored.extension equals
  updated.extension — previously every update pass rewrote identical JSON
  for every column with a non-null extension.
- Added-columns branch now persists extensions via the walker. updateColumns'
  main loop skips columns with no stored match, so inline column.extension
  on a freshly-added column was silently dropped on PUT-update too.

Regression coverage in ColumnCustomPropertiesIT covers both tableColumn
and dashboardDataModelColumn inline-on-create persistence, plus the
PUT-update-adds-column path.
2026-05-22 12:00:23 +05:30
Sriharsha Chintalapani
b62db6224f
feat(tasks): policy-driven authorization with self-approval guard (#28315)
* feat(tasks): policy-driven authorization with self-approval guard

Moves Task resolve/close/reassign authorization from ~150 lines of custom
Java in TaskRepository into the policy engine. Adds ResolveTask, CloseTask,
ReassignTask MetadataOperation values, isTaskFiler/isTaskAssignee/isTaskReviewer
SpEL conditions, and a new TaskAuthorPolicy seed. Closes the self-approval
gap where a filer who was also in the assignees list could approve their
own task (now denied via deny rule). TaskResourceContext.getOwners now
returns target entity owners so isOwner() retains its conventional meaning;
v200 migration backfills the new policy attachment on the DataConsumer role
for upgrades.
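
The self-approval guard can be pictured as a deny rule evaluated before any allow rule. The Python sketch below is only an illustration of that ordering: the `Task` shape and `can_resolve` are hypothetical, and it assumes the deny rule applies whenever the resolving user is the filer, which the commit message does not spell out.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Task:
    """Hypothetical task shape: who filed it and who is assigned to it."""

    filer: str
    assignees: FrozenSet[str]


def can_resolve(task: Task, user: str) -> bool:
    """Deny wins over allow: the filer is rejected even when also an assignee."""
    if user == task.filer:
        return False  # deny rule (assumed scope: any resolve by the filer)
    return user in task.assignees  # allow rule
```

Under these assumptions, `can_resolve(Task("alice", frozenset({"alice", "bob"})), "alice")` is rejected while the same call for `"bob"` is allowed.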
2026-05-21 22:20:46 -07:00
Anujkumar Yadav
f3b9f2167e
Edge cardinality text (#28349)
* Add cardinality based text beside the ontology node

* fix checkstyle issue

* fix lint checks

* Fix playwright test

* fix lint issue

* fix lint issue

* nit
2026-05-22 10:34:16 +05:30
Sriharsha Chintalapani
1dcf8dd60f
MCP Tool Usage (#28352)
* MCP Tool Usage

* Update generated TypeScript types

* Address PR review feedback on MCP usage tracking

Reorder UA heuristic so VS Code wins over Claude CLI for composite
User-Agents, refactor to a predicate list, and sanitise the resolved
client name (trim, strip control chars, cap at 64 chars). Bound the
schema field to match.
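
An ordered predicate list with name sanitisation might look like the Python sketch below. The substrings being matched, the fallback to the raw User-Agent, and all function names are assumptions made for illustration; only the ordering (VS Code before Claude CLI) and the 64-character cap come from the commit message.

```python
import unicodedata
from typing import Callable, List, Tuple

MAX_CLIENT_NAME_LEN = 64  # cap taken from the commit message

# Ordered (predicate, client name) pairs; the first match wins, so the
# VS Code rule is listed before the Claude CLI rule. Substrings are hypothetical.
CLIENT_RULES: List[Tuple[Callable[[str], bool], str]] = [
    (lambda ua: "vscode" in ua, "VS Code"),
    (lambda ua: "claude-cli" in ua, "Claude CLI"),
]


def sanitise_client_name(name: str) -> str:
    """Drop control characters, trim whitespace, and cap the length."""
    cleaned = "".join(ch for ch in name if unicodedata.category(ch) != "Cc")
    return cleaned.strip()[:MAX_CLIENT_NAME_LEN]


def resolve_client(user_agent: str) -> str:
    """Return the first matching client name, else the sanitised raw value."""
    lowered = user_agent.lower()
    for matches, client in CLIENT_RULES:
        if matches(lowered):
            return client
    return sanitise_client_name(user_agent)
```

Keeping the rules in a list makes the precedence explicit: reordering two entries is the whole change needed to flip which client wins for a composite User-Agent.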

Bound the latency aggregation lists in McpUsageResource with reservoir
sampling so summary/per-tool percentile estimates stay valid without
unbounded heap growth. Skip null-timestamp rows in the history loop and
update the stale /history Swagger description to reflect the ok/fail
shape. Convert CallToolOutcome to a Java record and update the recorder
flow to use accessor methods.
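
Reservoir sampling keeps a fixed-size uniform sample of a stream, which is what lets percentile estimates stay meaningful with bounded memory. The sketch below uses the classic Algorithm R in Python; the class name, the nearest-rank percentile method, and the capacity are illustrative choices, not the McpUsageResource implementation.

```python
import math
import random
from typing import List, Optional


class LatencyReservoir:
    """Uniform random sample of at most `capacity` latencies (Algorithm R)."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.seen = 0
        self.samples: List[float] = []
        self._rng = random.Random(seed)

    def add(self, latency_ms: float) -> None:
        """Record one value; memory use never exceeds `capacity` entries."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(latency_ms)
            return
        # Keep the new value with probability capacity / seen.
        slot = self._rng.randrange(self.seen)
        if slot < self.capacity:
            self.samples[slot] = latency_ms

    def percentile(self, p: float) -> float:
        """Nearest-rank percentile over the sample; p is in [0, 100]."""
        if not self.samples:
            raise ValueError("no samples recorded")
        ordered = sorted(self.samples)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]
```

Once the reservoir is full, each new value overwrites a random slot with decreasing probability, so every value seen so far has the same chance of being in the sample.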

Fix the pre-existing regression in McpImpersonationTest where the mock
still wired the legacy callTool path. Add DefaultToolContextTest with
direct coverage for classifyException (all four ErrorCategory buckets,
cause-chain walk, null message in chain) and the unknown-tool outcome.
2026-05-21 21:55:38 -07:00
Chirag Madlani
2d585c708f
feat(constants): add temporary lineage node height and base node heig… (#28346)
* feat(constants): add temporary lineage node height and base node height constants
fix(CanvasUtils): update node height calculations and improve edge coordinate logic

* fix import error

* fix import issue that caused a problem with ref
2026-05-22 10:13:10 +05:30
Rohit Jain
a3476c2de2
Context center import fix (#28351)
2026-05-21 23:19:07 +00:00
Rohit Jain
6bfaa4a59d
Context memories (#28344)
* Added context memories UI in context center

* lint and translations fix

* fixed ui issues

* lint fix

* fixed the translations

* addressed gitar comment

* lint fix

* addressed copilot comments

* addressed gitar comment
2026-05-21 23:43:11 +05:30
Mohit Yadav
81d30a7498
Fix Failing LineageImpactAnlaysisTest (#28340)
2026-05-21 11:10:25 -07:00
Sriharsha Chintalapani
42546feb15
fix(reindex): batch-prefetch upstream lineage to stop Hikari connection leak (#28311)
* fix(reindex): batch-prefetch upstream lineage off doc-build threads

Reindex doc-build executor runs 50 virtual workers each calling
`getLineageData -> findFrom` per entity, holding a Hikari connection
for the duration. On JDK 21 + HikariCP 7's synchronized borrow path,
those virtual threads get pinned to carriers and stall for >60s, which
fires the connection leak detector and freezes the reindex.
2026-05-21 10:30:58 -07:00
Sriharsha Chintalapani
cdd0b3a0d0
fix: return 400 for malformed JSON Patch pointers instead of 500 (#28316)
Client patches with paths missing the leading '/' (e.g., "displayName"
instead of "/displayName") triggered jakarta.json.JsonException from
JsonPointerImpl, which fell through the exception mapper and surfaced
as an unhandled 500 (and Sentry alert) on PATCH endpoints such as
ClassificationResource.

- JsonUtils.applyPatch now validates each operation's 'path' and 'from'
  upfront, throwing IllegalArgumentException with a clear RFC 6901
  message before the cryptic library exception fires.
- CatalogGenericExceptionMapper maps jakarta.json.JsonException to 400
  as defense in depth, covering other RFC 6902 violations (e.g.,
  out-of-range array index, replace on missing path) that were also
  returning 500.
- Added JsonUtilsTest cases for malformed 'path' and 'from' pointers.
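
The upfront check amounts to rejecting any pointer that is neither empty nor '/'-prefixed, per RFC 6901. A minimal Python sketch follows; the function names and error wording are hypothetical, and the real change lives in the Java JsonUtils.applyPatch.

```python
from typing import Any, Dict, List


def validate_pointer(value: Any, field: str) -> None:
    """RFC 6901: a JSON Pointer is either empty or starts with '/'."""
    if not isinstance(value, str):
        raise ValueError(f"'{field}' must be a JSON Pointer string")
    if value != "" and not value.startswith("/"):
        raise ValueError(
            f"'{field}' value {value!r} is not a valid JSON Pointer: "
            "it must be empty or start with '/'"
        )


def validate_patch(operations: List[Dict[str, Any]]) -> None:
    """Check 'path' on every operation and 'from' wherever it is present."""
    for op in operations:
        validate_pointer(op.get("path"), "path")
        if "from" in op:
            validate_pointer(op["from"], "from")
```

Raising a client-error exception here, before the patch library runs, is what turns the former 500 into a 400 with a readable message.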
2026-05-21 07:26:10 -07:00
Mayur Singal
9921dc1389
Fixes #28245: ingest valueless Databricks/Unity Catalog tags (#28294)
* Fixes #28245: ingest valueless Databricks/Unity Catalog tags

Databricks/Unity Catalog exposes system-generated (and some user-defined)
tags as (tag_name, tag_value=null). The connectors mapped tag_name ->
Classification and tag_value -> Tag, so an empty tag_value was either
skipped (Unity Catalog) or coerced to a "NONE" sentinel (Databricks).

When tag_value is empty, fall back to a dedicated per-connector
classification (DATABRICKS_TAGS / UNITY_CATALOG_TAGS) and use tag_name
verbatim as the tag under it (no dot-splitting). Valued tags are
unchanged: classification = tag_name, tag = tag_value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address review: harden valueless-tag mapping

- Treat whitespace-only tag_value as valueless (strip-based check) so it
  falls back to the *_TAGS classification instead of being silently
  dropped downstream by get_ometa_tag_and_classification.
- Skip rows with empty/None tag_name in the Databricks connector, for
  parity with Unity Catalog, so an empty classification name is never
  sent to the API.
- Add tests for whitespace-only tag_value (both connectors) and the
  empty tag_name skip (Databricks).
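
The mapping rules from this commit can be summarised in a few lines of Python. This is a sketch under assumptions: `map_tag` and its signature are hypothetical, and only the fallback classification name and the skip/fallback rules come from the commit message.

```python
from typing import Optional, Tuple

FALLBACK_CLASSIFICATION = "DATABRICKS_TAGS"  # name from the commit message


def map_tag(
    tag_name: Optional[str],
    tag_value: Optional[str],
    fallback: str = FALLBACK_CLASSIFICATION,
) -> Optional[Tuple[str, str]]:
    """Return (classification, tag), or None when the row should be skipped."""
    if not tag_name:
        return None  # never send an empty classification name to the API
    if tag_value is None or not tag_value.strip():
        # Valueless tag: use tag_name verbatim under the fallback classification.
        return (fallback, tag_name)
    return (tag_name, tag_value)
```

For example, `map_tag("env", "prod")` keeps the original mapping, while `map_tag("system.certified", None)` lands under the fallback classification with the dotted name left intact.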

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 19:41:03 +05:30
Ram Narayan Balaji
1f7d2cc318
feat(migration): backfill task domains for domain-scoped activity feed visibility (#28013)
* feat(migration): backfill task domains in v1.12.8 for domain-scoped activity feed visibility

Tasks created via approval workflows lacked domains, causing domain-scoped users
to miss them in the activity feed. This migration backfills domains on both
thread_entity (1.12.x) and task_entity (2.x) tables.

- thread_entity: reads tasks where JSON_EXTRACT(json,'$.domains') IS NULL in
  batches, resolves domains from the linked entity, and sets $.domains=[] or
  the real domain list. Uses json->'domains' IS NULL on Postgres.
- task_entity: inserts HAS-relationship rows from existing MENTIONED_IN rows
  via INSERT IGNORE / ON CONFLICT DO NOTHING.
- Failure is non-fatal: logs an error and skips rather than blocking startup.

* fix(migration): correlate NOT EXISTS on fromId to handle multi-domain tasks

* refactor(migration): fix transient-failure caching and extract SQL builder methods

* fix(migration): properly distinguish lookup failures from no-domains; fix counter and skip-on-failure

* feat(migration): move task domain backfill migration from 1.12.8 to 1.12.9

* fix(migration): mark unresolvable thread rows as migrated to prevent infinite loop

Previously, any thread row whose target entity could not be resolved
(malformed JSON, EntityLink parse failure, or non-EntityNotFound errors)
returned null from resolveThreadTaskDomains and was skipped without
updating. Because the batch read uses WHERE $.domains IS NULL ORDER BY
createdAt LIMIT 500 with no offset, the same failing rows would be
re-fetched in every subsequent batch — once all unaffected rows had
been processed, the loop would spin forever on the remaining failures.

Now any unrecoverable failure marks the row with $.domains = [] (same
as the legitimate no-domains case), logs a WARN, and increments a
markedDoneOnError counter surfaced at the end of the run. A stall
detector also breaks the loop if a non-empty batch produces zero
updates, guarding against the residual case where the mark UPDATE
itself keeps failing.
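
The loop shape described above (offset-free batches, mark-on-error, and a stall detector) can be sketched in Python. Everything here is a simplified model: the in-memory `rows` dict stands in for thread_entity, and `resolve` and `write` are hypothetical callables, not the migration's actual methods.

```python
from typing import Callable, Dict, List, Optional

Rows = Dict[str, Optional[List[str]]]


def _default_write(store: Rows, row_id: str, domains: List[str]) -> bool:
    store[row_id] = domains
    return True


def backfill_domains(
    rows: Rows,
    resolve: Callable[[str], List[str]],
    write: Callable[[Rows, str, List[str]], bool] = _default_write,
    batch_size: int = 500,
) -> Dict[str, int]:
    """Backfill rows whose domains are unset; always terminates."""
    stats = {"updated": 0, "marked_done_on_error": 0}
    while True:
        # No OFFSET: each batch is simply "the first N rows still unset".
        pending = [rid for rid, domains in rows.items() if domains is None]
        batch = pending[:batch_size]
        if not batch:
            break
        updated_in_batch = 0
        for row_id in batch:
            failed = False
            try:
                domains = resolve(row_id)
            except Exception:
                # Unrecoverable row: mark it done so it is not re-fetched.
                domains, failed = [], True
            if write(rows, row_id, domains):
                updated_in_batch += 1
                stats["updated"] += 1
                if failed:
                    stats["marked_done_on_error"] += 1
        if updated_in_batch == 0:
            break  # stall detector: a non-empty batch made no progress
    return stats
```

Marking failed rows removes them from the next batch's WHERE match, and the stall check covers the remaining case where the write itself never succeeds.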

Also adds a class-level reindex note: the migration only writes to DB,
so the tasks search index must be rebuilt post-upgrade for the activity
feed (which queries Elasticsearch/OpenSearch) to reflect the backfilled
domains.

* fix(migration): address PR review — non-fatal policy migrations + deleted-row filter

Two issues called out in PR #28013 review:

1. Policy migrations could block server startup. addTriggerOperationToDefaultBotPolicies
   handled errors internally, but addTriggerRuleToDataStewardPolicy only catches
   EntityNotFoundException — a DB error from policyDAO.update() or a
   JsonProcessingException from JsonUtils.pojoToJson() would propagate through
   @SneakyThrows and fail the migration step. Wrap the policy calls in their own
   try-catch (matching the task domain migration's non-fatal pattern) and drop
   @SneakyThrows.

2. INSERT...SELECT for task_entity domain backfill did not filter on
   entity_relationship.deleted. Adds er_about.deleted = FALSE,
   er_domain.deleted = FALSE, and ex.deleted = FALSE to avoid deriving task
   domains from soft-deleted relationships.

Adds two SQL-shape tests asserting the deleted filters are present in both MySQL
and Postgres variants.

* fix(migration): drop ON CONFLICT target so backfill survives PK shape change

The Postgres `entity_relationship` PK is 3 columns on 1.12.x (fromid, toid,
relation) but 4 columns on 2.x (... + relationtype). Naming a 3-column target
in `ON CONFLICT (fromId, toId, relation)` would fail at parse-time on the 4-col
schema with "no unique or exclusion constraint matching the ON CONFLICT
specification", silently breaking the task-domain backfill on forward upgrades.

Use a bare `ON CONFLICT DO NOTHING` — Postgres applies it to any unique/PK
violation, matching MySQL's `INSERT IGNORE` semantics. The `NOT EXISTS` in the
SELECT still prevents intra-statement duplicates; ON CONFLICT is just the
race-safety net.

Test asserts the bare form to prevent regressions reintroducing a column list.
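
A minimal sketch of why the bare form survives a PK shape change. It uses Python's bundled SQLite as a stand-in for Postgres (SQLite also accepts `ON CONFLICT DO NOTHING` without a conflict target), and the table is a simplified, hypothetical version of `entity_relationship`, not the real schema.

```python
import sqlite3


def insert_ignoring_conflicts(pk_columns: str) -> int:
    """Insert the same row twice and return the resulting row count.

    pk_columns is the primary-key column list of the (hypothetical) table.
    The INSERT never names those columns, so it works for either PK shape.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entity_relationship ("
        " fromId TEXT, toId TEXT, relation INTEGER, relationType TEXT,"
        f" PRIMARY KEY ({pk_columns}))"
    )
    insert = (
        "INSERT INTO entity_relationship (fromId, toId, relation, relationType) "
        "VALUES (?, ?, ?, ?) ON CONFLICT DO NOTHING"
    )
    row = ("domain-1", "task-1", 10, "HAS")
    conn.execute(insert, row)
    conn.execute(insert, row)  # duplicate key: skipped without raising
    (count,) = conn.execute("SELECT COUNT(*) FROM entity_relationship").fetchone()
    conn.close()
    return count
```

Calling it with `"fromId, toId, relation"` and with `"fromId, toId, relation, relationType"` runs the same statement against both key shapes.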

* fix(migration): also match JSON null domains in thread_entity backfill

The WHERE clause only matched rows where $.domains was SQL NULL (key missing),
which left out tasks where Jackson serialized the unset field as "domains": null
explicitly. JSON_EXTRACT / -> returns JSON null in that case, not SQL NULL, so
"IS NULL" did not match.

Repro: a 1.13 task created before CreateApprovalTaskImpl.withDomains(...) shipped
serializes "domains": null. On upgrade to main, v1129 read 9 thread tasks but
only backfilled 2 — the 7 with explicit JSON null were silently skipped, and the
v200 promotion to task_entity then carried that empty state forward.

Broaden the WHERE on both dialects:
  MySQL    : JSON_EXTRACT(json,'$.domains') IS NULL
             OR JSON_TYPE(JSON_EXTRACT(json,'$.domains')) = 'NULL'
  Postgres : json->'domains' IS NULL
             OR jsonb_typeof(json->'domains') = 'null'

After backfill, $.domains is written as an array (empty or populated), so neither
clause matches the updated row — no infinite loop. Tests assert both branches of
the OR for each dialect.
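
The two branches of the OR can be illustrated in Python, where a missing key and an explicit `null` are likewise distinct states. This is only a sketch of the predicate; `needs_domain_backfill` is a hypothetical name and the migration itself does this in SQL.

```python
import json


def needs_domain_backfill(raw_json: str) -> bool:
    """Return True when 'domains' is absent or explicitly null."""
    doc = json.loads(raw_json)
    # Key missing: analogous to the extract returning SQL NULL.
    # Key present with None: analogous to a JSON null value.
    return "domains" not in doc or doc["domains"] is None
```

A row whose `domains` is already an array, even an empty one, matches neither branch, which is why the backfilled rows are not picked up again.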

* fix(migration): v200 task promotion must resolve inherited domains

queryDomainsForEntity did a raw entity_relationship lookup for
"domain --HAS--> entity" rows. For entities that inherit their domain from a
parent (e.g. glossary terms inheriting from their parent glossary), no direct
HAS row exists — inheritance is computed at read time by the repository layer
(see GlossaryTermRepository.inheritDomains). The raw SQL silently returned an
empty list, so v200 promoted such tasks into task_entity with no domains in the
JSON and no domain HAS task rows in entity_relationship, breaking the activity
feed for domain-scoped users.

Switch resolveDomainsForTaskAbout to load the entity via EntityRepository.get
with FIELD_DOMAINS, then call ei.getDomains(). That path already handles every
entity type's inheritance rule, matching what v1129 does on the thread_entity
side.

Also fix the alreadyExists branch in migrateThreadTasksToTaskEntity. Force-
migrate previously skipped that branch's domain reconciliation entirely, so
tasks already promoted by a pre-fix v200 run would stay broken even after the
fix shipped. Now the alreadyExists path also resolves and inserts domain HAS
rows; insertTaskDomainRelationships swallows duplicate-key on already-present
rows, keeping the call idempotent.

Removes the now-unused queryDomainsForEntity / buildDomainReference helpers.

* perf(migration): cache resolved domains per (entityType, entityId) in v200

When task migration runs against an install with many tasks pointing at a small
number of target entities (the typical pattern — hundreds of tasks per glossary
term), calling EntityRepository.get for each task re-loads the entity and
re-walks its inheritance chain. For 100K tasks across ~100 unique entities,
that is ~100K full-entity loads vs the ~100 actually required.

Add a per-migration HashMap cache keyed by entityType::entityId. The migration
runs single-threaded on startup so a plain HashMap is sufficient. Transient
lookup failures are not cached so a later task can retry the same entity. The
cache lives for the JVM lifetime but only grows during v200.

Empirical cost per task drops from ~5-20ms (cold repo.get) to ~0.1ms (cache
hit) once the working set is loaded.

* fix(migration): bound v200 domain cache and make cached lists unmodifiable

Per gitar review on PR #28013. The static DOMAIN_CACHE was an unbounded
HashMap. Two defensive improvements:

1. Bound the cache via LinkedHashMap with access-order LRU eviction at 10K
   entries. A pathological install with millions of unique target entities
   (e.g. one task per distinct table) can no longer grow the cache without
   limit and OOM the migration step. Each entry is small (~100 bytes), so the
   cap costs ~1 MB at saturation while still absorbing the realistic working
   set in one pass.

2. Wrap cached lists with Collections.unmodifiableList so that a future
   downstream caller mutating the returned list cannot silently corrupt the
   cache entry for all subsequent lookups of the same entity.

No synchronization needed; v200 runs single-threaded on startup.

* fix(migration): drop ex.deleted = FALSE from task_entity NOT EXISTS check

Per copilot review on PR #28013. With ex.deleted = FALSE in the NOT EXISTS
subquery, SELECT can yield candidate rows that collide on the PK with an
existing soft-deleted row (deleted is not part of (fromId, toId, relation,
relationType)). INSERT IGNORE silently skips the collision, the affected-row
count drops below BATCH_SIZE, and the while loop terminates early — leaving
later candidates unprocessed.

Tasks are hard-deleted only in this codebase, so a soft-deleted domain HAS
task row is not a state we expect to encounter, but the asymmetry between
the SELECT's NOT EXISTS and the PK's collision behavior is a real correctness
bug for any installation that does have such rows. Drop the inner deleted
filter so NOT EXISTS treats any row (active or soft-deleted) as already
present; the SELECT then only yields genuinely-new candidates, and inserted
count accurately reflects remaining work.

Outer er_about.deleted = FALSE and er_domain.deleted = FALSE filters stay,
since we still don't want to propagate soft-deleted MENTIONED_IN or
soft-deleted domain assignments forward into new HAS rows.

Tests flipped from "ex.deleted = FALSE must be present" to "ex.deleted must
be absent" to pin the new contract.

* fix(migration): use List.copyOf for v200 domain cache entries

Per copilot review on PR #28013. Collections.unmodifiableList wraps the
underlying list but does not snapshot it — if a later read of the same
entity through the repository layer mutates the list backing the cached
reference, the cached value silently changes too.

Switch to List.copyOf which produces an independent immutable snapshot,
so cache entries are genuinely stable for the lifetime of the migration.
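
The bounded cache and the snapshot semantics described in these commits can be sketched together. This is a Python analogue under stated assumptions, not the Java implementation: `BoundedDomainCache` is a hypothetical name, `OrderedDict` stands in for the access-ordered `LinkedHashMap`, and `tuple(...)` plays the role of `List.copyOf`.

```python
from collections import OrderedDict
from typing import Iterable, Optional, Tuple


class BoundedDomainCache:
    """LRU cache keyed by (entity_type, entity_id); values are immutable snapshots."""

    def __init__(self, max_entries: int = 10_000) -> None:
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, entity_type: str, entity_id: str) -> Optional[Tuple[str, ...]]:
        key = (entity_type, entity_id)
        if key not in self._entries:
            return None  # cache miss (an empty tuple is a valid cached value)
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, entity_type: str, entity_id: str, domains: Iterable[str]) -> None:
        key = (entity_type, entity_id)
        # tuple() copies the items, so later changes to the caller's list
        # cannot alter the cached value.
        self._entries[key] = tuple(domains)
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

As in the migration, the sketch has no locking because it assumes a single-threaded caller.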
2026-05-21 18:42:21 +05:30
Harshit Shah
9d1c223c1b
fix(e2e): fix flaky test case export download capture in Playwright (#28337)
Register page.waitForEvent('download') at the top of performTestCaseExport,
before any click actions, to eliminate a race condition.

Test case export always takes the async path: /exportAsync returns a jobId,
the server processes it, then fires a WebSocket COMPLETED event which triggers
downloadFile — a programmatic <a download href="blob:..."> click. For an
empty table the job completes almost instantly. When the WebSocket fires
before Playwright has finished configuring blob-download capture via CDP
(which happens asynchronously after waitForEvent is called), Chromium treats
the blob-URL click as a page navigation instead of a download, closing the
page context and throwing:

  Error: page.waitForEvent: Target page, context or browser has been closed

Moving the listener to the very top of the function gives Playwright the full
duration of the subsequent awaits (visibility checks, form wait, button state)
to complete its CDP setup — eliminating the race.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:59:21 +02:00
IceS2
f2d4a577a6
fix(snowflake): log discovered databases and filter-out reasons (#28336)
* fix(snowflake): log discovered databases and filter-out reasons

SHOW DATABASES only returns databases the ingestion role can see, so a
missing database could mean either a privilege/share gap on the role or
an exclusion by databaseFilterPattern. The logs distinguished neither:
the raw result was never logged, and filtered databases were collected
into the status summary without a live log line.

Log the SHOW DATABASES result (count + names) at INFO, and log each
filtered-out database at INFO with the value matched and whether FQN
filtering was used. This makes "we see fewer databases than expected"
diagnosable from the ingestion logs alone.

* fix(snowflake): %-style logging, narrow filter_name type, list at DEBUG

- Convert f-string log calls to %-style so the formatting cost is paid
  only when the level is enabled (no f-string interpolation on suppressed
  DEBUG / large-list payloads).
- Narrow filter_name to str: use new_database when useFqnForFiltering is
  on but fqn.build() returned None, so filter_by_database (str-typed
  parameter) stops getting str | None. Fixes the basedpyright error
  surfaced by extracting the inline expression.
- Move the full SHOW DATABASES name list to DEBUG; INFO keeps the count.
  Addresses Copilot's note about INFO log volume on accounts with many
  databases.
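
A small sketch of the formatting-cost point. `CountingList` and the logger name are hypothetical; the class only counts how often the payload is rendered to a string.

```python
import logging


class CountingList:
    """Stand-in for a large payload; counts how often it is rendered."""

    def __init__(self, names):
        self.names = names
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return ", ".join(self.names)


logger = logging.getLogger("snowflake_logging_sketch")
logger.setLevel(logging.INFO)  # DEBUG records are suppressed

databases = CountingList(["SALES", "FINANCE"])

# %-style: the argument is only rendered if the record is actually emitted.
logger.debug("SHOW DATABASES returned: %s", databases)
renders_after_lazy_call = databases.renders

# f-string: the payload is rendered before logging is even consulted.
message = f"SHOW DATABASES returned: {databases}"
renders_after_fstring = databases.renders
```

With the logger at INFO, the suppressed `debug` call leaves the render count untouched, while building the f-string renders the payload regardless of level.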
2026-05-21 14:55:02 +02:00
IceS2
125b73b2a9
fix(powerbi): flush sink buffer before lineage resolution (#28308)
* fix(powerbi): flush sink buffer before lineage resolution

Adds a Barrier sentinel record that, when yielded from a source,
triggers a synchronous flush of MetadataRestSink's bulk buffer.
PowerBI's yield_dashboard_lineage override yields a Barrier before
delegating to super(), so that target-entity lookups via get_by_name
resolve against committed entities instead of returning None for
items still in the sink's bulk buffer.

Effect: intra-workspace and backward-cross-workspace lineage is
captured on the first ingestion run instead of requiring a re-run.
Forward cross-workspace lineage (target in a workspace not yet
scanned in the current run) remains a separate concern.
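
The barrier mechanism can be sketched as follows. This is an illustration only: `BufferingSink` is a hypothetical simplification, not MetadataRestSink, and the `reason` field on `Barrier` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Barrier:
    """Sentinel record: carries no entity, only asks the sink to flush."""

    reason: Optional[str] = None


@dataclass
class BufferingSink:
    batch_size: int = 100
    buffer: List[Any] = field(default_factory=list)
    committed: List[Any] = field(default_factory=list)

    def write(self, record: Any) -> None:
        if isinstance(record, Barrier):
            self.flush()  # synchronous flush before anything downstream runs
            return
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Stand-in for the bulk API call that commits buffered entities.
        self.committed.extend(self.buffer)
        self.buffer.clear()


sink = BufferingSink()
sink.write("dashboard-1")
sink.write("datamodel-1")
pending_before_barrier = list(sink.buffer)
sink.write(Barrier(reason="before lineage resolution"))
```

After the barrier, both records are in `committed`, so a lookup made while building lineage would see them.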

* chore(powerbi): satisfy ruff + basedpyright on barrier changes

- Use PEP 604 `str | None` in the new Barrier dataclass (UP045).
- Add explicit strict= to zip() in the new test (B905).
- write_barrier returns Either[Entity] to match _flush_buffer.
- Suppress the basedpyright Either(...) reportCallIssue false positive
  (same pattern baselined ~1700x) and the rule-less generator-return
  variance on the defensive Barrier register.
- ruff format the touched test files.
2026-05-21 14:54:35 +02:00
IceS2
a06f64b4ce
fix(ingestion): cap kubernetes client below 36.0.0 (#28331)
kubernetes==36.0.0 regressed in-cluster authentication, breaking the
KubernetesSecretsManager in the hybrid runner.

Mechanism:
- Configuration.auth_settings() in v36 looks up the bearer token under
  api_key['BearerToken'], but load_incluster_config() / load_kube_config()
  still write it to api_key['authorization']. The mismatch means no
  Authorization header is sent — the API treats the request as
  system:anonymous and returns 403 Forbidden when reading the secret.
  The caller surfaces this as "password authentication failed."

Proof it's the client, not env/RBAC:
- curl with the same mounted SA token returns 200.
- kubernetes 35.0.0 works; 36.0.0 doesn't.

Upstream is open and unfixed:
- https://github.com/kubernetes-client/python/issues/2582
- https://github.com/kubernetes-client/python/issues/2584

The previous unbounded `>=21.0.0` pin caused the post-2026-05-19 image
build to pull 36.0.0. Capping to <36 keeps us on the working 35.x line
and guards against further 36.x regressions until upstream ships a
patch — at which point this becomes `!=36.0.0` or a fixed `>=36.x`.
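
For illustration, the effect of the cap on plain `X.Y.Z` versions can be sketched as below, assuming the resulting specifier is equivalent to `>=21.0.0,<36.0.0`. The helper names are hypothetical and the parsing ignores pre-release and local version suffixes.

```python
def parse_version(version: str) -> tuple:
    """Parse a plain X.Y.Z version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies_cap(version: str) -> bool:
    """True when the version falls inside the assumed >=21.0.0,<36.0.0 range."""
    return (21, 0, 0) <= parse_version(version) < (36, 0, 0)
```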
2026-05-21 14:45:26 +02:00
Chirag Madlani
e77a27de7b
feat(ui): add ascending and descending icons, update SortingDropDown … (#28297)
* feat(ui): add ascending and descending icons, update SortingDropDown component, and localize tool label

* fix tests

* fix styling

* fix icon

* fix styling

* fix failing tests

* fix playwright tests

* fix tests
address copilot comments

* revert DatabaseClass changes
fix advanceSearch label issue

* fix tests
2026-05-21 18:13:02 +05:30
Harshit Shah
06d0b48a16
fix(e2e): remove stacked test.slow() from Pipeline Alert and lazy-init entities in beforeAll (#28333)
`Pipeline Alert` called `test.slow()` inside the test body. Combined with
any outer timeout multiplier this produced a 9-minute effective timeout
instead of the expected ~3 minutes. Removed the redundant call.

Entity declarations (`table1`, `table2`, `pipeline`, `domain`) were
module-level `const` constructed at import time. This caused entity
IDs to be frozen before `beforeAll` ran, leading to stale or empty
`fullyQualifiedName` values when observability creation details were
built. Moved declarations to `let` and initialised them inside `beforeAll`
so they always reflect the actual API response.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 12:29:42 +00:00
Harshit Shah
6480f46ead
fix(data-quality): add search to column selection dropdowns in test case form (#28323)
Add `showSearch` to the two column `Select` dropdowns in `ParameterForm`
that previously had no search capability:
- Partition column select (tableRowInsertedCountToBeBetween / columnName)
- Generic column select (data.name === 'column')

Fixes #28303
2026-05-21 12:24:40 +00:00
Mohit Yadav
dfc51c57ab
Add Tests field back (#28324)
* Add Tests field back

* test(dq): address review feedback on TestSuiteListAfterReindex spec

Trim the list query to the params that drive the exists(tests) filter
and assert the exact 200 status from the async reindexEntities endpoint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 14:16:34 +02:00
Akash Verma
14412e411a
fix(security): upgrade Apache Airflow to 3.2.1 (#28101)
* fix(security): upgrade Apache Airflow to 3.2.1 and Flask to 3.1.3 to resolve CVEs

* Fix: Gitar bot comments and failing dependency requirement

* Fix: Failing tests, pycheckstyle and gitar comment

* Fix: Remove changes not needed after rebasing with main

* Fix: Airflow-api-tests failing due to 'Can't append to data files in parallel mode.'

---------

Co-authored-by: Akash Verma <akashverma@Akashs-MacBook-Pro-2.local>
Co-authored-by: IceS2 <pablo.takara@getcollate.io>
2026-05-21 17:18:55 +05:30
Sid
533813d84f
fix(playwright): drive SSO refresh test via real nav, not no-op click (#28322)
Test #8 ("should queue concurrent 401s behind a single refresh call") was
passing for the wrong reason. The click on app-bar-item-explore was a
no-op (NavLink to the same route), but the document.body click listener
in AppContainer.tsx was firing analytics.track on every click anywhere —
that PUT to /api/v1/analytics/web/events/collect 401'd, was caught by
the in-app axios interceptor, refreshed, and the test happened to see a
200 on /api/v1/auth/refresh.

PR #28232 removed that global click listener (it was dead-ended analytics
surface area). The hidden trigger is gone, so test #8 has no actual
authenticated request to 401 — waitForResponse for /refresh times out.

Fix: navigate the user away from /explore (back to /my-data) before the
expiry wait, so the subsequent app-bar-item-explore click is a real
route change that fires the page's API calls and 401s through the
in-app refresh path the test name promises.

Co-authored-by: Siddhant <siddhant@MacBook-Pro-751.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:55:01 +00:00
Sriharsha Chintalapani
7042153e32
feat(context-center): search indexing + vector body text for memories/pages (#28314)
* feat(context-center): enable search indexing + vector body text for memories/pages

ContextMemory was indexable in the schema layer but supportsSearch was false,
so live indexing and bulk reindex did not include it. Vector embeddings for
ContextMemory and Page fell through to the default description-only body text
extractor, which produced near-empty embeddings since the actual content lives
in title/question/answer (ContextMemory) and displayName/page payload (Page).

Changes:
- Add ES/OS index mapping for context_memory_search_index across en/ru/zh/jp
- Register contextMemory in indexMapping.json with parentAliases=[all]
- ContextMemoryIndex (TaggableIndex) flattens shareConfig into visibility +
  sharedWithIds, normalizes source UUIDs, and populates entity refs with
  display names
- Wire SearchIndexFactory.buildIndex() + flip ContextMemoryRepository
  supportsSearch=true so create/update/delete fire live indexing
- Flip supportsSearchIndex=true in ContextMemoryIT to inherit BaseEntityIT's
  4 search-index tests
- ContextMemoryBodyTextContributor concatenates title/summary/question/answer/
  description for the vector embedding instead of just description
- PageBodyTextContributor adds title (displayName) and, for QuickLink pages,
  the destination URL alongside the markdown description
- Register both contributors via static initializers in their owning
  EntityRepositories, per the VectorBodyTextContributor convention

Tests: 25 new unit tests across ContextMemoryIndexTest (10),
ContextMemoryBodyTextContributorTest (6), PageBodyTextContributorTest (9).
All passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(context-center): address Copilot review feedback on indexing PR

- PageBodyTextContributor: fall back to page.getName() when displayName is
  null/blank so vectors always have a title (matches the convention in
  SearchIndex.populateCommonFields)
- PageBodyTextContributor: log the exception object (not e.getMessage()) so
  the stack trace is available when debug logging is on
- ContextMemoryIndex: null-guard each principal entry in shareConfig.sharedWith
  before dereferencing, so a malformed payload cannot NPE the indexer

Added 2 tests covering both behaviors; existing tests adjusted for the new
title-fallback default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 12:41:02 +02:00
Harshit Shah
82cdd6b613
Remove unused UI files (#28325)
2026-05-21 15:58:52 +05:30
Harsh Vador
2b66a54bb4
Fix ws Dependabot vulnerability in UI (#28320)
2026-05-21 15:42:11 +05:30
Harshit Shah
425e2c6f1b
fix(e2e): wait for Resolved incident to be ES-indexed before DQ tab assertion (#28326)
* fix(e2e): wait for Resolved incident to be indexed before DQ tab assertion

After posting the Resolved status for the uniqueness test case, the test
immediately navigated to the DQ tab and asserted the incident status
was "Resolved". Because there was no ES indexing wait after the status
transition, the assertion raced against the indexer and saw "New".

Extended `waitForIncidentToBeIndexed` with an optional `expectedStatus`
param so callers can block until a specific resolution status appears in
the API. Used it in the beforeAll hook right after the POST to Resolved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix checkstyle

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 09:36:04 +00:00
Harshit Shah
1335ce80e6
fix(kpi-widget): add space between day and month in X-axis date format (#28318)
Change date format from 'dMMM, yy' to 'd MMM, yy' so axis ticks
render as '30 Apr, 26' instead of '30Apr, 26'.
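
The same tick format can be reproduced in Python for comparison. This is only an analogue: the widget uses a JavaScript date-format pattern, `format_axis_tick` is a hypothetical name, and `%b` assumes the default English locale.

```python
from datetime import date


def format_axis_tick(day: date) -> str:
    """Render a date as unpadded day, space, short month, comma, two-digit year."""
    return f"{day.day} {day.strftime('%b, %y')}"
```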

Fixes open-metadata/openmetadata-collate#4184

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 08:13:22 +00:00
Mohit Yadav
04e8ed1a1f
Improve Vector Embedding recalculation logic (#28313)
* Remove Vector Embedding recalculation logic

* Add test

* Address Review Comments

* Address comment

* Address review comments

* Fix VectorEmbeddingIntegartionIT

* More review addressed
2026-05-21 08:26:34 +02:00
Karan Hotchandani
8e619b8485
feat(ui-core): add xs button size and link/anchor Storybook stories (#28257)
* feat(ui-core): add xs button size and link/anchor stories

- Add `xs` size variant to Button with `text-xs`, `px-2 py-1`, `gap-0.5`, `rounded-md` tokens
- Update Sizes and IconOnly stories to include xs
- Add LinkColorWithTrailingIcon story (link-color + trailing icon)
- Add AsLink story covering primary/secondary/tertiary buttons rendered as anchor tags, with icon, and disabled state

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ui-core): fix xs button icon sizing and add xs size stories

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style(ui-core): format AsLink story props for readability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 11:50:40 +05:30
Anujkumar Yadav
aee2c69af3
Improvement: Move alert bar to drawer for success and error message (#28312)
* Improvement: Move alert bar to drawer for success and error message

* fix lint issue
2026-05-21 10:23:48 +05:30
Sriharsha Chintalapani
fd8d66c899
perf(glossary): batch related-term hydration via /glossaryTerms/byIds (fix N+1) (#28279)
* perf(glossary): batch related-term hydration via new /glossaryTerms/byIds endpoint

Sentry on release-1-13 flagged an N+1 on the Glossary Term Relations Graph
tab (transaction /glossary/.../relations_graph, p95 ~1.4s) — eight
sequential GET /api/v1/glossaryTerms/{id}?fields=relatedTerms,children,
parent,owners calls at ~180ms each. Two recursive resolution loops in
useOntologyExplorer.ts (loadNextTermPage:658-683 and fetchGraphDataFromDatabase:822-847)
fan out per-Id getGlossaryTermsById calls to hydrate cross-glossary
related terms after the initial paginated load, recursing up to 5 levels
deep. The customer hit a depth-1 cascade that produced ~8+ HTTP round
trips for a single page visit.

Adds GET /v1/glossaryTerms/byIds?ids=u1,u2,...&fields=... that returns a
single hydrated List<GlossaryTerm>, capped at 200 ids per request to
stay well under URL length limits and to isolate a single bad Id to one
batch. Missing/deleted/unauthorized Ids are silently dropped, matching
the old Promise.allSettled semantics so callers don't need to change
their error handling. Both resolution loops now call the batch endpoint
once per BATCH_SIZE (100) chunk instead of fanning out per-Id; depth-1
goes from 8 round trips to 1.

Tests: backend IT covers happy-path, fields hydration, silent-skip of
missing ids, and empty-input semantics. Playwright spec opens the
Relations Graph tab on a term with a cross-glossary relation and
asserts zero per-Id /glossaryTerms/{id}?fields=relatedTerms... requests
fire — failing if anyone re-introduces the resolution N+1. The new
batch endpoint is asserted to be called at least once, evidencing the
new path.
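
The change from per-Id fan-out to chunked batch calls can be sketched as below. The client code is TypeScript; this Python version is an illustration, and `fetch_batch` is a hypothetical callable standing in for one request to the batch endpoint.

```python
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 100  # chunk size used by the client, per the commit message


def chunked(ids: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def hydrate_terms(
    ids: Sequence[str],
    fetch_batch: Callable[[List[str]], List[dict]],
) -> List[dict]:
    """Issue one batch request per chunk instead of one request per id."""
    terms: List[dict] = []
    for chunk in chunked(ids):
        terms.extend(fetch_batch(chunk))
    return terms
```

With eight missing ids this makes a single call, matching the "8 round trips to 1" figure above.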

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): address review feedback on byIds endpoint and N+1 fix

Reviewers (gitar-bot + Copilot) raised 7 findings across the backend, the
client hook, and the Playwright spec. Addressed all:

Backend (GlossaryTermResource.byIds):
- Catch AuthorizationException alongside EntityNotFoundException so a
  single unauthorized id can't 403 the whole batch — matches the
  documented "silently omit missing/unauthorized" contract.
- Switched the OpenAPI response schema from a single GlossaryTerm to
  @ArraySchema so generated clients see the correct array shape.
- Dropped `required = true` on the `ids` query param: the implementation
  tolerates blank/missing (returns []) and the IT pins that behavior, so
  the spec was lying. Description now states the contract explicitly.

Backend tests:
- Added the two negative tests the PR description claimed but the file
  was missing: malformed UUID -> 400 and >200 ids -> 400.

Frontend (useOntologyExplorer resolution loops):
- On a whole-batch failure, set `aborted = true`, break out of the
  current chunk loop, and clear missingIds before the next pass. Before
  this, the same failing batch was silently retried up to
  MAX_RESOLUTION_DEPTH - 1 more times for no benefit.

Playwright (GlossaryRelationsGraphPerf):
- Authenticate the new page via AdminClass.login(page) before the
  request listener attaches; previously `browser.newPage()` + a
  separate `performAdminLogin(browser)` left the test page unauth'd, so
  the request listener never saw the API calls and the spec hung on
  the wait-for-response timeout.
- Fixed the `relatedTerms` JSON-Patch shape (it stores TermRelation
  objects, not bare EntityReferences). With the old shape the relation
  never landed, the resolution loop never fired, byIds was never
  called, and the spec hit a wait-for-response timeout (the 1.2-minute
  retries observed in CI).
- Replaced the dual `/rdf/glossary/graph` OR `/glossaryTerms/byIds`
  signal with byIds-only: for `scope === 'term'` the rdf graph endpoint
  isn't called even when rdfEnabled, so listening for it just added
  flake. Bumped the timeout to 60s for cold-CI runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): IT failures on byIds — wrong JSON path + URL header limit

Two integration tests added in the previous commit failed in CI:

1) getGlossaryTermsByIds_honorsFieldsParam_hydratesRelatedTerms
   The assertion read `relatedTerms.get(0).path("id")` but glossary term
   relatedTerms is a List<TermRelation> serialized as
   {relationType, term: {id, ...}} — the id is nested under `term`, not
   at the top of the array element. Fixed to `.path("term").path("id")`.

2) getGlossaryTermsByIds_tooManyIds_returns400
   Sending 201 UUIDs in a query string puts the URL at ~7.5 KB which
   trips Jetty's 8 KB request-header limit; the request was rejected
   with 431 Request Header Fields Too Large before reaching the
   server-side cap check, so the test never saw the documented 400.

   Two-part fix:
     - Lower MAX_BATCH_BY_IDS from 200 to 100. 100 * 37 chars per UUID +
       separators is ~3.7 KB, well below 8 KB. This also matches the
       client's BATCH_SIZE in useOntologyExplorer.ts (so the client can
       now use the whole window without hitting the cap defensively).
     - Test uses 101 ids (still tiny URL) so the cap check actually
       fires and returns the documented 400.
   Updated Javadoc and the client-side BATCH_SIZE comment to reflect
   the new alignment.
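
The size estimate for the `ids` parameter can be checked with a few lines of arithmetic. This sketch counts only the id list itself (36 characters per UUID plus one separator); the request path and the other headers add to the total that the header limit applies to.

```python
import uuid

UUID_TEXT_LENGTH = len(str(uuid.uuid4()))  # canonical hyphenated form
PER_ID_BYTES = UUID_TEXT_LENGTH + 1  # one separator per id
HEADER_LIMIT_BYTES = 8 * 1024  # the 8 KB limit cited in the commit


def ids_query_bytes(count: int) -> int:
    """Approximate size of the comma-separated id list for `count` UUIDs."""
    return count * PER_ID_BYTES


size_at_old_cap_plus_one = ids_query_bytes(201)
size_at_new_cap = ids_query_bytes(100)
size_in_new_test = ids_query_bytes(101)
```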

The Python failure on test_validations_datalake.py reproduces across
3.10/3.11/3.12 in the same parameterized case and is unrelated to the
glossary changes in this PR — pre-existing on main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): address two new review comments on byIds

1) glossaryAPI.ts comment said the batch endpoint supports up to 200 ids
   but the backend cap was lowered to 100 in the previous commit. Updated
   the comment to match (and link the rationale — 100 keeps the URL
   under Jetty's 8 KB header limit).

2) The two 400-response IT tests asserted `contains("400") || contains
   (<substring>)`, which would let a 500 with "invalid" or "too many"
   in the response body silently pass. Tightened both to require BOTH
   the HTTP 400 status AND the expected message substring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): assert HTTP 400 by exception type, not message substring

The previous tightening required `e.getMessage().contains("400")`, but
the SDK's OpenMetadataHttpClient.handleErrorResponse puts ONLY the
error body in the exception message (no "HTTP 400" prefix in the parsed
path), so the assertion failed on real 400 responses with bodies like
"ids parameter contains an invalid UUID" / "Too many ids: 101 (max 100)".

Use assertThrows(InvalidRequestException.class, ...) instead — the SDK
throws InvalidRequestException ONLY for HTTP 400 (other statuses surface
as ApiException or status-specific subclasses like ForbiddenException),
so the type assertion locks the status code as strongly as a body
substring check would. Substring check stays for the body content.

Removes the no-longer-used `Assertions.fail` static import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(glossary): address four review comments on byIds

Backend (GlossaryTermResource.java):
- Add includeRelations query param for parity with GET /{id} so callers
  can pass per-relation include controls (owners:non-deleted, etc.)
  through the batch path. Forwards to the existing 6-arg getInternal.
- Add a catch (RuntimeException) per-id so unexpected failures
  (validation, downstream 5xx surfaced as WebApplicationException, etc.)
  don't fail the whole batch. EntityNotFoundException /
  AuthorizationException stay on debug; the broader catch logs at warn
  so a real bug isn't silently swallowed.

Frontend (useOntologyExplorer.ts):
- Extract the duplicated related-term resolution loop into a top-level
  `resolveRelatedTerms(terms)` helper. Both call sites
  (loadNextTermPage, fetchGraphDataFromDatabase) now do
  `await resolveRelatedTerms(...)`, ~100 fewer LoC and no risk of the
  two implementations drifting.
- Change the failure semantics from "abort the whole resolution on
  first batch failure" to "remember the failed Ids in a skip set, keep
  going". This restores best-effort hydration (matching the old
  Promise.allSettled behavior on the client) without falling back into
  the gitar-bot-flagged retry-the-same-batch-MAX_DEPTH-times footgun:
  the skip set causes collectMissingRelatedTermIds to never hand the
  same Ids back on subsequent depth passes.
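
The skip-set semantics can be sketched as below. The real helper lives in the TypeScript hook; this Python version is an assumption-laden illustration in which terms are reduced to a mapping of id to related ids, and `fetch_batch` is a hypothetical callable that may raise for a whole batch.

```python
from typing import Callable, Dict, List, Set

MAX_RESOLUTION_DEPTH = 5


def resolve_related_terms(
    terms: Dict[str, List[str]],
    fetch_batch: Callable[[List[str]], Dict[str, List[str]]],
    batch_size: int = 100,
) -> Dict[str, List[str]]:
    """Best-effort hydration: a failed batch is skipped, never retried."""
    known = dict(terms)
    skipped: Set[str] = set()
    for _ in range(MAX_RESOLUTION_DEPTH):
        # Related ids we have not loaded and have not already given up on.
        missing = sorted(
            {
                related_id
                for related in known.values()
                for related_id in related
                if related_id not in known and related_id not in skipped
            }
        )
        if not missing:
            break
        for start in range(0, len(missing), batch_size):
            chunk = missing[start:start + batch_size]
            try:
                known.update(fetch_batch(chunk))
            except Exception:
                skipped.update(chunk)  # remember the failure and keep going
    return known
```

Because skipped ids are filtered out when collecting the next pass, a failing batch costs one request rather than one per depth level.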

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(glossary): pin cross-glossary relation surfacing in graph + list paths

User reported the glossary graph endpoint returning partial data (focused
term + a sibling, both marked glossaryTermIsolated, edges = []) when the
focused term has cross-glossary related terms. Adds two ITs that
exercise the exact scenarios.

1) GlossaryTermResourceIT#listGlossaryTerms_hydratesCrossGlossaryRelatedTerms
   Hits GET /v1/glossaryTerms?glossary=<A>&fields=relatedTerms with a
   focused term in glossary A pointing at a related term in glossary B.
   Asserts BOTH the single-entity GET and the bulk-list endpoint return
   the cross-glossary relation — pins the bulk hydration path
   (GlossaryTermRepository.setFieldsInBulk -> fetchAndSetRelatedTerms)
   against the single-entity path (setFields -> getRelatedTerms) as
   producing the same shape.

2) RdfGlossaryGraphIT#crossGlossaryRelationSurfacesAsEdgeAndNodeInScopedGraph
   Creates focused (in glossary A) + relatedAcross (in glossary B),
   adds the relation, hits GET /v1/rdf/glossary/graph?glossaryId=<A>,
   asserts the response contains: focused NOT marked
   glossaryTermIsolated, relatedAcross as a secondary node, edge
   between them.

Both tests pass against current main on a clean DB, which means the
user's reported failure mode does not reproduce from a freshly seeded
fixture. Likely causes:
  - Stale RDF data in the user's deployment (Fuseki has the terms but
    is missing the relation triples; SPARQL returns nodes but 0 edges
    and the nodes.isEmpty() fallback in RdfRepository.java:1618
    doesn't fire because nodes ARE present)
  - The deployment may be on a code version that predates the canonical
    relation storage fix from PR #25886
  - User-specific relation type / glossary structure not captured here

The tests now stand as regression guards: any future change that
breaks cross-glossary surfacing on either path will fail CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert "test(glossary): pin cross-glossary relation surfacing in graph + list paths"

This reverts commit 460459b00c.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 17:59:19 -07:00
Sriharsha Chintalapani
09af9fc801
Fixes #4003: bulk + async restore for large entity hierarchies (#27997)
* fix(restore): bulk + async restore for large entity hierarchies

EntityRepository.restoreEntity walked descendants synchronously, taking
4+ minutes on a 12k-table database and exceeding typical proxy timeouts.
restoreChildren now groups CONTAINS children by type and dispatches one
bulkRestoreSubtree per type, batching DB writes, version history,
change events, and cache invalidation; the existing ES cascade handles
descendant index updates in one update_by_query.
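
A rough sketch of the per-type grouping described above, in Python for brevity. The `restore_children` and `bulk_restore` names here are illustrative placeholders and do not reproduce the Java signatures in EntityRepository.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def restore_children(
    children: Iterable[Tuple[str, str]],
    bulk_restore: Callable[[str, List[str]], None],
) -> Dict[str, List[str]]:
    """Group (entity_type, entity_id) pairs by type, then restore each group once."""
    by_type: Dict[str, List[str]] = defaultdict(list)
    for entity_type, entity_id in children:
        by_type[entity_type].append(entity_id)
    # One bulk call per entity type instead of one call per child.
    for entity_type, ids in by_type.items():
        bulk_restore(entity_type, ids)
    return dict(by_type)
```

The point of the grouping is that the number of restore calls scales with the number of distinct child types rather than with the number of children.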

Adds an async option (?async=true) on the deep-hierarchy restore
endpoints that returns 202 Accepted with a job id and runs the restore
on AsyncService, emitting WebSocket notifications on
restoreEntityChannel. Java SDK adds .restore().async().execute() fluent
builders on Tables/Databases plus restoreServerAsync on
EntityServiceBase; Python SDK mirrors this with
restore_request().with_async().execute() and restore_async() helpers
on BaseEntity, exposing a new AsyncJobResponse type.

Tests: EntityRepositoryRestoreTest verifies the per-type grouping and
bulk dispatch path; RestoreFluentAPITest covers the Java SDK fluent
behavior; RestoreHierarchyIT exercises sync and async restore against a
real DB→schemas→tables tree end-to-end; test_restore_async.py covers
the Python SDK paths.

Fixes #4003
2026-05-20 17:57:40 -07:00
Sriharsha Chintalapani
e41544764b
ingestion: runtime diagnostics subsystem (#28161)
* docs(ingestion): design for runtime diagnostics subsystem

Proposal for an always-available, opt-in (loggerLevel=DEBUG) diagnostics
layer inside the ingestion framework so connector runs that hang, OOM, or
slow down produce enough live evidence to identify the root cause in
`kubectl logs` — without `py-spy`, `kubectl debug`, or ptrace.

Grounded in three concrete production cases:
- The Snowflake "hang" that was actually a logging recursion bug in
  StreamableLogHandler (fixed by PR #28160) but took ~6 hours and one
  wrong-theory fix to identify.
- Recurring OOMKills with no last-state evidence and no way to attribute
  growth to a specific object type or stage.
- "Is it stuck or just slow?" with no way to answer from outside the pod.

The design is gated entirely on the existing `workflowConfig.loggerLevel`
(no new env vars, no new config fields). When off, the module is dead
code. When on (~250 KB / <0.01% CPU), it provides:
- An operation registry of "what each thread is doing right now"
- SIGUSR1 / SIGUSR2 handlers for on-demand dumps to stderr
- A watchdog thread that auto-logs hangs at 60s and auto-dumps at 300s
- A heartbeat thread emitting one structured progress line every 30s
- A memory tracker (RSS / cgroup / GC top-types on dump)
- Stage-backpressure visibility (queue depths between source/processor/sink)
- HTTP introspection of OMetaClient and DB cursor execute()
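
As a minimal sketch of how an operation registry and the watchdog thresholds could fit together (the `operation` and `classify` names and the data layout are hypothetical, and the proposal's actual module may differ):

```python
import threading
import time
from contextlib import contextmanager
from typing import Callable, Dict, List, Tuple

HANG_LOG_SECONDS = 60    # watchdog logs a suspected hang after this long
HANG_DUMP_SECONDS = 300  # watchdog triggers a full dump after this long

_lock = threading.Lock()
_operations: Dict[int, Tuple[str, float]] = {}  # thread id -> (description, start)


@contextmanager
def operation(description: str, clock: Callable[[], float] = time.monotonic):
    """Record what the current thread is doing for the duration of the block."""
    tid = threading.get_ident()
    with _lock:
        _operations[tid] = (description, clock())
    try:
        yield
    finally:
        with _lock:
            _operations.pop(tid, None)


def classify(now: float) -> List[Tuple[int, str, str]]:
    """Return (thread id, description, action) for operations past a threshold."""
    with _lock:
        snapshot = dict(_operations)
    report = []
    for tid, (description, started) in snapshot.items():
        elapsed = now - started
        if elapsed >= HANG_DUMP_SECONDS:
            report.append((tid, description, "dump"))
        elif elapsed >= HANG_LOG_SECONDS:
            report.append((tid, description, "log"))
    return report
```

A watchdog thread would call `classify(time.monotonic())` periodically and act on the result. This sketch tracks only one operation per thread, so nested operations overwrite each other.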
2026-05-20 16:38:09 -07:00
Anujkumar Yadav
6687fd4c3e
Refactor: reflect entity edits on graph nodes without full reload (#28306)
* Refactor: reflect entity edits on graph nodes without full reload

* fix lint changes

* nit

* nit

* fix minor suggestion
2026-05-20 18:06:06 +00:00