mirror of
https://github.com/open-metadata/OpenMetadata
synced 2026-05-24 09:39:11 +00:00
2 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| | 7693a5b04b | Update indexing schedule (#27204) | | |
| | b9d8c08b5b | Refactor(certification): store asset certification in tag_usage table (#26448) | | |

7693a5b04b, full message:

* Update schedule to weekly
* Migration

b9d8c08b5b, full message:

* refactor(certification): store asset certification in tag_usage table

  Previously, asset certification was stored as a JSON blob directly on the entity row. This created a split system where the tag FQN lived in the entity JSON while tag metadata (name, description, style) had to be re-fetched from the tag table on every read. It also meant certification was invisible to the tag_usage propagation pipeline, so renaming a certification tag's FQN left stale data on certified entities.

  Certification is now stored in tag_usage alongside all other tags, using the metadata column to carry expiryDate (added to TagLabelMetadata schema). The entity's certification field remains the input/output surface, but tag_usage is now the source of truth.

  Key changes:

  Storage & retrieval
  - applyCertification() writes the certification tag into tag_usage on store
  - deleteCertificationTag() removes it from tag_usage on clear/replace
  - getCertification() reads from tag_usage filtered by the configured certification classification instead of parsing entity JSON
  - getTags() now strips certification-classification tags so they are surfaced exclusively through getCertification()

  Performance improvements
  - batchFetchCertification() rewritten to a single batch query on tag_usage by FQN hash instead of performing N individual tag lookups

  Tag update handling
  - handleTagEntityUpdate() reads the allowed classification from settings (no longer hardcoded)
  - correctly computes oldFQN on name change so Elasticsearch documents are found and updated using the correct key

  DAO & schema changes
  - deleteTagsByPrefixAndTarget() added to CollectionDAO for targeted certification tag removal
  - TagLabel mappers hardened against unknown metadata fields

  Migrations
  - v1123 migrations backfill existing entity JSON certifications into tag_usage so no data is lost during upgrade

  Tests
  - TagResourceIT updated to assert getCertification() instead of getTags(), since certification tags are intentionally excluded from the tags list

* Update generated TypeScript types
* chore: apply changes

  Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>

* fix(certification): prevent updateTags() from clobbering cert tags written by updateCertification()
* fix(certification): compute tagFQNHash per-segment in Java during migration and make applyCertification idempotent
* Update generated TypeScript types
* Fix: SQL-filtered cert batch fetch, remove double-delete, schema strict mode, ordinal bounds check, migration logging
* Update generated TypeScript types
* Fix Migration
* Fix Migration
* fix(certification): address Copilot review feedback on PR #26448
  - Use exact field name comparison (FIELD_NAME.equals) instead of contains() in SearchRepository to avoid incorrect FQN-rename branch triggers when displayName changes
  - Log previously swallowed exception in getCertificationClassificationFromSettings() to improve observability of certification search propagation failures
  - Fix v1124 migration by building selectedIds inside the insert loop and skipping rows with null tagFQN, preventing UPDATE from removing certifications without corresponding tag_usage entries (avoids silent data loss)
  - Update integration test to rename tag name (not displayName) so it correctly validates the FQN-change regression from #26432 and asserts propagation to entity certification field and search index

* fix(migration): fix v1124 certification migration correctness issues
  - Fix wrong version string in error messages: both mysql and postgres Migration.java logged "v1123" instead of "v1124"
  - Fix potential infinite loop: null-tagFQN rows were excluded from the INSERT but still counted in the return value (rows.size()), so when a full batch of 500 rows all had null tagFQN the loop never terminated. Fix by filtering null tagFQN at SQL level (WHERE tagFQN IS NOT NULL) and returning selectedIds.size() so the loop count reflects rows that were actually migrated

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(certification): fix missing tables in migration and optimize getCertification query
  - Add 6 missing entity tables to v1124 certification migration: file_entity, directory_entity, spreadsheet_entity, worksheet_entity, llm_model_entity, ai_application_entity. All define the certification field in their JSON schema; omitting them caused silent data loss on upgrade (certification stripped from JSON but never written to tag_usage)
  - Replace getCertification() full-tag-fetch with getCertTagsInternalBatch() so single-entity reads issue a targeted WHERE tagFQN LIKE query instead of fetching all tags and filtering in Java (consistent with the bulk path)

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(certification): preserve appliedDate in migration and avoid appliedAt reset on unchanged cert
  - v1124 migration now extracts certification.appliedDate from entity JSON and inserts it as tag_usage.appliedAt, preserving the original certification timestamp instead of defaulting to migration time
  - applyCertification() now checks whether the existing certification tag matches the incoming one before doing delete+reinsert; if unchanged it returns early, preventing appliedAt from being reset on every entity write

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(certification): also compare expiryDate in applyCertification idempotency check

  The previous fix skipped delete+reinsert when tagFQN was unchanged, but this incorrectly swallowed expiryDate updates: re-certifying with the same tag but a new validity period would return early and never write the new expiryDate to tag_usage. Adding Objects.equals(expiryDate) to the guard ensures metadata-only changes are still persisted.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(certification): replace fixed sleeps with Awaitility polling in rename test

  Fixed sleeps are flaky under CI load and always waste time when indexing is faster. Replace both TimeUnit.SECONDS.sleep(2) calls and all subsequent search/entity assertions with Awaitility.await().untilAsserted() blocks (30s timeout, 1s poll interval) so the test waits exactly as long as needed.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(migration): include exception in certification migration warning log

  Pass the exception object to LOG.warn so the stack trace is available for diagnosing production migration failures.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* perf: cache getCertificationClassification() via SettingsCache

  Replace direct SystemRepository DB call with SettingsCache.getSettingOrDefault() (Guava LoadingCache, 3-min TTL) to eliminate repeated DB hits on every certification-related call in EntityRepository.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* skip the test
* Added new column for certification and tier
* nit
* Add test for tier and certification
* fix unit test
* Fix Unit tests
* Move Migrations to 1.12.5 and unit tests
* Fix NPE, batch certification writes, and improve test coverage
  - Guard against null tagLabel in applyCertification to prevent NPE on malformed input
  - Replace per-entity applyCertification loop in storeRelationshipsInternal with applyCertificationBatch, reducing 3N DB calls to 2 (one batch DELETE + one batch INSERT via existing applyTagsBatchMultiTarget)
  - Add deleteTagsByPrefixAndTargets to TagUsageDAO as the batch variant of deleteTagsByPrefixAndTarget
  - Add tests for applyCertificationBatch paths, getTags cert filtering, and TagLabelWithFQNHash.toTagLabel to meet 90% new-code coverage threshold

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add coverage tests for RowMappers, batchFetchCertification, and toTagLabel fallbacks
  - Add TagLabelMapper and TagLabelWithFQNHashMapper tests using mock ResultSet to cover the new metadata-parsing code paths in CollectionDAO
  - Add toTagLabel fallback tests for out-of-bounds enum ordinals covering the defensive conversion logic in TagLabelWithFQNHash
  - Add storeRelationshipsInternal single-entity overload test covering line 2322
  - Add fetchAndSetFields tests to cover batchFetchCertification happy path and exception fallback path

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* resolved the linting issue
* nit
* fix lint issue

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Anujkumar Yadav <anujf0510@gmail.com>
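
The idempotency guard described in the applyCertification() fixes of b9d8c08b5b can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the class `CertificationGuardSketch`, the `CertRow` type, the in-memory map standing in for tag_usage, and the tag FQN `Certification.Gold` are all assumptions made for the example. It shows only the behaviour the message describes: skip the rewrite when both tagFQN and expiryDate are unchanged, so that appliedAt is not reset, and ignore a null tag.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only: an in-memory map stands in for the tag_usage table.
// Class, field, and method shapes here are assumptions, not OpenMetadata's code.
final class CertificationGuardSketch {

  // One certification row per entity: tag FQN, optional expiry, applied timestamp.
  static final class CertRow {
    final String tagFQN;
    final Long expiryDate; // nullable: a certification may have no expiry
    final long appliedAt;

    CertRow(String tagFQN, Long expiryDate, long appliedAt) {
      this.tagFQN = tagFQN;
      this.expiryDate = expiryDate;
      this.appliedAt = appliedAt;
    }
  }

  private final Map<String, CertRow> tagUsage = new HashMap<>();

  // Returns true when a row was (re)written, false when the call was a no-op.
  boolean applyCertification(String entityId, String tagFQN, Long expiryDate, long now) {
    if (tagFQN == null) {
      return false; // malformed input: nothing to apply
    }
    CertRow existing = tagUsage.get(entityId);
    boolean unchanged =
        existing != null
            && existing.tagFQN.equals(tagFQN)
            && Objects.equals(existing.expiryDate, expiryDate);
    if (unchanged) {
      return false; // early return keeps the original appliedAt
    }
    // Replace semantics; the real table would see a delete followed by an insert.
    tagUsage.put(entityId, new CertRow(tagFQN, expiryDate, now));
    return true;
  }

  CertRow get(String entityId) {
    return tagUsage.get(entityId);
  }

  public static void main(String[] args) {
    CertificationGuardSketch store = new CertificationGuardSketch();
    store.applyCertification("table-1", "Certification.Gold", 1000L, 1L);
    store.applyCertification("table-1", "Certification.Gold", 1000L, 2L); // no-op
    System.out.println(store.get("table-1").appliedAt); // prints 1
    store.applyCertification("table-1", "Certification.Gold", 2000L, 3L); // new expiry
    System.out.println(store.get("table-1").appliedAt); // prints 3
  }
}
```

Comparing expiryDate with `Objects.equals` rather than `==` or `.equals` handles the case where either side is null, which matters if a certification can be applied without an expiry.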