OpenMetadata/bootstrap/sql/migrations/native/1.13.0/postgres/postDataMigrationSQLScript.sql

UPDATE ingestion_pipeline_entity
SET json = (json::jsonb #- '{sourceConfig,config,computeMetrics}')::json
WHERE json::jsonb -> 'sourceConfig' -> 'config' -> 'computeMetrics' IS NOT NULL
AND pipelineType = 'profiler';
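-- Optional post-migration check (a sketch, not part of the original migration).
-- It reuses the table, column, and JSON path from the UPDATE above and counts
-- profiler pipelines that still carry sourceConfig.config.computeMetrics.
-- The count should be 0 once the UPDATE has run.
SELECT COUNT(*) AS remaining_compute_metrics
FROM ingestion_pipeline_entity
WHERE json::jsonb #> '{sourceConfig,config,computeMetrics}' IS NOT NULL
AND pipelineType = 'profiler';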
Glossary relations (#25886) * Glossary Term Relations * Add GlossaryTerm Relations * Add GlossaryTerm Relations, Add custom relations, onotolgoy explorer * Add Translations * Update generated TypeScript types * Address comments * Address comments * Address comments * Update generated TypeScript types * Update yarn.lock after merging cytoscape dependencies from glossary_relations * fix zoom in and out functionality and added missing translate keys * fix test * Remove unwanted changes * nit * nit * nit * Remove conflict test * nit * fix test * Add test for ontology explorer * New yarn lock and 2.0.0 schema changes missed during merge conflicts * Revamped glossary term relation settings * Refactor code * Addressed comments * nit * Update generated TypeScript types * Java Checkstyle and Yarn lock * Update generated TypeScript types * fix unit test * Remove 2.0.0 migration folders placed at wrong loc * Merge main * fix navigation to relation graph in glossary * fix ontology explorer spec * Added filter support in the data mode * Fix glossary term relation CI failures ### Canonical Relation Storage (GlossaryTermRepository) * Introduced `computeCanonicalRelationType()` to normalize relation direction using UUID ordering (lower UUID is always treated as "from") * Prevents duplicate and inconsistent relation rows when created from either side * Updated `setTermRelations()` and `addRelation()` to store canonical relation types * Fixed `setFields()` read logic: * Invert relation type for `fromRecords` (entity is the TO side) * Keep `toRecords` unchanged * Updated `deleteBidirectionalRelatedTo()` to match canonical storage format * Added `RequestEntityCache.invalidate()` after relation mutations to ensure consistency ### Lazy RDF Resource Initialization * Added `RdfRepository.getInstanceOrNull()` for null-safe access without throwing * Refactored `RdfResource` constructor to avoid eager `RdfRepository.getInstance()` call * Enabled resource registration even when Fuseki is not 
initialized * Introduced lazy getters: * `getRdfRepository()` * `getSemanticSearchEngine()` * Updated all endpoints to guard with null checks before `isEnabled()` * Return `503 Service Unavailable` when RDF is not ready ### Graceful Test Degradation (Fuseki-dependent tests) * Added `TestSuiteBootstrap.isFusekiEnabled()` to detect Fuseki availability * `GlossaryOntologyExportIT`: * Falls back to Testcontainers-based local Fuseki when bootstrap Fuseki is unavailable * `GlossaryTermRelationIT`: * Skipped via `assumeTrue` when Fuseki is unavailable * `MetricResourceIT`: * Skips RDF-specific tests when Fuseki is unavailable * fix package conflicts * nit * Fix merge conflicts, Python test, RDF reliability, and VectorDocBuilder tests - Fix Python test_patch_glossary_term_related_terms to use TermRelation instead of EntityReferenceList (schema changed relatedTerms type) - Rewrite VectorDocBuilder tests for current buildEmbeddingFields API - Improve JenaFusekiStorage retry logic to retry on all HTTP errors - Increase Fuseki tmpfs size to prevent disk space exhaustion in tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix pycheck * Address all 8 PR review findings 1. Add authorization check on getTermRelationGraph endpoint 2. Add null guard on getBaseUri() to prevent NPE 3. Add React key prop on RelatedTermTagButton in map renders 4. Mark RdfResource lazy-init fields as volatile for thread safety 5. Replace exception messages with generic errors in API responses 6. Unify DEFAULT_RELATION_TYPES between CSV and repository (10 types) 7. Add jitter backoff to deadlock retry in CollectionDAO 8. 
Replace N+1 queries in prefetchGraphTerms with batch fetch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix Fuseki tmpfs exhaustion and GlossaryTermRelationIT double init - Remove tmpfs size limit on Fuseki container to prevent disk exhaustion - Guard RdfUpdater.initialize() in GlossaryTermRelationIT to skip if already initialized by bootstrap Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix duplicate edges, null term NPE, and silent exception in graph builder - Deduplicate edges in buildGraph() using edgesSeen set - Skip TermRelation entries with null term references to prevent NPE - Add warning log when glossary term relation settings fail to load Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix cardinality count after canonical swap and double-checked locking - getRelationCount now matches inverse relation type for fromRecords where the term is the target, fixing cardinality bypass after bidirectional UUID canonicalization - Use double-checked locking in RdfResource.getSemanticSearchEngine() to prevent duplicate instance creation under concurrency Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: anuj-kumary <anujf0510@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ram Narayan Balaji <ramnarayanb3005@gmail.com> Co-authored-by: Ram Narayan Balaji <81347100+yan-3005@users.noreply.github.com>
2026-03-18 05:21:03 +00:00
-- Migrate existing glossary term RELATED_TO relationships to include relationType
-- For backward compatibility, existing relations without a relationType are set to "relatedTo"
UPDATE entity_relationship
SET json = jsonb_set(COALESCE(json::jsonb, '{}'::jsonb), '{relationType}', '"relatedTo"')
WHERE fromentity = 'glossaryTerm'
AND toentity = 'glossaryTerm'
AND relation = 15
AND (json IS NULL OR json::jsonb->>'relationType' IS NULL);
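-- Optional post-migration check (a sketch, not part of the original migration).
-- Groups glossaryTerm-to-glossaryTerm RELATED_TO rows (relation = 15) by their
-- relationType. The '<missing>' label is only a placeholder chosen for this query;
-- after the backfill above no rows should fall into that bucket.
SELECT COALESCE(json::jsonb->>'relationType', '<missing>') AS relation_type,
       COUNT(*) AS relation_count
FROM entity_relationship
WHERE fromentity = 'glossaryTerm'
AND toentity = 'glossaryTerm'
AND relation = 15
GROUP BY 1
ORDER BY 2 DESC;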
-- Insert default glossary term relation settings if they don't exist
-- This preserves any existing user customizations
INSERT INTO openmetadata_settings (configtype, json)
SELECT 'glossaryTermRelationSettings', '{"relationTypes":[{"name":"relatedTo","displayName":"Related To","description":"General association between terms that are conceptually connected.","rdfPredicate":"https://open-metadata.org/ontology/relatedTo","isSymmetric":true,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"associative","isSystemDefined":true,"color":"#1890ff"},{"name":"synonym","displayName":"Synonym","description":"Terms that have the same meaning and can be used interchangeably.","rdfPredicate":"http://www.w3.org/2004/02/skos/core#exactMatch","isSymmetric":true,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"equivalence","isSystemDefined":true,"color":"#722ed1"},{"name":"antonym","displayName":"Antonym","description":"Terms that have opposite meanings.","rdfPredicate":"https://open-metadata.org/ontology/antonym","isSymmetric":true,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"associative","isSystemDefined":true,"color":"#f5222d"},{"name":"broader","displayName":"Broader","description":"A more general term (hypernym).","inverseRelation":"narrower","rdfPredicate":"http://www.w3.org/2004/02/skos/core#broader","isSymmetric":false,"isTransitive":true,"isCrossGlossaryAllowed":true,"category":"hierarchical","isSystemDefined":true,"color":"#597ef7"},{"name":"narrower","displayName":"Narrower","description":"A more specific term (hyponym).","inverseRelation":"broader","rdfPredicate":"http://www.w3.org/2004/02/skos/core#narrower","isSymmetric":false,"isTransitive":true,"isCrossGlossaryAllowed":true,"category":"hierarchical","isSystemDefined":true,"color":"#85a5ff"},{"name":"partOf","displayName":"Part Of","description":"This term is a part or component of another term.","inverseRelation":"hasPart","rdfPredicate":"https://open-metadata.org/ontology/partOf","isSymmetric":false,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"hierarchical","isSystemDefined":true,"color":"#13c2c2"},{"name":"hasPart","displayName":"Has Part","description":"This term has the other term as a part or component.","inverseRelation":"partOf","rdfPredicate":"https://open-metadata.org/ontology/hasPart","isSymmetric":false,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"hierarchical","isSystemDefined":true,"color":"#36cfc9"},{"name":"calculatedFrom","displayName":"Calculated From","description":"This term/metric is calculated or derived from another term.","inverseRelation":"usedToCalculate","rdfPredicate":"https://open-metadata.org/ontology/calculatedFrom","isSymmetric":false,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"associative","isSystemDefined":true,"color":"#faad14"},{"name":"usedToCalculate","displayName":"Used To Calculate","description":"This term is used in the calculation of another term.","inverseRelation":"calculatedFrom","rdfPredicate":"https://open-metadata.org/ontology/usedToCalculate","isSymmetric":false,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"associative","isSystemDefined":true,"color":"#ffc53d"},{"name":"seeAlso","displayName":"See Also","description":"Related term that may provide additional context.","rdfPredicate":"http://www.w3.org/2000/01/rdf-schema#seeAlso","isSymmetric":true,"isTransitive":false,"isCrossGlossaryAllowed":true,"category":"associative","isSystemDefined":true,"color":"#eb2f96"}]}'::jsonb
WHERE NOT EXISTS (
SELECT 1 FROM openmetadata_settings WHERE configtype = 'glossaryTermRelationSettings'
);
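-- Optional inspection query (a sketch, not part of the original migration).
-- Expands the relationTypes array of the settings row into one row per type so
-- the stored defaults, or any existing user customizations, can be reviewed.
-- The column aliases and the "rt" alias are illustrative only.
SELECT rt->>'name' AS name,
       rt->>'category' AS category,
       rt->>'inverseRelation' AS inverse_relation,
       rt->>'isSymmetric' AS is_symmetric
FROM openmetadata_settings s
CROSS JOIN LATERAL jsonb_array_elements(s.json::jsonb -> 'relationTypes') AS rt
WHERE s.configtype = 'glossaryTermRelationSettings';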
fix: strip stale relatedTerms from glossary_term_entity JSON to fix 500 on listAfter (#26586) * fix: strip stale relatedTerms from glossary_term_entity JSON to fix 500 on listAfter Pre-1.13.0, relatedTerms was stored as EntityReference[] directly in the glossary_term_entity JSON column. PR #25886 changed relatedTerms to TermRelation[] and moved storage to entity_relationship table, but missed adding a migration to clean up the old EntityReference data still present in existing rows. When listAfter() deserializes the entity JSON, Jackson fails with: UnrecognizedPropertyException: Unrecognized field "id" (class TermRelation) The existing migration already backfilled entity_relationship rows with relationType="relatedTo", so stripping relatedTerms from entity JSON is safe: the data is already in entity_relationship and will be loaded from there. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Ram Narayan Balaji <81347100+yan-3005@users.noreply.github.com>
2026-03-20 04:59:26 +00:00
-- Strip stale relatedTerms from glossary term entity JSON.
-- relatedTerms is now loaded from entity_relationship table, not from entity JSON.
-- Old data stored relatedTerms as EntityReference objects which fail to deserialize as TermRelation.
UPDATE glossary_term_entity
SET json = (json::jsonb - 'relatedTerms')::json
WHERE jsonb_exists(json::jsonb, 'relatedTerms');
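-- Optional post-migration check (a sketch, not part of the original migration).
-- Counts glossary terms whose entity JSON still has a top-level relatedTerms key;
-- the count should be 0 once the UPDATE above has run.
SELECT COUNT(*) AS terms_with_stale_related_terms
FROM glossary_term_entity
WHERE jsonb_exists(json::jsonb, 'relatedTerms');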
-- Backfill conceptMappings for existing glossary terms
UPDATE glossary_term_entity
SET json = jsonb_set(COALESCE(json::jsonb, '{}'::jsonb), '{conceptMappings}', '[]'::jsonb)
WHERE json IS NULL OR json::jsonb->'conceptMappings' IS NULL;
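-- Optional post-migration check (a sketch, not part of the original migration).
-- Counts glossary terms that still lack a conceptMappings key, using the same
-- predicate as the backfill above; the count should be 0 after it has run.
SELECT COUNT(*) AS terms_missing_concept_mappings
FROM glossary_term_entity
WHERE json IS NULL OR json::jsonb->'conceptMappings' IS NULL;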