Commit graph

412 commits

Author SHA1 Message Date
Sriharsha Chintalapani
bb0daa180e
RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex (#26902)
* RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex

* Update generated TypeScript types

* Address comments from copilot

* Update generated TypeScript types

* fix test issues

* Fix minor UI bugs

* Add the missing filters

* Fix RDF export API error

* Add export functionality

* Fix ui-checkstyle

* Fix java checkstyle

* Fix unit tests

* Fix and increase the coverage for KnowledgeGraph.spec.ts

* Fix tests

* Remove rdf as default in playwright and local docker

* fix ui-checkstyle

* Address comments

* Potential fix for pull request finding 'CodeQL / Artifact poisoning'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Address copilot comments

* Address copilot comments

* Fix tests

* Fix docker

* Update openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/rdf/distributed/DistributedRdfIndexCoordinator.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Address copilot review comments: license headers, JSON escaping, type safety, border-color, stop semantics

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/c026e52e-162b-4c9a-9874-43791d4aaac1

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* Show error toast for unsupported export format in KnowledgeGraph

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/c026e52e-162b-4c9a-9874-43791d4aaac1

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* Fix docker

* Fix docker for playwright

* Fix docker for playwright

* Fix tests

* Fix tests

* Fix docker

* Fix docker

* Fix glossary and pagination spec flakiness

* update the missing translations

* Fix docker

* Fix docker

* Fix integration test

* Fix fuseki not starting

* Fixed the run local docker script

* worked on comments

* Fix flakiness in knowledge graph tests

* Fix checkstyle

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
2026-04-14 13:24:41 -07:00
Sriharsha Chintalapani
6d99ba2dc0
Glossary relations (#25886)
* Glossary Term Relations

* Add GlossaryTerm Relations

* Add GlossaryTerm Relations, Add custom relations, ontology explorer

* Add Translations

* Update generated TypeScript types

* Address comments

* Address comments

* Address comments

* Update generated TypeScript types

* Update yarn.lock after merging cytoscape dependencies from glossary_relations

* fix zoom in and out functionality and add missing translation keys

* fix test

* Remove unwanted changes

* nit

* nit

* nit

* Remove conflict test

* nit

* fix test

* Add test for ontology explorer

* New yarn lock and 2.0.0 schema changes missed during merge conflicts

* Revamped glossary term relation settings

* Refactor code

* Addressed comments

* nit

* Update generated TypeScript types

* Java Checkstyle and Yarn lock

* Update generated TypeScript types

* fix unit test

* Remove 2.0.0 migration folders placed at the wrong location

* Merge main

* fix navigation to relation graph in glossary

* fix ontology explorer spec

* Added filter support in the data mode

* Fix glossary term relation CI failures

### Canonical Relation Storage (GlossaryTermRepository)

* Introduced `computeCanonicalRelationType()` to normalize relation direction
  using UUID ordering (lower UUID is always treated as "from")
* Prevents duplicate and inconsistent relation rows when created from either side
* Updated `setTermRelations()` and `addRelation()` to store canonical relation types
* Fixed `setFields()` read logic:

  * Invert relation type for `fromRecords` (entity is the TO side)
  * Keep `toRecords` unchanged
* Updated `deleteBidirectionalRelatedTo()` to match canonical storage format
* Added `RequestEntityCache.invalidate()` after relation mutations to ensure consistency
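
The canonicalization described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the class name, the `Stored` record, and the `broader`/`narrower` inverse pairs are hypothetical, and inverting the type when the two ends are swapped is an assumption based on the read-side inversion noted above.

```java
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch; names and the inverse map are illustrative only. */
final class CanonicalRelation {
  // Assumed inverse pairs; types missing from the map are treated as symmetric.
  private static final Map<String, String> INVERSE =
      Map.of("broader", "narrower", "narrower", "broader");

  record Stored(UUID from, UUID to, String type) {}

  static String invert(String type) {
    return INVERSE.getOrDefault(type, type);
  }

  /** Stores the lower UUID as "from", inverting the type when the ends are swapped. */
  static Stored canonicalize(UUID a, UUID b, String typeFromA) {
    return a.compareTo(b) <= 0
        ? new Stored(a, b, typeFromA)
        : new Stored(b, a, invert(typeFromA));
  }
}
```

Under these assumptions, creating the relation from either side yields the same stored row, which is what prevents duplicate or inconsistent rows.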

### Lazy RDF Resource Initialization

* Added `RdfRepository.getInstanceOrNull()` for null-safe access without throwing
* Refactored `RdfResource` constructor to avoid eager `RdfRepository.getInstance()` call
* Enabled resource registration even when Fuseki is not initialized
* Introduced lazy getters:

  * `getRdfRepository()`
  * `getSemanticSearchEngine()`
* Updated all endpoints to guard with null checks before `isEnabled()`

  * Return `503 Service Unavailable` when RDF is not ready
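
A rough sketch of the per-request guard, using plain Java stand-ins instead of the actual resource classes. `LazyRdfGuard`, `Repo`, and the integer status codes are hypothetical, and returning 503 when RDF is disabled (not only when it is uninitialized) is an assumption.

```java
import java.util.function.Supplier;

/** Hypothetical stand-ins; the real resource and repository classes differ. */
final class LazyRdfGuard {
  interface Repo {
    boolean isEnabled();
  }

  static final int OK = 200;
  static final int SERVICE_UNAVAILABLE = 503;

  // Assumed to return null until the RDF backend has been initialized.
  private final Supplier<Repo> lookup;

  LazyRdfGuard(Supplier<Repo> lookup) {
    this.lookup = lookup; // nothing is resolved at construction time
  }

  /** Resolves the repository per request; reports 503 while it is missing or disabled. */
  int handle() {
    Repo repo = lookup.get();
    if (repo == null || !repo.isEnabled()) {
      return SERVICE_UNAVAILABLE;
    }
    return OK;
  }
}
```

Because the lookup is deferred to request time, the resource can be registered before Fuseki is up and starts serving once the repository becomes available.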

### Graceful Test Degradation (Fuseki-dependent tests)

* Added `TestSuiteBootstrap.isFusekiEnabled()` to detect Fuseki availability
* `GlossaryOntologyExportIT`:

  * Falls back to Testcontainers-based local Fuseki when bootstrap Fuseki is unavailable
* `GlossaryTermRelationIT`:

  * Skipped via `assumeTrue` when Fuseki is unavailable
* `MetricResourceIT`:

  * Skips RDF-specific tests when Fuseki is unavailable

* fix package conflicts

* nit

* Fix merge conflicts, Python test, RDF reliability, and VectorDocBuilder tests

- Fix Python test_patch_glossary_term_related_terms to use TermRelation
  instead of EntityReferenceList (schema changed relatedTerms type)
- Rewrite VectorDocBuilder tests for current buildEmbeddingFields API
- Improve JenaFusekiStorage retry logic to retry on all HTTP errors
- Increase Fuseki tmpfs size to prevent disk space exhaustion in tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix pycheck

* Address all 8 PR review findings

1. Add authorization check on getTermRelationGraph endpoint
2. Add null guard on getBaseUri() to prevent NPE
3. Add React key prop on RelatedTermTagButton in map renders
4. Mark RdfResource lazy-init fields as volatile for thread safety
5. Replace exception messages with generic errors in API responses
6. Unify DEFAULT_RELATION_TYPES between CSV and repository (10 types)
7. Add jitter backoff to deadlock retry in CollectionDAO
8. Replace N+1 queries in prefetchGraphTerms with batch fetch
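
For item 7, one common way to add jitter to a retry delay is sketched below. The commit does not say which scheme `CollectionDAO` uses; the exponential growth, the "full jitter" choice, and every name and parameter here are assumptions for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper; assumes small, non-negative baseMs and capMs. */
final class JitterBackoff {
  /** Full-jitter delay: uniform in [0, min(capMs, baseMs * 2^attempt)]. */
  static long delayMs(int attempt, long baseMs, long capMs) {
    // Clamp the shift so the exponential term cannot overflow for small bases.
    long ceiling = Math.min(capMs, baseMs << Math.min(attempt, 20));
    return ThreadLocalRandom.current().nextLong(ceiling + 1);
  }
}
```

Randomizing the delay keeps transactions that deadlocked together from retrying in lockstep and colliding again.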

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix Fuseki tmpfs exhaustion and GlossaryTermRelationIT double init

- Remove tmpfs size limit on Fuseki container to prevent disk exhaustion
- Guard RdfUpdater.initialize() in GlossaryTermRelationIT to skip if
  already initialized by bootstrap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix duplicate edges, null term NPE, and silent exception in graph builder

- Deduplicate edges in buildGraph() using edgesSeen set
- Skip TermRelation entries with null term references to prevent NPE
- Add warning log when glossary term relation settings fail to load
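
The de-duplication and null guard can be sketched as below. Only the `edgesSeen` name comes from the commit message; the `Edge` record and `dedupe` method are hypothetical simplifications of whatever `buildGraph()` actually does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of set-based edge de-duplication. */
final class EdgeDedup {
  record Edge(String from, String to, String type) {}

  /** Keeps the first occurrence of each edge and skips entries with a null endpoint. */
  static List<Edge> dedupe(List<Edge> candidates) {
    Set<Edge> edgesSeen = new HashSet<>();
    List<Edge> edges = new ArrayList<>();
    for (Edge e : candidates) {
      if (e == null || e.from() == null || e.to() == null) {
        continue; // mirrors the "skip null term references" guard
      }
      if (edgesSeen.add(e)) { // add() returns false for an edge already seen
        edges.add(e);
      }
    }
    return edges;
  }
}
```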

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix cardinality count after canonical swap and double-checked locking

- getRelationCount now matches inverse relation type for fromRecords
  where the term is the target, fixing cardinality bypass after
  bidirectional UUID canonicalization
- Use double-checked locking in RdfResource.getSemanticSearchEngine()
  to prevent duplicate instance creation under concurrency
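
For reference, the double-checked locking idiom looks roughly like this in Java. `LazyHolder` is a generic, hypothetical stand-in and not the actual `RdfResource` code; the `volatile` field corresponds to the thread-safety fix listed in the earlier review findings.

```java
import java.util.function.Supplier;

/** Hypothetical generic holder illustrating double-checked locking. */
final class LazyHolder<T> {
  private final Supplier<T> factory;
  // volatile so a fully constructed instance is visible to every thread
  private volatile T instance;

  LazyHolder(Supplier<T> factory) {
    this.factory = factory;
  }

  T get() {
    T local = instance; // first check, without taking the lock
    if (local == null) {
      synchronized (this) {
        local = instance; // second check, under the lock
        if (local == null) {
          local = factory.get();
          instance = local;
        }
      }
    }
    return local;
  }
}
```

The second check is what prevents two threads that both saw `null` from each creating an instance.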

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: anuj-kumary <anujf0510@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ram Narayan Balaji <ramnarayanb3005@gmail.com>
Co-authored-by: Ram Narayan Balaji <81347100+yan-3005@users.noreply.github.com>
2026-03-18 10:51:03 +05:30
Mohit Yadav
21750aaa90
Feature/search indexing issues (#25594)
* Add design doc for search indexing stats redesign

Covers:
- Simplified 4-stage pipeline model (Reader, Process, Sink, Vector)
- Per-entity index promotion instead of batch promotion
- Alias management from indexMapping.json
- Payload-aware vector bulk processor

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add Support for Per Entity Index Promotion

* Add UI Bit

* Add Lang

* Add AppLog View Test coverage

* Add Batched Vector index querying

* Make Vector indexing async and improve stats handling

* Use Virtual Thread

* Use Virtual Thread

* Fix Tests

* Make reading stats easier

* Fixed Stats to be accurate

* Fix Stats getting null

* Fix partition worker stats

* Fix Reader Stats - final

* Update generated TypeScript types

* Make updates in 1.12.0

* Revert "Use Virtual Thread"

This reverts commit 4eb23374d1.

* Revert "Use Virtual Thread"

This reverts commit efe8d03b5d.

* Reapply "Use Virtual Thread"

This reverts commit d59cde18b2.

* Reapply "Use Virtual Thread"

This reverts commit 769e5710c3.

* Fix Final Update on stat

* Add atomic alias swap and remove unnecessary migration

* Fix Sonar test jest

* Fix Final Update on stat

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-29 18:50:39 +05:30
Sriharsha Chintalapani
43f85a8969
Add RDF local dev (#24825)
* Add RDF local dev

* remove doc

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2025-12-15 10:49:13 +01:00
Pere Miquel Brull
c9cffa00db
Update roadmap (#6440)
* remove docs dir

* Update roadmap
2022-07-30 09:40:05 -07:00
Ayush Shah
fc2bd386a6
Clean gitbook from main (#5007) 2022-05-17 23:29:47 +05:30
Shannon Bradshaw
a2151e473d GitBook: [#182] Correct advanced search text 2022-04-10 21:12:00 -07:00
OpenMetadata
dbe6b641ac GitBook: [#179] No subject 2022-04-10 21:12:00 -07:00
OpenMetadata
694eba2799 GitBook: [#178] No subject 2022-04-10 21:12:00 -07:00
Shannon Bradshaw
383bca1315 GitBook: [#177] Fix image for advanced search 2022-04-10 21:12:00 -07:00
OpenMetadata
7a65d27010 GitBook: [#174] Update Kubernetes Docs 2022-04-10 21:11:58 -07:00
OpenMetadata
2a4c894f14 GitBook: [#175] No subject 2022-04-10 21:11:34 -07:00
Shannon Bradshaw
9b7bc505d7 GitBook: [#173] Fix TOC links for snowflake metadata ingestion 2022-04-10 21:11:34 -07:00
Shannon Bradshaw
7b9c48e674 GitBook: [#172] Separate Snowflake UI docs 2022-04-10 21:11:34 -07:00
Shannon Bradshaw
b219e20e3d GitBook: [#168] General cleanup for snowflake metadata ingestion docs 2022-04-10 21:11:34 -07:00
Shilpa V
8618f9c669 GitBook: [#171] Deleting service_type 2022-04-10 21:11:33 -07:00
pmbrull
1baf1bc310 GitBook: [#170] SQLAlchemy constraint 2022-04-10 21:11:33 -07:00
pmbrull
cc794b780b GitBook: [#169] Lineage Airflow 1.10.15 2022-04-10 21:11:33 -07:00
Shilpa V
1df5a4e52e GitBook: [#167] MySQL Updates 2022-04-10 21:11:33 -07:00
Shannon Bradshaw
06ee54e5ca GitBook: [#166] No subject 2022-04-10 21:11:33 -07:00
Shilpa V
971c9aad90 GitBook: [#162] MSSQL updates 2022-04-10 21:11:33 -07:00
Shannon Bradshaw
090ded1fd7 GitBook: [#163] Update Try OpenMetadata in Docker with latest success output messaging 2022-04-10 21:11:33 -07:00
Shilpa V
d9b5197e24 GitBook: [#161] MLflow Updates 2022-04-10 21:11:32 -07:00
Shilpa V
6b2d406439 GitBook: [#160] Glue Updates 2022-04-10 21:11:32 -07:00
Shilpa V
5f2a2ef49b GitBook: [#159] Glue 2022-04-10 21:11:32 -07:00
Shilpa V
d4d291008a GitBook: [#157] Glue Changes 2022-04-10 21:11:32 -07:00
Shannon Bradshaw
2412a436ea GitBook: [#156] Add procedure TOC to BigQuery UI page 2022-04-10 21:11:32 -07:00
Shannon Bradshaw
ef8bae6708 GitBook: [#155] Add BigQuery UI config page 2022-04-10 21:11:32 -07:00
Shannon Bradshaw
0214668096 GitBook: [#154] Fix broken link to Try OpenMetadata in Docker 2022-04-10 21:11:32 -07:00
Shilpa V
cd65905b54 GitBook: [#153] 3 Tab Connector Steps - Changes 2022-04-10 21:11:32 -07:00
Shilpa V
3f5aa6391b GitBook: [#152] Usage - Edits 2022-04-10 21:11:31 -07:00
Shilpa V
ed3def7de2 GitBook: [#151] MSSQL Usage Edits 2022-04-10 21:11:31 -07:00
Shilpa V
123149655a GitBook: [#149] Delta Lake changes 2022-04-10 21:11:31 -07:00
Shilpa V
c605819368 GitBook: [#148] Delta Lake Changes 2022-04-10 21:11:31 -07:00
Shilpa V
346b72b569 GitBook: [#129] New Connectors 2022-04-10 21:11:31 -07:00
OpenMetadata
82bad2cc1f GitBook: [#147] No subject 2022-04-10 21:11:31 -07:00
OpenMetadata
f01e837658 GitBook: [#146] No subject 2022-04-10 21:11:31 -07:00
OpenMetadata
b157766a0f GitBook: [#145] No subject 2022-04-10 21:11:31 -07:00
OpenMetadata
61e0c453d3 GitBook: [#144] No subject 2022-04-10 21:11:30 -07:00
OpenMetadata
64ca190d25 GitBook: [#143] No subject 2022-04-10 21:11:30 -07:00
OpenMetadata
20765f145a GitBook: [#142] No subject 2022-04-10 21:11:30 -07:00
OpenMetadata
837a5a7a04 GitBook: [#141] No subject 2022-04-10 21:11:30 -07:00
OpenMetadata
6287a7435e GitBook: [#140] No subject 2022-04-10 21:11:30 -07:00
OpenMetadata
58d2572ee7 GitBook: [#139] No subject 2022-04-10 21:11:30 -07:00
OpenMetadata
7f770179cf GitBook: [#138] Refactor BigQuery Ingestion Workflow 2022-04-10 21:11:30 -07:00
OpenMetadata
3ff97e6dde GitBook: [#135] Remove tabs for metadata ingestion 2022-04-10 21:11:30 -07:00
OpenMetadata
7926cabda5 GitBook: [#137] No subject 2022-04-10 21:11:29 -07:00
OpenMetadata
67b54e8cf8 GitBook: [#136] No subject 2022-04-10 21:11:29 -07:00
OpenMetadata
21cf98c25c GitBook: [#134] No subject 2022-04-10 21:11:29 -07:00
OpenMetadata
53b44ee19c GitBook: [#132] No subject 2022-04-10 21:11:29 -07:00