* Containers: FQN-driven hierarchy listings + cascade-delete orphan fix
Stops `?root=true&service=...` and `/containers/.../children` from leaking
deeply-nested orphans, fixes the source bug that produced them, and corrects
the 1.13.0 fqnHash pattern index opclass.
Listing path
- ListFilter.getFqnPrefixCondition now binds both <param>Hash and
<param>HashChild ('<hash>.%' and '<hash>.%.%') so depth-aware listings
can require "exactly one segment below the prefix" via a single LIKE +
NOT LIKE pair on fqnHash. Same shape works at any tree depth.
- ContainerDAO.listRoot{Before,After,Count} swap the NOT EXISTS anti-join
on entity_relationship for fqnHash NOT LIKE :serviceHashChild. The FQN
is the canonical hierarchy in OpenMetadata; the relationship table is
no longer consulted for hierarchical listings.
- ContainerRepository.listChildren rewritten: no parent-by-name lookup, no
findToWithOffset/countFindTo on entity_relationship, no second-hop
hydration. Single SQL roundtrip + slim projection via
listDirectChildSummariesByParentHash. Orphans whose parent CONTAINS row
is missing are now correctly placed under their FQN-implied parent.
- Both endpoints honour ?include=non-deleted|all|deleted; ChildrenPageCache
key includes the include tag so toggling the UI Deleted switch doesn't
return a stale page from the other side.
- ContainerResource.listChildren accepts ?include= for parity with the
root listing.
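The LIKE + NOT LIKE pair described above can be illustrated outside the database. This is a minimal sketch, not the ListFilter or ContainerDAO code: `FqnDepthSketch`, `like`, and `directChildren` are hypothetical names, the hashes are shortened placeholders rather than real fqnHash values, and the small `like` helper understands only the `%` wildcard.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class FqnDepthSketch {
  // Minimal stand-in for SQL LIKE that supports only the '%' wildcard.
  static boolean like(String value, String pattern) {
    String[] parts = pattern.split("%", -1);
    StringBuilder regex = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        regex.append(".*");
      }
      regex.append(Pattern.quote(parts[i]));
    }
    return Pattern.matches(regex.toString(), value);
  }

  // Keeps hashes exactly one segment below the prefix, i.e.
  // fqnHash LIKE '<hash>.%' AND fqnHash NOT LIKE '<hash>.%.%'
  static List<String> directChildren(String prefixHash, List<String> fqnHashes) {
    String hash = prefixHash + ".%";
    String hashChild = prefixHash + ".%.%";
    return fqnHashes.stream()
        .filter(h -> like(h, hash) && !like(h, hashChild))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> rows = List.of("svc.a", "svc.a.b", "svc.a.b.c", "svc.d", "other.x");
    System.out.println(directChildren("svc", rows));
    System.out.println(directChildren("svc.a", rows));
  }
}
```

The same two patterns work for a service prefix and for a container prefix, which is why one predicate shape can serve both the root listing and the children listing.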
Cascade-delete orphan source (EntityRepository.processDeletionBatch)
- Removed the redundant pre-batch-delete of relationships and the
swallow-all try/catch in the per-child loop. cleanup() per entity now
owns row removal AND relationship deletion atomically; exceptions
propagate so the loop stops on first failure with per-child atomicity.
Stops the orphan-without-relationships pattern that the listing change
defends against.
Migration correction (1.13.0 postgres fqnHash pattern indexes)
- Recreate 23 idx_*_fqnhash_pattern indexes with text_pattern_ops instead
of varchar_pattern_ops. The planner casts the column to text when the
LIKE RHS is text-typed (every JDBC setString call), so
varchar_pattern_ops doesn't match the resulting (fqnhash)::text ~~
expression. Confirmed via EXPLAIN ANALYZE on a 580k-row table: the same
query drops from ~470ms cold (Parallel Seq Scan) to <1ms (Index Scan).
Tests
- ListFilterTest: 3 unit tests covering both binds, dotted/quoted service
name special-char handling, and include= flowing through alongside the
service prefix.
- ContainerResourceIT: 8 integration tests covering depth correctness at
every level (5-level chain), orphan exclusion at root, orphan
discoverability under FQN-implied parent, sibling subtree isolation,
the include toggle on both endpoints, and large-batch hard-delete
leaving no orphan rows or relationships.
Closes #27870 (subset of its listing-side intent shipped here as a single
FQN-depth predicate; PR's cascade fix and both new tests picked up
verbatim).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Address review comments on #27878
- ContainerDAO.listRoot* override now defaults :serviceHashChild to '%.%.%'
via rootListingParams() when ?service= is absent. Previous code
unconditionally referenced the bind, so ?root=true without a service
filter crashed at runtime with a missing-named-parameter error.
- Migration 1.13.0/postgres/schemaChanges.sql now DROP INDEX CONCURRENTLY
IF EXISTS before each CREATE so already-upgraded environments (which
have the original varchar_pattern_ops indexes) get the index recreated
with text_pattern_ops on next deploy. Fresh installs see the DROP as
a no-op. Comment block updated to record the recreate intent.
- ChildrenPageCache include tag for ALL changed from "all" to "a" so the
CacheKeys.childrenPage Javadoc's "1-2 char" promise holds (now nd/a/d
are all <=2 chars).
- ContainerRepository.includeToBindString Javadoc corrected: it described
the SQL as a CASE expression, but listDirectChildSummariesByParentHash
actually uses a three-branch OR chain.
- ListFilterTest: added test_noServiceFilter_doesNotBindServicePatterns
as a regression guard for the missing-bind bug.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix java style
* Address second review pass on #27878
- EntityRepository.processDeletionBatch wraps per-child cleanup exceptions
with entityType + entityId context before re-throwing. The exception
still propagates (so the loop still stops, failure-semantics contract
unchanged); operators now get a stack trace that names the row that
blocked a large recursive delete instead of an opaque error.
- CacheKeys.childrenPage Javadoc now lists the actual include tags
("nd" / "a" / "d") and points at ChildrenPageCache.includeTag as the
authoritative source. Earlier comment still mentioned "all" after the
switch to single-letter tags.
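To show why the include tag belongs in the cache key, here is a minimal sketch. The tag values ("nd" / "a" / "d") come from the notes above; everything else is an assumption for illustration: `IncludeTagSketch`, the local `Include` enum, and the key layout in `childrenPageKey` are hypothetical and do not reproduce CacheKeys.childrenPage.

```java
final class IncludeTagSketch {
  // Local stand-in for the three values accepted by ?include=.
  enum Include { NON_DELETED, ALL, DELETED }

  // Short tag per include mode, at most two characters each.
  static String includeTag(Include include) {
    switch (include) {
      case ALL:
        return "a";
      case DELETED:
        return "d";
      default:
        return "nd";
    }
  }

  // Hypothetical key layout: the include tag is part of the key, so pages
  // cached for one include mode are never served for another.
  static String childrenPageKey(String parentHash, Include include, int offset, int limit) {
    return "children:" + parentHash + ":" + includeTag(include) + ":" + offset + ":" + limit;
  }

  public static void main(String[] args) {
    System.out.println(childrenPageKey("abc", Include.NON_DELETED, 0, 25));
    System.out.println(childrenPageKey("abc", Include.DELETED, 0, 25));
  }
}
```

Two requests that differ only in the include mode produce different keys, which is the property that prevents a stale page after toggling the Deleted switch.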
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Test: ?root=true without service filter end-to-end (#27878 review)
Adds test_rootListing_withoutServiceFilter_returnsRootsAcrossAllServices
to ContainerResourceIT. Creates two distinct storage services, each with
a root container and a child container, then asserts that GET
/containers?root=true (no service filter):
- Succeeds (rootListingParams() defaults :serviceHashChild to '%.%.%' so
the SQL has its bind even when ListFilter.getServiceCondition didn't
add it).
- Includes root containers from both services (cross-service listing
works without a service prefix narrowing the candidate set).
- Excludes child containers from either service (depth check still
applied via the default bind).
Regression guard for the bug Copilot's review pass flagged at
CollectionDAO.java:784: 'GET /containers?root=true (no service) crashes
at runtime due to a missing named parameter.'
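The defaulting behaviour this test guards can be sketched as follows. This is an illustration under stated assumptions, not the ContainerDAO code: `RootListingParamsSketch` and `withDefaultChildPattern` are hypothetical names, the bind names follow the `<param>Hash` / `<param>HashChild` convention described earlier, and `SEPARATOR` stands in for Entity.SEPARATOR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class RootListingParamsSketch {
  static final String SEPARATOR = "."; // stand-in for Entity.SEPARATOR

  // Always supplies serviceHashChild so SQL that references the bind never
  // sees a missing named parameter, with or without a service filter.
  static Map<String, String> withDefaultChildPattern(Optional<String> serviceHash) {
    Map<String, String> params = new HashMap<>();
    String childPattern =
        serviceHash
            .map(h -> h + SEPARATOR + "%" + SEPARATOR + "%")
            .orElse("%" + SEPARATOR + "%" + SEPARATOR + "%");
    params.put("serviceHashChild", childPattern);
    // The prefix bind is only present when a service filter was given.
    serviceHash.ifPresent(h -> params.put("serviceHash", h + SEPARATOR + "%"));
    return params;
  }

  public static void main(String[] args) {
    System.out.println(withDefaultChildPattern(Optional.empty()));
    System.out.println(withDefaultChildPattern(Optional.of("abc")));
  }
}
```

With no service filter, `NOT LIKE '%.%.%'` rejects every hash that has two or more separators, so only hashes of the form service.container remain, across all services.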
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Use generated name column instead of JSON extract in container summary queries
storage_container_entity has 'name' as a STORED generated column derived
from json->>'name' (see bootstrap/sql/schema/postgres.sql). Both slim
projection queries (findContainerSummaryRows and listDirectChildSummariesByParentHash)
were redundantly extracting it via JSON_UNQUOTE(JSON_EXTRACT(...)) on MySQL
and json->>'name' on Postgres — work the database had already done at insert
time.
Reading 'name' as a column directly:
- Saves one JSON op per row on every page fetch
- Lets ORDER BY name sort on the indexed generated column rather than a
per-row JSON-extracted expression
displayName, fullyQualifiedName, and description stay as JSON extracts —
they aren't generated columns. (description in particular shouldn't be:
free-text fields can be many KB and a STORED generated column would
double the row size on disk.)
Row mapper unchanged — column labels in the SELECT list still match.
* Fix inaccurate ListFilterTest comment and Javadoc link to private method
ListFilterTest: the prefix-pattern comment said the LIKE patterns 'exclude'
direct/grandchildren — patterns themselves match, the SQL's NOT LIKE is
what excludes. Rewrote to show how ContainerDAO.listRoot* combines LIKE
and NOT LIKE on the two binds.
CacheKeys.childrenPage: the @link pointed at ChildrenPageCache#includeTag
which is private static; Javadoc tooling renders that as an unresolved
link. Redirected to the public Include enum the tag is derived from.
* Log original exception in recursive batch delete catch before wrapping
Wrapping the caught RuntimeException into a new one (with entity context
in the message) preserves the original via the cause chain, but the outer
exception mapper sees the wrapper and renders a generic 500 — the original
type information doesn't surface to operators investigating a failed
delete.
Adds a LOG.error before the wrap so the original exception (with full type
and stack) lands in the logs adjacent to the entity context, giving
operators enough signal to diagnose what actually blocked the delete.
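The log-then-wrap pattern described above might look roughly like this. It is a sketch, not EntityRepository.processDeletionBatch: `BatchDeleteSketch`, `deleteAll`, and the `cleanup` callback are hypothetical, and java.util.logging is used only to keep the example self-contained.

```java
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

final class BatchDeleteSketch {
  private static final Logger LOG = Logger.getLogger(BatchDeleteSketch.class.getName());

  // Runs cleanup for each child in order. The first failure is logged with its
  // original type and stack, wrapped with entity context, and re-thrown, so the
  // loop stops and later children are left untouched.
  static int deleteAll(String entityType, List<UUID> ids, Consumer<UUID> cleanup) {
    int deleted = 0;
    for (UUID id : ids) {
      try {
        cleanup.accept(id);
        deleted++;
      } catch (RuntimeException e) {
        LOG.log(Level.SEVERE, "Cleanup failed for " + entityType + " " + id, e);
        throw new RuntimeException("Failed to delete " + entityType + " with id " + id, e);
      }
    }
    return deleted;
  }

  public static void main(String[] args) {
    List<UUID> ids = List.of(UUID.randomUUID(), UUID.randomUUID());
    System.out.println(deleteAll("container", ids, id -> {}));
  }
}
```

The wrapper keeps the original exception as its cause, while the log line records it before any outer exception mapper reduces the response to a generic error.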
* Restore failure-semantics comment block on recursive batch delete wrap
* use Entity.SEPARATOR instead of hard-coding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix check style
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
OpenMetadata Integration Tests
This module contains SDK-based integration tests that run against a real OpenMetadata server using Testcontainers. Tests execute in parallel and are isolated using TestNamespace.
Quick Start
# Run all tests with MySQL + Elasticsearch (default)
mvn test -pl :openmetadata-integration-tests
# Run with PostgreSQL + OpenSearch
mvn test -pl :openmetadata-integration-tests -Ppostgres-opensearch
# Run a specific test
mvn test -pl :openmetadata-integration-tests -Dtest="TableResourceIT"
Available Profiles
| Profile | Database | Search Engine |
|---|---|---|
| `mysql-elasticsearch` (default) | MySQL 8.3.0 | Elasticsearch 8.11.4 |
| `postgres-opensearch` | PostgreSQL 15 | OpenSearch 2.19.0 |
| `postgres-elasticsearch` | PostgreSQL 15 | Elasticsearch 8.11.4 |
| `mysql-opensearch` | MySQL 8.3.0 | OpenSearch 2.19.0 |
Writing a New Integration Test
1. Create the Test Class
Extend BaseEntityIT for entity CRUD tests or create a standalone test class:
package org.openmetadata.it.tests;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;

import java.util.List;
import org.junit.jupiter.api.Test;
import org.openmetadata.it.util.SdkClients;
import org.openmetadata.it.util.TestNamespace;
import org.openmetadata.schema.api.data.CreateTable;
import org.openmetadata.schema.entity.data.Table;
import org.openmetadata.schema.type.Column;
import org.openmetadata.schema.type.ColumnDataType;
// SharedEntities also needs an import; its package is not shown in this README.

public class MyFeatureIT extends BaseEntityIT<Table, CreateTable> {

  @Override
  protected String getEntityType() {
    return "table";
  }

  @Override
  protected Table createEntity(CreateTable request) {
    return SdkClients.adminClient().tables().create(request);
  }

  @Override
  protected CreateTable createRequest(String name, TestNamespace ns) {
    return new CreateTable()
        .withName(name)
        .withDatabaseSchema(SharedEntities.getSchema().getFullyQualifiedName())
        .withColumns(List.of(new Column().withName("id").withDataType(ColumnDataType.INT)));
  }

  @Test
  void myCustomTest(TestNamespace ns) throws Exception {
    CreateTable request = createRequest(ns.prefix("myTable"), ns);
    Table table = createEntity(request);
    assertNotNull(table.getId());
    assertEquals(request.getName(), table.getName());
  }
}
2. Key Concepts
TestNamespace
Every test method receives a TestNamespace parameter that provides unique prefixes for entity names:
@Test
void myTest(TestNamespace ns) {
  String uniqueName = ns.prefix("myEntity"); // e.g., "abc123_myEntity"
}
This ensures tests don't conflict when running in parallel.
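To make the isolation idea concrete, here is a minimal sketch of how a per-test prefix keeps names from colliding. It is not the real TestNamespace: `NamespaceSketch` is a hypothetical class, and deriving the prefix from a random UUID is an assumption made for illustration.

```java
import java.util.UUID;

final class NamespaceSketch {
  // One short random id per namespace instance (assumed scheme, 8 hex chars).
  private final String id = UUID.randomUUID().toString().replace("-", "").substring(0, 8);

  // Prepends the namespace id so the same logical name differs between tests.
  String prefix(String name) {
    return id + "_" + name;
  }

  public static void main(String[] args) {
    NamespaceSketch first = new NamespaceSketch();
    NamespaceSketch second = new NamespaceSketch();
    System.out.println(first.prefix("myEntity"));
    System.out.println(second.prefix("myEntity"));
  }
}
```

Two tests that both ask for "myEntity" end up creating differently named entities, so they can run in parallel against the same server.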
SdkClients
Get pre-configured SDK clients for different users:
OpenMetadataClient adminClient = SdkClients.adminClient();
OpenMetadataClient user1Client = SdkClients.user1Client();
OpenMetadataClient botClient = SdkClients.botClient();
SharedEntities
Access pre-created entities for tests:
DatabaseService service = SharedEntities.getService();
Database database = SharedEntities.getDatabase();
DatabaseSchema schema = SharedEntities.getSchema();
User adminUser = SharedEntities.getAdminUser();
3. BaseEntityIT Features
When extending BaseEntityIT, you get these tests automatically:
| Test | Description |
|---|---|
| `post_entityCreate_200` | Create entity successfully |
| `get_entity_200_OK` | Get entity by ID |
| `get_entityByName_200` | Get entity by FQN |
| `get_entityNotFound_404` | Get non-existent entity |
| `put_entityCreate_200` | Create via PUT |
| `patch_entityAttributes_200` | Patch entity attributes |
| `delete_entityAsAdmin_200` | Delete entity |
| `get_entityListWithPagination_200` | List with pagination |
| `test_sdkCRUDOperations` | Full CRUD via SDK |
| ... and 30+ more | |
4. Controlling Test Behavior
Use flags to customize which inherited tests run:
public class MyEntityIT extends BaseEntityIT<MyEntity, CreateMyEntity> {
  {
    supportsPatch = true; // Enable PATCH tests
    supportsTags = true; // Enable tag tests
    supportsOwner = true; // Enable owner tests
    supportsSearchIndex = true; // Enable search tests
    supportsDomains = true; // Enable domain tests
  }
}
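The bare `{ ... }` block above is a Java instance initializer. The sketch below shows why setting flags there takes effect: the block runs after the superclass field initializers, so the subclass values win. `BaseSketch`, `MyEntitySketch`, `enabledGroups`, and the `false` defaults are hypothetical and do not reflect the actual defaults in BaseEntityIT.

```java
import java.util.ArrayList;
import java.util.List;

class BaseSketch {
  // Assumed defaults for this sketch only.
  protected boolean supportsPatch = false;
  protected boolean supportsTags = false;

  // Lists which optional test groups would run for this instance.
  List<String> enabledGroups() {
    List<String> groups = new ArrayList<>();
    if (supportsPatch) {
      groups.add("patch");
    }
    if (supportsTags) {
      groups.add("tags");
    }
    return groups;
  }
}

class MyEntitySketch extends BaseSketch {
  {
    // Runs after BaseSketch's field initializers, overriding the default.
    supportsPatch = true;
  }

  public static void main(String[] args) {
    System.out.println(new MyEntitySketch().enabledGroups());
  }
}
```

Because the flags are plain fields read when tests run, no constructor or method override is needed in the subclass.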
5. Best Practices
- Use `TestNamespace.prefix()` for all entity names to ensure uniqueness
- Don't clean up entities - TestNamespace isolation handles this
- Use specific imports - No wildcard imports (`import static ....*`)
- Keep tests independent - Don't rely on order of execution
- Use Awaitility for async operations - Not `Thread.sleep()`
Awaitility.await()
    .atMost(Duration.ofSeconds(30))
    .pollInterval(Duration.ofMillis(500))
    .until(() -> someCondition());
- Avoid single-line comments - Write self-documenting code
Project Structure
openmetadata-integration-tests/
├── src/test/java/org/openmetadata/it/
│ ├── auth/ # JWT token generation
│ ├── env/ # Test infrastructure (TestSuiteBootstrap)
│ ├── factories/ # Entity factory classes
│ ├── tests/ # Integration test classes
│ └── util/ # Utilities (SdkClients, TestNamespace)
└── src/test/resources/
├── openmetadata-secure-test.yaml # Test config
└── *.der # JWT keys
Test Infrastructure
Tests use TestSuiteBootstrap (a JUnit LauncherSessionListener) that:
- Starts database container (MySQL or PostgreSQL)
- Starts search container (Elasticsearch or OpenSearch)
- Starts Fuseki SPARQL container (for RDF tests)
- Starts the OpenMetadata application
- Initializes `SharedEntities`
All containers are started once per test run and shared across all tests.
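The start-once, share-everywhere behaviour can be pictured with a small idempotent guard. This is only an illustration of the idea, not TestSuiteBootstrap: `SharedBootstrapSketch`, `ensureStarted`, and `startCount` are hypothetical, and the counter stands in for starting containers and the application.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedBootstrapSketch {
  private static final AtomicInteger STARTS = new AtomicInteger();
  private static boolean started;

  // Safe to call from any test thread; only the first call does the work.
  static synchronized void ensureStarted() {
    if (!started) {
      STARTS.incrementAndGet(); // stand-in for starting containers and the app
      started = true;
    }
  }

  static int startCount() {
    return STARTS.get();
  }

  public static void main(String[] args) {
    ensureStarted();
    ensureStarted();
    System.out.println(startCount());
  }
}
```

However many parallel tests request the environment, the expensive startup happens a single time.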
Running in CI
GitHub workflows run these tests on every PR:
- `integration-tests-mysql-elasticsearch.yml` - MySQL + Elasticsearch
- `integration-tests-postgres-opensearch.yml` - PostgreSQL + OpenSearch
Tests require the "safe to test" label on PRs.