OpenMetadata/openmetadata-integration-tests
Mohit Yadav c2e6d907dd
fix(lineage): service nodes appearing in entity lineage view and empty By Service view (#27258)
* fix(lineage): prevent pipeline annotation inheritance in service/domain/dataProduct lineage and add pipeline service edges

Bug #1: Service nodes (e.g., DatabaseService, MessagingService) were incorrectly appearing in
entity-level lineage views. Root cause: getOrCreateLineageDetails() in addServiceLineage(),
addDomainLineage(), and addDataProductsLineage() was copying the pipeline annotation from
entity-level LineageDetails to service/domain/dataProduct-level LineageDetails. This caused
service entities to have upstreamLineage.pipeline.fqnHash set in their Elasticsearch documents,
making them match the PIPELINE_AS_EDGE_KEY query during BFS traversal and incorrectly appear
alongside actual data assets. Fix: add .withPipeline(null) on each service/domain/dataProduct
LineageDetails object to strip the pipeline annotation before persisting.

Bug #2: "By Service" view was empty when viewing lineage for pipeline entities that were stored
as edge annotators (Case B: table → topic with pipeline=flink_pipeline in LineageDetails) rather
than as actual nodes (Case A). Root cause: addServiceLineage() only created database_service →
kafka_service edges but no edges involving flink_pipeline_service. Fix: add addPipelineServiceEdges()
called from addServiceLineage() that creates fromService → pipelineService and pipelineService →
toService edges when a pipeline annotation exists in the entity-level lineage details.

Also add unit tests covering both fixes to prevent regression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): add migration to remove pipeline annotation from service/domain/dataProduct lineage edges

The previous fix (e6df7a6c62) prevented new lineage from inheriting pipeline annotations on
service/domain/dataProduct-level edges. However, existing data in the entity_relationship table
already has pipeline set on those edges from before the fix, and Elasticsearch reindex reads from
the DB — so reindex alone does not fix stale data.

This migration removes the pipeline field from all service-to-service, domain-to-domain, and
dataProduct-to-dataProduct lineage edges (relation=13/UPSTREAM) in entity_relationship.

After upgrading and running this migration, operators should trigger an Elasticsearch/OpenSearch
reindex so that the corrected DB records are reflected in the search index, which is what the
lineage graph BFS traversal reads from.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): move pipeline annotation migration from 1.12.0 to 1.13.0

Moves the data migration that removes the pipeline field from
service/domain/dataProduct lineage edges in entity_relationship to the
1.13.0 migration scripts, which is the correct target version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): move pipeline annotation migration from 1.13.0 to new 1.12.6

Creates a new 1.12.6 migration with the data fix that removes the pipeline
field from service/domain/dataProduct lineage edges in entity_relationship,
and removes it from 1.13.0 where it was previously placed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): add v1126 Java migration to create pipeline service edges for existing data

For installations upgrading to 1.12.6 with existing lineage data, service edges
fromService→pipelineService and pipelineService→toService were never created
(only added by the code fix for new lineage going forward). This migration
reads service-level lineage edges that have a pipeline annotation, resolves
the pipeline entity's service, and inserts the two missing service edges into
entity_relationship (DB only). After the SQL migration strips pipeline from
service edges and a reindex runs, the "By Service" lineage view for pipeline
services correctly shows their upstream/downstream service connections.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): fix v1126 migration to read entity-level edges for pipeline service creation

The original migration read service-level edges (databaseService→messagingService)
looking for pipeline annotations, but those had already been cleaned by the SQL
migration before the Java migration could run in subsequent server restarts.

Fix: read data-asset-level edges (table→topic etc.) which retain their pipeline
annotation permanently. For each such edge, resolve fromEntity.service,
toEntity.service, and pipeline.service, then create the two missing
pipelineService edges in entity_relationship.

Verified: after running the migration manually via direct SQL + OpenSearch update,
the By Service view for lineage_test_flink_svc correctly shows 3 nodes with
upstream (db_svc→flink_svc) and downstream (flink_svc→kafka_svc) edges.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): clean up pipeline service edges when entity lineage is deleted

When entity-level lineage (table→topic) is deleted, cleanUpExtendedLineage
only cleaned up fromService→toService (db_svc→kafka_svc) but left the new
pipeline service edges (db_svc→flink_svc, flink_svc→kafka_svc) as orphans
in both entity_relationship and OpenSearch.

Fix:
- Pass lineageDetails (which contains the pipeline reference) into
  cleanUpExtendedLineage from both deleteLineage and deleteLineageByFQN
- Add cleanUpPipelineServiceEdges that mirrors addPipelineServiceEdges:
  uses getPipelineService(lineageDetails) to resolve the pipelineService,
  then calls processExtendedLineageCleanup for fromService→pipelineService
  and pipelineService→toService edges (decrement assetEdges or delete+remove
  from search if count reaches zero)
- Also fix deleteLineageByFQN which was missing cleanUpExtendedLineage call
  entirely (pre-existing gap for service edge cleanup)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(lineage): add unit tests for pipeline annotation stripping and pipeline service edge creation

- Add 4 new unit tests to LineageRepositoryTest covering:
  - Bug #1 (2 tests): service-level edges do not inherit pipeline annotation
    from entity lineage, both for new and existing edges
  - Bug #2 (2 tests): addPipelineServiceEdges creates fromService→pipelineService
    and pipelineService→toService edges when pipeline annotator is present,
    and skips them when no pipeline is set
- Fix MySQL migration: add metadataService to entity type list (was in Java
  migration's SERVICE_ENTITY_TYPES but missing from SQL) and replace
  JSON_EXTRACT IS NOT NULL with JSON_CONTAINS_PATH to correctly handle both
  present and explicit-null pipeline fields
- Fix PostgreSQL migration: add metadataService to entity type list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(lineage): add integration tests for pipeline-as-annotator lineage scenario

Tests Bug #1 (service nodes absent from entity-level lineage) and Bug #2
(pipeline service connected in service-level lineage) using a table → topic
edge annotated with a pipeline entity reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(e2e): add Playwright tests for pipeline-as-annotator lineage scenario

Tests Bug #1 (service nodes absent from entity-level lineage) and Bug #2
(pipeline service appears in service-level lineage) using API interception
and direct request assertions via page.request.get().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: apply spotless formatting to LineageRepositoryTest

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: apply prettier formatting to LineagePipelineAnnotator spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): guard against null LineageDetails in getPipelineService

When the json column in entity_relationship is NULL, JsonUtils.readValue
returns null. getPipelineService now short-circuits on a null argument
instead of throwing NullPointerException via entityLineageDetails.getPipeline().

Fixes NPE in deleteLineageByFQN and deleteLineage cleanup paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(e2e): use authenticated apiContext for service lineage assertions

page.request.get() sends browser cookies but OpenMetadata authenticates
via JWT in localStorage, so those calls were unauthenticated (non-2xx).
Replace with getToken + getAuthContext pattern used elsewhere.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(migration): add driveService to 1.12.6 pipeline annotation cleanup

Directory, File, Spreadsheet, and Worksheet entities map to driveService,
so service-level lineage edges between driveService instances could also
have incorrectly inherited the pipeline annotation. Include driveService
in the 1.12.6 cleanup migration for both MySQL and PostgreSQL.

Also drops the stray trailing-newline changes from the 1.12.0 migration
files — those edits were unnecessary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* new line remove

* fix(migration): add DRIVE_SERVICE to v1126 SERVICE_ENTITY_TYPES set

driveService-to-driveService edges must be skipped during the pipeline
service edge migration scan, same as all other service-level edges.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(migration): resolve merge conflict in v1126 MigrationUtil

The rebase left MigrationUtil with duplicate imports and a missing closing
brace on insertEdgeIfMissing. Merged both method sets cleanly and ran
spotless.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 00:55:16 -07:00
src/test fix(lineage): service nodes appearing in entity lineage view and empty By Service view (#27258) 2026-04-17 00:55:16 -07:00
K8S_TESTS.md Code cleanup based on IDE flagged warnings (#26808) 2026-03-27 06:17:01 -07:00
pom.xml Add Json Logging (#26357) 2026-03-31 16:15:07 -07:00
README.md Faster tests (#24948) 2025-12-26 23:47:49 -08:00
TEST_MIGRATION_TRACKER.md Faster tests (#24948) 2025-12-26 23:47:49 -08:00

OpenMetadata Integration Tests

This module contains SDK-based integration tests that run against a real OpenMetadata server using Testcontainers. Tests execute in parallel and are isolated using TestNamespace.

Quick Start

# Run all tests with MySQL + Elasticsearch (default)
mvn test -pl :openmetadata-integration-tests

# Run with PostgreSQL + OpenSearch
mvn test -pl :openmetadata-integration-tests -Ppostgres-opensearch

# Run a specific test
mvn test -pl :openmetadata-integration-tests -Dtest="TableResourceIT"

Available Profiles

| Profile | Database | Search Engine |
|---|---|---|
| mysql-elasticsearch (default) | MySQL 8.3.0 | Elasticsearch 8.11.4 |
| postgres-opensearch | PostgreSQL 15 | OpenSearch 2.19.0 |
| postgres-elasticsearch | PostgreSQL 15 | Elasticsearch 8.11.4 |
| mysql-opensearch | MySQL 8.3.0 | OpenSearch 2.19.0 |

Writing a New Integration Test

1. Create the Test Class

Extend BaseEntityIT for entity CRUD tests or create a standalone test class:

package org.openmetadata.it.tests;

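import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;

import java.util.List;
import org.junit.jupiter.api.Test;
import org.openmetadata.it.util.SdkClients;
import org.openmetadata.it.util.TestNamespace;
import org.openmetadata.schema.api.data.CreateTable;
import org.openmetadata.schema.entity.data.Table;
import org.openmetadata.schema.type.Column;
import org.openmetadata.schema.type.ColumnDataType;
// SharedEntities also needs an import; its package is not shown in this README.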

public class MyFeatureIT extends BaseEntityIT<Table, CreateTable> {

  @Override
  protected String getEntityType() {
    return "table";
  }

  @Override
  protected Table createEntity(CreateTable request) {
    return SdkClients.adminClient().tables().create(request);
  }

  @Override
  protected CreateTable createRequest(String name, TestNamespace ns) {
    return new CreateTable()
        .withName(name)
        .withDatabaseSchema(SharedEntities.getSchema().getFullyQualifiedName())
        .withColumns(List.of(new Column().withName("id").withDataType(ColumnDataType.INT)));
  }

  @Test
  void myCustomTest(TestNamespace ns) throws Exception {
    CreateTable request = createRequest(ns.prefix("myTable"), ns);
    Table table = createEntity(request);

    assertNotNull(table.getId());
    assertEquals(request.getName(), table.getName());
  }
}

2. Key Concepts

TestNamespace

Every test method receives a TestNamespace parameter that provides unique prefixes for entity names:

@Test
void myTest(TestNamespace ns) {
  String uniqueName = ns.prefix("myEntity");  // e.g., "abc123_myEntity"
}

This ensures tests don't conflict when running in parallel.
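
To illustrate why prefixing avoids collisions, here is a minimal sketch. The `Namespace` class below is a hypothetical stand-in, not the real TestNamespace: it assumes each test method gets one random id and prepends it to every name. The actual prefix format used by the module may differ.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class NamespaceSketch {

  /** Hypothetical stand-in for TestNamespace: one random id per test method. */
  public static final class Namespace {
    private final String id = UUID.randomUUID().toString().substring(0, 8);

    /** Prepends this namespace's id, so the same base name differs across tests. */
    public String prefix(String base) {
      return id + "_" + base;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Set<String> created = ConcurrentHashMap.newKeySet();

    // Two "tests" running in parallel both ask for an entity called "myEntity".
    Runnable test = () -> created.add(new Namespace().prefix("myEntity"));
    Thread a = new Thread(test);
    Thread b = new Thread(test);
    a.start();
    b.start();
    a.join();
    b.join();

    System.out.println(created.size() + " distinct names: " + created);
  }
}
```

Within one namespace the prefix is stable, so a test can create an entity and look it up again by the same prefixed name.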

SdkClients

Get pre-configured SDK clients for different users:

OpenMetadataClient adminClient = SdkClients.adminClient();
OpenMetadataClient user1Client = SdkClients.user1Client();
OpenMetadataClient botClient = SdkClients.botClient();

SharedEntities

Access pre-created entities for tests:

DatabaseService service = SharedEntities.getService();
Database database = SharedEntities.getDatabase();
DatabaseSchema schema = SharedEntities.getSchema();
User adminUser = SharedEntities.getAdminUser();

3. BaseEntityIT Features

When extending BaseEntityIT, you get these tests automatically:

| Test | Description |
|---|---|
| post_entityCreate_200 | Create entity successfully |
| get_entity_200_OK | Get entity by ID |
| get_entityByName_200 | Get entity by FQN |
| get_entityNotFound_404 | Get non-existent entity |
| put_entityCreate_200 | Create via PUT |
| patch_entityAttributes_200 | Patch entity attributes |
| delete_entityAsAdmin_200 | Delete entity |
| get_entityListWithPagination_200 | List with pagination |
| test_sdkCRUDOperations | Full CRUD via SDK |

... and 30+ more

4. Controlling Test Behavior

Use flags to customize which inherited tests run:

public class MyEntityIT extends BaseEntityIT<MyEntity, CreateMyEntity> {
  {
    supportsPatch = true;          // Enable PATCH tests
    supportsTags = true;           // Enable tag tests
    supportsOwner = true;          // Enable owner tests
    supportsSearchIndex = true;    // Enable search tests
    supportsDomains = true;        // Enable domain tests
  }
}
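
The instance-initializer pattern above works because the subclass block runs after the base class has set its defaults. The sketch below shows that mechanism in isolation. `BaseSketch`, `runInherited`, and the `tag_tests_placeholder` name are hypothetical; the real BaseEntityIT may gate its inherited tests differently (for example with JUnit assumptions).

```java
import java.util.ArrayList;
import java.util.List;

public class FlagGatingSketch {

  /** Hypothetical base class whose inherited checks are switched by flags. */
  public abstract static class BaseSketch {
    protected boolean supportsPatch = true;
    protected boolean supportsTags = true;

    /** Returns the names of the inherited checks that would run for this subclass. */
    public List<String> runInherited() {
      List<String> executed = new ArrayList<>();
      executed.add("post_entityCreate_200");
      if (supportsPatch) {
        executed.add("patch_entityAttributes_200");
      }
      if (supportsTags) {
        executed.add("tag_tests_placeholder");
      }
      return executed;
    }
  }

  /** The initializer block runs after the base defaults are set, so it overrides them. */
  public static final class NoTagsEntitySketch extends BaseSketch {
    {
      supportsTags = false;
    }
  }

  public static void main(String[] args) {
    // prints [post_entityCreate_200, patch_entityAttributes_200]
    System.out.println(new NoTagsEntitySketch().runInherited());
  }
}
```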

5. Best Practices

  1. Use TestNamespace.prefix() for all entity names to ensure uniqueness
  2. Don't clean up entities - TestNamespace isolation handles this
  3. Use specific imports - No wildcard imports (import static ....*)
  4. Keep tests independent - Don't rely on order of execution
  5. Use Awaitility for async operations - Not Thread.sleep()
     Awaitility.await()
         .atMost(Duration.ofSeconds(30))
         .pollInterval(Duration.ofMillis(500))
         .until(() -> someCondition());
  6. Avoid single-line comments - Write self-documenting code
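
As a rough illustration of why polling beats a fixed sleep, the sketch below reimplements the poll-until-timeout idea with the standard library only. It is not Awaitility's implementation, and `pollUntil` is a hypothetical helper; in real tests, use Awaitility as recommended above. The point is that the call returns as soon as the condition holds instead of always waiting the full duration.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class PollingSketch {

  /** Returns true as soon as the condition holds, false once atMost has elapsed. */
  public static boolean pollUntil(
      Duration atMost, Duration pollInterval, BooleanSupplier condition) {
    long deadline = System.nanoTime() + atMost.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    AtomicInteger checks = new AtomicInteger();

    // The condition becomes true on the third check, well before the 2 s limit.
    boolean ok =
        pollUntil(
            Duration.ofSeconds(2), Duration.ofMillis(10), () -> checks.incrementAndGet() >= 3);

    System.out.println(ok + " after " + checks.get() + " checks");
  }
}
```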

Project Structure

openmetadata-integration-tests/
├── src/test/java/org/openmetadata/it/
│   ├── auth/           # JWT token generation
│   ├── env/            # Test infrastructure (TestSuiteBootstrap)
│   ├── factories/      # Entity factory classes
│   ├── tests/          # Integration test classes
│   └── util/           # Utilities (SdkClients, TestNamespace)
└── src/test/resources/
    ├── openmetadata-secure-test.yaml  # Test config
    └── *.der                          # JWT keys

Test Infrastructure

Tests use TestSuiteBootstrap (a JUnit LauncherSessionListener) that:

  1. Starts database container (MySQL or PostgreSQL)
  2. Starts search container (Elasticsearch or OpenSearch)
  3. Starts Fuseki SPARQL container (for RDF tests)
  4. Starts the OpenMetadata application
  5. Initializes SharedEntities

All containers are started once per test run and shared across all tests.
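
The start-once, share-everywhere behaviour can be sketched as follows. This is a simplified, hypothetical model: it only records component names in the order listed above, whereas the real TestSuiteBootstrap starts Testcontainers and the application. The names `ensureStarted` and `BootstrapOnceSketch` are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BootstrapOnceSketch {
  private static final List<String> STARTED = new ArrayList<>();
  private static boolean initialized = false;

  /** Starts the shared environment on the first call; later calls reuse it. */
  public static synchronized List<String> ensureStarted() {
    if (!initialized) {
      // Order mirrors the startup steps listed above.
      STARTED.add("database");
      STARTED.add("search");
      STARTED.add("fuseki");
      STARTED.add("openmetadata");
      STARTED.add("sharedEntities");
      initialized = true;
    }
    return Collections.unmodifiableList(STARTED);
  }

  public static void main(String[] args) {
    ensureStarted();
    // A second caller does not restart anything.
    // prints [database, search, fuseki, openmetadata, sharedEntities]
    System.out.println(ensureStarted());
  }
}
```

Because every test shares the same environment, isolation has to come from entity naming (TestNamespace) rather than from fresh containers per test.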

Running in CI

GitHub workflows run these tests on every PR:

  • integration-tests-mysql-elasticsearch.yml - MySQL + Elasticsearch
  • integration-tests-postgres-opensearch.yml - PostgreSQL + OpenSearch

Tests require the "safe to test" label on PRs.