Commit graph

524 commits

Author SHA1 Message Date
mohitdeuex
87e0c31621 fix(lineage): service nodes appearing in entity lineage view and empty By Service view (#27258)
* fix(lineage): prevent pipeline annotation inheritance in service/domain/dataProduct lineage and add pipeline service edges

Bug #1: Service nodes (e.g., DatabaseService, MessagingService) were incorrectly appearing in
entity-level lineage views. Root cause: getOrCreateLineageDetails() in addServiceLineage(),
addDomainLineage(), and addDataProductsLineage() was copying the pipeline annotation from
entity-level LineageDetails to service/domain/dataProduct-level LineageDetails. This caused
service entities to have upstreamLineage.pipeline.fqnHash set in their Elasticsearch documents,
making them match the PIPELINE_AS_EDGE_KEY query during BFS traversal and incorrectly appear
alongside actual data assets. Fix: add .withPipeline(null) on each service/domain/dataProduct
LineageDetails object to strip the pipeline annotation before persisting.

Bug #2: "By Service" view was empty when viewing lineage for pipeline entities that were stored
as edge annotators (Case B: table → topic with pipeline=flink_pipeline in LineageDetails) rather
than as actual nodes (Case A). Root cause: addServiceLineage() only created database_service →
kafka_service edges but no edges involving flink_pipeline_service. Fix: add addPipelineServiceEdges()
called from addServiceLineage() that creates fromService → pipelineService and pipelineService →
toService edges when a pipeline annotation exists in the entity-level lineage details.
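
A minimal sketch of the edge fan-out described for Bug #2, assuming service-level edges can be represented as plain "from->to" strings. The class and method names here are hypothetical and are not the actual addPipelineServiceEdges() implementation; a null pipelineService stands in for "no pipeline annotation on the entity-level lineage details".

```java
import java.util.ArrayList;
import java.util.List;

final class PipelineServiceEdgesSketch {
  /**
   * Returns the extra service-level edges for one entity-level edge, encoded as "from->to".
   * pipelineService is null when the entity-level lineage details carry no pipeline annotation.
   */
  static List<String> pipelineServiceEdges(
      String fromService, String toService, String pipelineService) {
    List<String> edges = new ArrayList<>();
    if (pipelineService == null) {
      return edges; // no pipeline annotation: nothing to add
    }
    edges.add(fromService + "->" + pipelineService);
    edges.add(pipelineService + "->" + toService);
    return edges;
  }

  public static void main(String[] args) {
    // prints [db_svc->flink_svc, flink_svc->kafka_svc]
    System.out.println(pipelineServiceEdges("db_svc", "kafka_svc", "flink_svc"));
    // prints []
    System.out.println(pipelineServiceEdges("db_svc", "kafka_svc", null));
  }
}
```

The existing fromService → toService edge is untouched in this sketch; only the two pipeline service edges are added on top of it.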

Also add unit tests covering both fixes to prevent regression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): add migration to remove pipeline annotation from service/domain/dataProduct lineage edges

The previous fix (e6df7a6c62) prevented new lineage from inheriting pipeline annotations on
service/domain/dataProduct-level edges. However, existing data in the entity_relationship table
already has pipeline set on those edges from before the fix, and Elasticsearch reindex reads from
the DB — so reindex alone does not fix stale data.

This migration removes the pipeline field from all service-to-service, domain-to-domain, and
dataProduct-to-dataProduct lineage edges (relation=13/UPSTREAM) in entity_relationship.

After upgrading and running this migration, operators should trigger an Elasticsearch/OpenSearch
reindex so that the corrected DB records are reflected in the search index, which is what the
lineage graph BFS traversal reads from.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): move pipeline annotation migration from 1.12.0 to 1.13.0

Moves the data migration that removes the pipeline field from
service/domain/dataProduct lineage edges in entity_relationship to the
1.13.0 migration scripts, which is the correct target version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): move pipeline annotation migration from 1.13.0 to new 1.12.6

Creates a new 1.12.6 migration with the data fix that removes the pipeline
field from service/domain/dataProduct lineage edges in entity_relationship,
and removes it from 1.13.0 where it was previously placed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): add v1126 Java migration to create pipeline service edges for existing data

For installations upgrading to 1.12.6 with existing lineage data, service edges
fromService→pipelineService and pipelineService→toService were never created
(only added by the code fix for new lineage going forward). This migration
reads service-level lineage edges that have a pipeline annotation, resolves
the pipeline entity's service, and inserts the two missing service edges into
entity_relationship (DB only). After the SQL migration strips pipeline from
service edges and a reindex runs, the "By Service" lineage view for pipeline
services correctly shows their upstream/downstream service connections.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): fix v1126 migration to read entity-level edges for pipeline service creation

The original migration read service-level edges (databaseService→messagingService)
looking for pipeline annotations, but those had already been cleaned by the SQL
migration before the Java migration could run in subsequent server restarts.

Fix: read data-asset-level edges (table→topic etc.) which retain their pipeline
annotation permanently. For each such edge, resolve fromEntity.service,
toEntity.service, and pipeline.service, then create the two missing
pipelineService edges in entity_relationship.

Verified: after running the migration manually via direct SQL + OpenSearch update,
the By Service view for lineage_test_flink_svc correctly shows 3 nodes with
upstream (db_svc→flink_svc) and downstream (flink_svc→kafka_svc) edges.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): clean up pipeline service edges when entity lineage is deleted

When entity-level lineage (table→topic) is deleted, cleanUpExtendedLineage
only cleaned up fromService→toService (db_svc→kafka_svc) but left the new
pipeline service edges (db_svc→flink_svc, flink_svc→kafka_svc) as orphans
in both entity_relationship and OpenSearch.

Fix:
- Pass lineageDetails (which contains the pipeline reference) into
  cleanUpExtendedLineage from both deleteLineage and deleteLineageByFQN
- Add cleanUpPipelineServiceEdges that mirrors addPipelineServiceEdges:
  uses getPipelineService(lineageDetails) to resolve the pipelineService,
  then calls processExtendedLineageCleanup for fromService→pipelineService
  and pipelineService→toService edges (decrement assetEdges or delete+remove
  from search if count reaches zero)
- Also fix deleteLineageByFQN which was missing cleanUpExtendedLineage call
  entirely (pre-existing gap for service edge cleanup)
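
The "decrement assetEdges or delete when the count reaches zero" behaviour can be sketched as a small reference counter. This is an in-memory illustration only: the map stands in for service-level rows in entity_relationship, and the class and method names are hypothetical rather than the real processExtendedLineageCleanup.

```java
import java.util.HashMap;
import java.util.Map;

final class ServiceEdgeRefCountSketch {
  // Hypothetical stand-in for service-level edges: key "from->to", value = assetEdges count.
  private final Map<String, Integer> assetEdges = new HashMap<>();

  /** Records one more entity-level edge backing the given service-level edge. */
  void addAssetEdge(String fromService, String toService) {
    assetEdges.merge(fromService + "->" + toService, 1, Integer::sum);
  }

  /** Removes one backing entity-level edge; returns true if the service edge was deleted. */
  boolean removeAssetEdge(String fromService, String toService) {
    String key = fromService + "->" + toService;
    Integer count = assetEdges.get(key);
    if (count == null) {
      return false; // nothing to clean up
    }
    if (count <= 1) {
      assetEdges.remove(key); // last backing edge gone: drop the service edge entirely
      return true;
    }
    assetEdges.put(key, count - 1); // other asset edges still back this service edge
    return false;
  }

  boolean contains(String fromService, String toService) {
    return assetEdges.containsKey(fromService + "->" + toService);
  }

  public static void main(String[] args) {
    ServiceEdgeRefCountSketch edges = new ServiceEdgeRefCountSketch();
    edges.addAssetEdge("db_svc", "flink_svc");
    edges.addAssetEdge("db_svc", "flink_svc");
    System.out.println(edges.removeAssetEdge("db_svc", "flink_svc")); // prints false
    System.out.println(edges.removeAssetEdge("db_svc", "flink_svc")); // prints true
  }
}
```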

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(lineage): add unit tests for pipeline annotation stripping and pipeline service edge creation

- Add 4 new unit tests to LineageRepositoryTest covering:
  - Bug #1 (2 tests): service-level edges do not inherit pipeline annotation
    from entity lineage, both for new and existing edges
  - Bug #2 (2 tests): addPipelineServiceEdges creates fromService→pipelineService
    and pipelineService→toService edges when pipeline annotator is present,
    and skips them when no pipeline is set
- Fix MySQL migration: add metadataService to entity type list (was in Java
  migration's SERVICE_ENTITY_TYPES but missing from SQL) and replace
  JSON_EXTRACT IS NOT NULL with JSON_CONTAINS_PATH to correctly handle both
  present and explicit-null pipeline fields
- Fix PostgreSQL migration: add metadataService to entity type list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(lineage): add integration tests for pipeline-as-annotator lineage scenario

Tests Bug #1 (service nodes absent from entity-level lineage) and Bug #2
(pipeline service connected in service-level lineage) using a table → topic
edge annotated with a pipeline entity reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(e2e): add Playwright tests for pipeline-as-annotator lineage scenario

Tests Bug #1 (service nodes absent from entity-level lineage) and Bug #2
(pipeline service appears in service-level lineage) using API interception
and direct request assertions via page.request.get().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: apply spotless formatting to LineageRepositoryTest

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: apply prettier formatting to LineagePipelineAnnotator spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lineage): guard against null LineageDetails in getPipelineService

When the json column in entity_relationship is NULL, JsonUtils.readValue
returns null. getPipelineService now short-circuits on a null argument
instead of throwing NullPointerException via entityLineageDetails.getPipeline().

Fixes NPE in deleteLineageByFQN and deleteLineage cleanup paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(e2e): use authenticated apiContext for service lineage assertions

page.request.get() sends browser cookies but OpenMetadata authenticates
via JWT in localStorage, so those calls were unauthenticated (non-2xx).
Replace with getToken + getAuthContext pattern used elsewhere.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(migration): add driveService to 1.12.6 pipeline annotation cleanup

Directory, File, Spreadsheet, and Worksheet entities map to driveService,
so service-level lineage edges between driveService instances could also
have incorrectly inherited the pipeline annotation. Include driveService
in the 1.12.6 cleanup migration for both MySQL and PostgreSQL.

Also drops the stray trailing-newline changes from the 1.12.0 migration
files — those edits were unnecessary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove stray newline

* fix(migration): add DRIVE_SERVICE to v1126 SERVICE_ENTITY_TYPES set

driveService-to-driveService edges must be skipped during the pipeline
service edge migration scan, same as all other service-level edges.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(migration): resolve merge conflict in v1126 MigrationUtil

The rebase left MigrationUtil with duplicate imports and a missing closing
brace on insertEdgeIfMissing. Merged both method sets cleanly and ran
spotless.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

(cherry picked from commit c2e6d907dd)
2026-04-17 13:43:05 +05:30
Ram Narayan Balaji
60eb1642e6 fix(migration): revert webhook authType back to secretKey in v1126 and remove broken v1125 migration (#27427)
* fix(migration): add v1126 reverse migration to revert webhook authType back to secretKey

* fix(migration): remove migrateWebhookSecretKeyToAuthType from v1125 migration

* fix(test): remove migrateWebhookSecretKeyToAuthType references from v1125 migration tests

* fix(migration): address copilot review comments on v1126 migration

* fix(migration): case-insensitive bearer check and verify JSON content in v1126 tests

* fix(migration): remove unused constants from v1125 and add postgres path + SQL verification to v1126 tests

(cherry picked from commit 35ede8fe5f)
2026-04-16 14:04:26 +00:00
mohitdeuex
61ec751341 Migration 2026-04-09 15:46:22 +05:30
Ram Narayan Balaji
c59235fd6e Refactor(certification): store asset certification in tag_usage table (#26448)
(cherry picked from commit b9d8c08b5b)

Conflict resolution: 1.12.5 uses nullifyEntityFields instead of
FIELDS_STORED_AS_RELATIONSHIPS/serializeForStorage. Added setCertification(null)
to nullifyEntityFields and save/restore certification in store(), storeMany(),
and updateMany() to achieve equivalent behavior.
2026-04-01 10:24:47 +05:30
Ram Narayan Balaji
e62ebc95cf Move Migration to 1.12.4 from 1.12.3 (#26629)
Cherry picked from commit ee4f931
2026-03-24 17:43:59 +05:30
Ram Narayan Balaji
5933d52a15 Feat# Include Fields Filter in EventBased Workflows and CheckChangeDescription Node (#26230)
* Include Fields in EventBased Workflows - Initial Commit

* Update generated TypeScript types

* Fix Include fields to be a map of arrays, Introduce checkChangeDescriptionTask as a separate node

* Update generated TypeScript types

* Extract common code into field value extractor

* chore: apply changes

Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>

* java checkstyle

* Fix Compilation errors

* Fix NPE bug

* Test fixes and improvements

* chore: apply changes

Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>

* Schema Changes for include fields and check change description

* Update generated TypeScript types

* Fixed 4 valid code review issues: migration idempotency bug (preventing false failures on re-runs), empty pattern string vulnerability (preventing unintended filter bypasses),
  removed unused dead code method, and corrected Javadoc inconsistency from {} to [] notation.

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>
Co-authored-by: Anujkumar Yadav <anujf0510@gmail.com>
(cherry picked from commit bb6a99b953)
2026-03-24 17:07:33 +05:30
Vishnu Jain
6387b742fe Move MCP OAuth and impersonation migrations to 1.12.4 2026-03-23 16:06:25 +05:30
Vishnu Jain
da51cda97b Mcp oauth (#25391)
* Add OAuth MCP

* Implement internal OAuth flow for MCP with database persistence

   This commit implements a redirect-free OAuth flow for the OpenMetadata MCP
   server that uses stored connector OAuth credentials internally, eliminating
   the need for external browser redirects.

   Key Features:
   - Internal OAuth authorization using stored connector credentials
   - Database persistence of OAuth tokens (survives container restarts)
   - Automatic token refresh when expired
   - PKCE support for authorization code flow
   - OAuth discovery metadata endpoint (RFC 8414)

   How It Works:
   1. Admin performs one-time OAuth setup via /api/v1/mcp/oauth/setup
   2. OAuth credentials (access token, refresh token) stored encrypted in database
   3. MCP clients connect without browser - server uses stored credentials internally
   4. Expired tokens automatically refreshed and re-persisted to database

   Tested With:
   - Snowflake OAuth (session:role:PUBLIC scope)
   - Container restart verification (credentials persist)
   - Automatic token refresh verification

* feat: Add MCP OAuth database persistence with repositories and DAOs

- Implement OAuthClientRepository, OAuthTokenRepository, OAuthAuthorizationCodeRepository
- Add DAO methods in CollectionDAO for OAuth entities
- Create database migration scripts for OAuth tables (oauth_client, oauth_access_token, oauth_refresh_token, oauth_authorization_code)
- Add Fernet encryption for tokens and client secrets
- Implement SHA-256 hashing for token lookups
- Add OAuth connector plugin system (Snowflake, Databricks)
- Add scope authorization and validation
- Update ConnectorOAuthProvider to use database persistence
- Add comprehensive tests for OAuth provider

* Add MySQL migration for MCP OAuth tables (v1.12.1)

- Create oauth_client, oauth_authorization_code, oauth_access_token, oauth_refresh_token tables
- Convert Postgres schema to MySQL syntax
- Add indexes for performance optimization
- Tables manually applied in this session, migration framework integration needed

* feat: Complete MCP OAuth implementation with critical fixes and MCP Inspector support

1. **Scope Validation Fix**
   - Set validScopes to null in McpServer to skip validation for connector-based OAuth
   - Modified RegistrationHandler to skip validation if validScopes is empty
   - Fixes: Client registration error "Invalid scope: api://apiId/.default"

2. **Metadata Endpoint URLs**
   - Fixed all OAuth discovery endpoints to include /mcp prefix
   - Updated OAuthHttpStatelessServerTransportProvider endpoint construction
   - Ensures proper OAuth metadata discovery

3. **Token Exchange Security**
   - Added client_id validation during token exchange
   - Added redirect_uri validation to prevent security vulnerabilities
   - Load authorization code from database for validation
   - Prevents authorization code interception attacks

4. **Time Unit Consistency**
   - Fixed deleteExpired methods to use seconds instead of milliseconds
   - Updated OAuthTokenRepository and OAuthAuthorizationCodeRepository
   - Enables proper cleanup of expired tokens and codes

5. **Authorization Code Loading**
   - Fixed loadAuthorizationCode to load all fields from database
   - Populates AuthorizationCode object with clientId, redirectUri, codeChallenge
   - Resolves: NullPointerException during token validation

6. **Connector Name Parameter Support**
   - Added connectorName field to AuthorizationParams
   - Extract connector_name from HTTP request in AuthorizationHandler
   - Priority: connector_name parameter > state (if not random hash) > default

7. **Default Connector Fallback**
   - Detect random hash in state parameter (64 hex chars for CSRF)
   - Default to test-snowflake-mcp connector for MCP Inspector testing
   - Enables MCP Inspector to work without manual URL modification

8. **MySQL Migration**
   - Added MySQL schema changes for OAuth tables
   - Matches PostgreSQL schema structure
   - Tables: oauth_clients, oauth_authorization_codes, oauth_access_tokens, oauth_refresh_tokens

9. **Documentation Cleanup**
   - Removed 12+ redundant and outdated documentation files
   - Created single comprehensive MCP_OAUTH_IMPLEMENTATION.md
   - Added .shell-fix-note for shell script compatibility guidance

10. **Test Script Organization**
    - Organized test scripts into scripts/mcp-oauth-tests/
    - Added test-default-connector.sh for testing with MCP Inspector
    - Preserved all OAuth flow testing scripts

- McpServer.java - Disabled scope validation for connector OAuth
- RegistrationHandler.java - Skip empty validScopes
- AuthorizationHandler.java - Extract connector_name parameter
- AuthorizationParams.java - Added connectorName field
- ConnectorOAuthProvider.java - Default connector logic, loadAuthorizationCode fix
- OAuthHttpStatelessServerTransportProvider.java - Fixed endpoints, added validations
- OAuthTokenRepository.java - Fixed time unit to seconds
- OAuthAuthorizationCodeRepository.java - Fixed time unit to seconds

- CollectionDAO.java - OAuth DAO registration
- DatabaseServiceRepository.java - Database service queries
- OAuthRecords.java - Database record types

- Deleted: 15+ outdated documentation files
- Deleted: Unused auth provider (OpenMetadataAuthProvider.java)
- Deleted: Unused OAuth callback servlet
- Added: Single comprehensive documentation file

- OAuth flow working end-to-end
- Client registration, authorization, token exchange successful
- Database persistence for all OAuth entities
- MCP Inspector compatibility with default connector
- Snowflake OAuth credentials configured for testing

⚠️ MCP Inspector SSE connection error (under investigation)
   - OAuth authentication completes successfully
   - Issue is with MCP protocol SSE connection, not OAuth

Run MCP Inspector:
```bash
npx @modelcontextprotocol/inspector http://localhost:8585/mcp
```

Test with default connector:
```bash
./test-default-connector.sh
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Add CORS preflight support and security fixes for MCP OAuth

Allow OPTIONS requests without authentication in McpAuthFilter to support
CORS preflight checks from web-based MCP clients.

This enables proper CORS flow:
1. Browser sends OPTIONS preflight
2. Server responds with CORS headers (200 OK)
3. Browser sends actual POST request with Authorization header
4. Server authenticates and processes request

Without this fix, OPTIONS requests were blocked with 401, preventing
web clients from connecting to MCP endpoints.

1. **Sensitive Token Logging** (95% severity)
   - Sanitize OAuth request parameters before logging
   - Remove client_secret, code, code_verifier, refresh_token, access_token from logs
   - Prevents credential leakage in log files

2. **Token Expiry Integer Overflow** (100% severity)
   - Changed all expiry timestamps from int/Integer to long/Long
   - Fixes 2038 problem (32-bit timestamp overflow)
   - Updated: AccessToken, RefreshToken, AuthorizationCode, ConnectorOAuthProvider, OAuthTokenRepository

3. **Hardcoded Default Connector** (80% severity)
   - Made default connector configurable via MCP_DEFAULT_CONNECTOR env var
   - Defaults to null in production (requires explicit connector_name)
   - Prevents unauthorized access to test credentials in production

4. **Missing Null Checks** (85% severity)
   - Added validation for token refresh response fields
   - Validates access_token and expires_in exist before use
   - Added bounds checking for expires_in (max 1 year)

5. **Missing Input Validation** (75% severity)
   - Added connector name format validation
   - Only allows: a-z, A-Z, 0-9, _, - characters
   - Prevents path traversal and injection attacks
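
A minimal sketch of the connector name format check from item 5, assuming a simple allowlist regex over the characters listed there. The class name is hypothetical, and treating null or empty names as invalid is an assumption of this sketch.

```java
import java.util.regex.Pattern;

final class ConnectorNameValidatorSketch {
  // Only a-z, A-Z, 0-9, underscore and hyphen; at least one character.
  private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_-]+");

  static boolean isValid(String connectorName) {
    return connectorName != null && ALLOWED.matcher(connectorName).matches();
  }

  public static void main(String[] args) {
    System.out.println(isValid("test-snowflake-mcp")); // prints true
    System.out.println(isValid("../etc/passwd")); // prints false
  }
}
```

Because matches() must consume the whole input, any dot, slash, quote, or whitespace character makes the name invalid.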

- Moved MCP docs to organized structure: openmetadata-mcp/docs/
- Created openmetadata-mcp/README.md with foundation documentation
- Moved implementation guide and testing guide to docs/ directory

- Removed development test scripts (scripts/mcp-oauth-tests/)
- Removed .shell-fix-note and test-default-connector.sh
- Kept only clean final test script: test-mcp-with-token.sh

Changes:
- openmetadata-mcp/src/main/java/org/openmetadata/mcp/McpAuthFilter.java: OPTIONS CORS support
- openmetadata-mcp/src/main/java/org/openmetadata/mcp/server/transport/OAuthHttpStatelessServerTransportProvider.java: Sanitized logging
- openmetadata-mcp/src/main/java/org/openmetadata/mcp/server/auth/provider/ConnectorOAuthProvider.java: Multiple security fixes
- openmetadata-mcp/src/main/java/org/openmetadata/mcp/McpServer.java: Configurable default connector
- openmetadata-mcp/src/main/java/org/openmetadata/mcp/auth/*.java: Long timestamps
- openmetadata-mcp/src/main/java/org/openmetadata/mcp/server/auth/repository/OAuthTokenRepository.java: Long timestamps

Testing:
- OAuth flow: Working with any OAuth-enabled connector
- MCP protocol: Working via HTTP POST with JWT
- Default connector: Configurable via MCP_DEFAULT_CONNECTOR env var
- General solution: Works with ANY connector with OAuth credentials

Test command:
export MCP_DEFAULT_CONNECTOR=test-snowflake-mcp  # For testing only
./test-mcp-with-token.sh

* feat: MCP OAuth security hardening and production readiness

Implemented security improvements and production configuration for MCP OAuth:

- Added constant-time secret comparison to prevent timing attacks
- Implemented token logging sanitization to protect sensitive credentials
- Fixed timestamp overflow (Integer → Long) to prevent 2038 issues
- Added input validation for connector names
- Implemented HttpClient resource cleanup (AutoCloseable)
- Added token refresh response validation with null checks
- Replaced hardcoded base URL with dynamic SystemRepository configuration
- Fixed MCP Inspector compatibility (removed unimplemented logging capability)
- Added example credential files and test setup documentation
- Removed commented code and unused files for cleaner codebase
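
To illustrate the Integer → Long timestamp fix listed above, here is a small sketch of how an expiry computed in 32-bit epoch seconds wraps around in January 2038, while the same arithmetic in 64 bits does not. The class and method names are hypothetical.

```java
import java.time.Instant;

final class EpochOverflowSketch {
  /** 32-bit arithmetic: silently wraps once the sum exceeds Integer.MAX_VALUE. */
  static int expiryAsInt(int nowSeconds, int ttlSeconds) {
    return nowSeconds + ttlSeconds;
  }

  /** 64-bit arithmetic: no wrap-around for any realistic epoch-seconds value. */
  static long expiryAsLong(long nowSeconds, long ttlSeconds) {
    return nowSeconds + ttlSeconds;
  }

  public static void main(String[] args) {
    int lastInt = Integer.MAX_VALUE;
    System.out.println(Instant.ofEpochSecond(lastInt)); // prints 2038-01-19T03:14:07Z
    System.out.println(expiryAsInt(lastInt, 3600)); // wraps to a negative value
    System.out.println(expiryAsLong(lastInt, 3600)); // prints 2147487247
  }
}
```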

Security TODOs documented for future work:
- Race condition in authorization code exchange (requires DB schema changes)
- Rate limiting for OAuth endpoints (requires new infrastructure)

Testing:
- All changes tested with Snowflake OAuth connector
- MCP Inspector connection verified working
- Code formatted with spotless

Breaking Changes: None

* fix: Address security vulnerabilities from code review bots

Implemented fixes based on automated code review bot findings:

**Critical:**
- SSRF prevention: Added URL validation in OAuthSetupHandler to block private IPs and validate schemes
- ThreadLocal leak: Added try-finally cleanup in doGet() to prevent auth context leakage

**High:**
- Removed hardcoded JWT tokens and client secrets (replaced with dynamic UUIDs)
- Added warning logs for missing connector names to improve auditability

Security impact: Prevents internal network access, credential exposure, and auth state leakage.

Testing: All changes formatted with spotless and validated.

* fix: Optimize SSRF prevention per code review bot recommendations

Improved SSRF mitigation based on detailed bot feedback:

**Optimization:**
- Refactored validateTokenEndpoint() → validateAndResolveTokenEndpoint()
- Returns validated URI object to avoid double parsing
- Integrates endpoint resolution and validation in single method
- Reuses URI throughout method to prevent inconsistencies

**Implementation Details:**
- Validates URL scheme, host, and IP ranges
- Blocks private IPs (10.x, 192.168.x, 172.16-31.x)
- Blocks link-local addresses (169.254.x)
- Validates before HTTP request and credential storage

**Benefits:**
- More efficient (single URI parse instead of two)
- Safer (validated URI reused consistently)
- Cleaner code (DRY principle)

Based on GitHub Copilot autofix suggestion for SSRF vulnerability.
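
A rough sketch of the IP-range part of this validation, using only java.net.InetAddress predicates. It is not the actual validateAndResolveTokenEndpoint() code: the class and method names are hypothetical, and the sketch is meant to be called with literal IP addresses (InetAddress.getByName only checks the format of a literal, but would perform a DNS lookup for a host name).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class PrivateAddressCheckSketch {
  /** True for wildcard, loopback, link-local (169.254.x) and site-local (10.x, 172.16-31.x, 192.168.x). */
  static boolean isBlocked(InetAddress address) {
    return address.isAnyLocalAddress()
        || address.isLoopbackAddress()
        || address.isLinkLocalAddress()
        || address.isSiteLocalAddress();
  }

  /** Parses a literal IP address and applies isBlocked; unparseable input is treated as blocked. */
  static boolean isBlockedLiteral(String ipLiteral) {
    try {
      return isBlocked(InetAddress.getByName(ipLiteral));
    } catch (UnknownHostException e) {
      return true; // fail closed
    }
  }

  public static void main(String[] args) {
    System.out.println(isBlockedLiteral("10.0.0.5")); // prints true
    System.out.println(isBlockedLiteral("169.254.169.254")); // prints true
    System.out.println(isBlockedLiteral("8.8.8.8")); // prints false
  }
}
```

A check like this only helps if it is applied to the address the HTTP client will actually connect to, which is why validating before the request and reusing the validated URI matters.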

* fix(mcp-oauth): Critical security fixes per code review bots

- SSRF: Add DNS resolution and validate all resolved IPs for token endpoints
- Race condition: Atomic authorization code exchange prevents replay attacks
- Refresh token: Fix expiry check using ofEpochSecond instead of ofEpochMilli
- Remove unrelated ingestion yaml files from PR

Addresses: CodeQL, Copilot Autofix, Gitar bot feedback
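
The ofEpochSecond versus ofEpochMilli mix-up can be illustrated with a small sketch, assuming the stored expiry is in epoch seconds. Names are hypothetical; the "buggy" variant only shows the unit mismatch and is not a copy of the original code.

```java
import java.time.Instant;

final class RefreshTokenExpirySketch {
  /** Correct: the stored value is epoch seconds, so interpret it as seconds. */
  static boolean isExpired(long expiresAtEpochSeconds, Instant now) {
    return !now.isBefore(Instant.ofEpochSecond(expiresAtEpochSeconds));
  }

  /** Unit mismatch: the same value read as milliseconds lands in January 1970. */
  static boolean isExpiredWrongUnit(long expiresAtEpochSeconds, Instant now) {
    return !now.isBefore(Instant.ofEpochMilli(expiresAtEpochSeconds));
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2026-01-01T00:00:00Z");
    long expiresAt = now.getEpochSecond() + 3600; // valid for one more hour
    System.out.println(isExpired(expiresAt, now)); // prints false
    System.out.println(isExpiredWrongUnit(expiresAt, now)); // prints true
  }
}
```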

* fix(mcp-oauth): Address bot feedback - security and code quality

- Remove shell scripts with hardcoded JWT tokens from PR (added to .gitignore)
- Fix admin fallback: Use ingestion-bot instead of admin for security
- Fix connector name validation: Fail refresh if connector name missing
- Add TODO comments for hardcoded localhost URIs (requires MCPConfiguration wiring)

Addresses bot feedback on security concerns and configuration flexibility

* fix: SSRF - reconstruct URI from validated components

* fix: CodeQL suppression, Y2038 bug, test provider safeguards

* MCP OAuth: implement CORS development mode detection and token cleanup scheduler

- Add development mode detection for CORS origins based on baseUrl
  - Development: allow localhost origins with warning
  - Production: empty allowedOrigins (same-origin only) with warning
- Implement OAuth token cleanup scheduler with Quartz
  - OAuthTokenCleanupJob: deletes expired tokens and auth codes
  - OAuthTokenCleanupScheduler: runs cleanup hourly
  - Prevents unbounded token table growth

* fix: SSRF with allowlist and rate limiting

Use allowlist for OAuth endpoints, add rate limiting (10/5 req/min)

* fix: SSRF, OAuth security, and MySQL schema bugs

- SSRF: Remove user-provided tokenEndpoint, always infer from connector config using allowlist
- Schema: Fix MySQL table names (plural), authorization codes schema, add missing tables
- OAuth: Restore session redirect URI and re-enable nonce validation

* fix: Duplicate clientId variable and missing user_name column in Postgres migration

* security: Remove sensitive OAuth tokens and authorization codes from log statements

* security: Remove sensitive client metadata from registration logs

* chore: Remove connector OAuth infrastructure for user SSO implementation

* feat: Add MCP user SSO OAuth MVP implementation

- Updated database schema (MySQL + PostgreSQL) to use user_name instead of connector_name
- Removed connector OAuth infrastructure (plugins, ConnectorOAuthProvider)
- Created UserSSOOAuthProvider MVP skeleton with TODO markers
- Added comprehensive IMPLEMENTATION_TODO.md tracking all remaining work
- Added QUICK_START.md guide for setup instructions
- Added Claude Desktop configuration example
- Maintained backward compatibility with PAT authentication

See openmetadata-mcp/docs/IMPLEMENTATION_TODO.md for complete implementation checklist

* feat: Complete MCP OAuth SSO flow with database-backed state persistence

This commit implements a robust OAuth SSO flow for MCP server integration
that survives cross-domain redirects during SSO authentication (Google, etc).

Key changes:
- Add mcp_pending_auth_requests table for database-backed state storage
- Add McpPendingAuthRequestRepository for managing pending auth requests
- Add SSOCallbackServlet to handle SSO provider callbacks
- Add handleDirectIdTokenFlow for already-authenticated users (pac4j token flow)
- Add HtmlTemplates for secure error pages with XSS protection
- Add Claude Desktop OAuth bridge script for stdio transport integration
- Fix OIDC_CREDENTIAL_PROFILE constant shadowing issue
- Fix Postgres schema references to non-existent connector_name column
- Restore pac4j session attributes (State, Nonce, CodeVerifier) correctly

The solution stores OAuth state in the database instead of HTTP sessions,
which fail across cross-domain redirects due to SameSite cookie policy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Critical OAuth security fixes - thread safety, URL encoding, JWT validation, PKCE validation

* fix: Complete ThreadLocal migration for currentRequest.getSession()

* feat: Add development bypass for PKCE validation to enable local testing

* feat: Add OAuth support with ID token validation, refresh tokens, and security fixes

- Add JWKS-based ID token signature validation
- Implement refresh token generation and exchange with rotation
- Add redirect URI validation to prevent open redirect attacks
- Fix clock skew logic and time unit consistency
- Add comprehensive test coverage (15 tests)

* fix: Critical OAuth security fixes - client validation, redirect URI validation, error handling, Fernet decryption

- Add client ID validation in token exchange (prevents authorization code theft)
- Add redirect URI validation in token exchange (RFC 6749 Section 4.1.3)
- Fix time unit inconsistency in OAuthAuthorizationCodeRepository
- Improve error handling to distinguish replay attacks from expired codes
- Add user status validation in refresh token exchange
- Fix session regeneration to prevent session fixation attacks
- Add username/email validation in SSO callback handlers
- Improve Fernet decryption error handling for key rotation scenarios

All tests passing (15/15)

* fix: Clean up pom.xml - fix malformed dependency and remove duplicate dropwizard-jersey

* Java checkstyle fix

* fix: Addressing issues raised by Gitar code review

* fix: Merge McpAuthFilter changes - add impersonation support while preserving OAuth endpoints

* docs: Add comprehensive README for MCP OAuth implementation

* feat: Add MCP OAuth dynamic client registration

* feat: Add OAuth token revocation endpoint (RFC 7009)

* fix: OAuth basic auth flow - auto-redirect with code and optional scope enforcement

* feat: Match MCP auth page design to OpenMetadata signin UI

* fix: Support separate callback URLs for MCP OAuth and web login flows

* feat: Add OAuth scope enforcement, domain validation and session handling for MCP

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: Improve MCP OAuth login UI and add TODO for success page

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: MCP OAuth cleanup - security fixes, remove redundant scope system, improve error handling

- Fix timing attacks in CSRF and PKCE validation using MessageDigest.isEqual()
- Remove redundant @RequireScope system (OpenMetadata Authorizer handles permissions)
- Make OAuth scopes provider-aware (Google/Okta/Azure)
- Add baseUrl config to MCPConfiguration for cluster deployments
- Delete duplicate RootOAuthEndpointsResource (handled by OAuthWellKnownFilter)
- Fix silent failures: propagate errors instead of returning null/200
- Downgrade excessive logging to DEBUG level
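
A minimal sketch of a PKCE S256 check that compares with MessageDigest.isEqual() instead of String.equals(), so the comparison time does not depend on where the first mismatching byte is. The class and method names are hypothetical; the challenge derivation follows RFC 7636 (BASE64URL of the SHA-256 of the verifier, without padding).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class PkceCheckSketch {
  /** Derives the S256 code_challenge for a code_verifier. */
  static String s256Challenge(String codeVerifier) {
    try {
      byte[] digest =
          MessageDigest.getInstance("SHA-256")
              .digest(codeVerifier.getBytes(StandardCharsets.US_ASCII));
      return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is required on every Java platform", e);
    }
  }

  /** Compares the derived challenge with the stored one in constant time. */
  static boolean verify(String codeVerifier, String storedChallenge) {
    byte[] expected = storedChallenge.getBytes(StandardCharsets.US_ASCII);
    byte[] actual = s256Challenge(codeVerifier).getBytes(StandardCharsets.US_ASCII);
    return MessageDigest.isEqual(expected, actual);
  }

  public static void main(String[] args) {
    // Test vector from RFC 7636, Appendix B.
    String verifier = "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk";
    String challenge = "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM";
    System.out.println(verify(verifier, challenge)); // prints true
  }
}
```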

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update generated TypeScript types

* fix: Move OAuth migrations from 1.12.1 to 1.12.0

- Consolidate OAuth schema tables into 1.12.0 migration
- Add Snowflake backward compatibility migration to 1.12.0
- Remove empty 1.12.1 migration folder
- Update README with security enhancements and permission model

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: critical OAuth security and reliability issues

Fix ThreadLocal leak, atomic token rotation, PKCE validation, fail-closed error handling, and password sanitization

* fix: URL encode authorization code

* fix: MCP OAuth stateless transport compatibility and SSO initialization reliability

* feat: Add MCP configuration to database settings system

- Create mcpConfiguration.json schema for MCP-specific settings
- Add MCP_CONFIGURATION to SettingsType enum
- Add MCP configuration bootstrap logic to SettingsCache
- Extend SecurityConfigurationManager with MCP config support
- Add mcpConfiguration field to OpenMetadataApplicationConfig
- Update MCPConfiguration.java with timeout settings and comments

* feat: Complete McpServer dynamic configuration resolution

- Add getBaseUrlFromConfig() to read from SecurityConfigurationManager with fallback
- Add getAllowedOriginsFromConfig() for database-backed CORS configuration
- Remove hardcoded baseUrl and CORS origins initialization
- Remove System.setProperty for HTTP timeouts (will be handled per-request)
- Fix SSO handler to use dynamic resolution via getInstance()
- Fix NoSuchAlgorithmException import in UserSSOOAuthProvider
- All configuration now comes from database via SecurityConfigurationManager

* Update generated TypeScript types

* feat: Add database-backed MCP configuration with dynamic reload

- Add GET/PUT /api/v1/system/mcp/config API endpoints for MCP configuration management
- Refactor SSOCallbackServlet to read claims/domains/validators dynamically from SecurityConfigurationManager
- Add configuration reload support to OAuthHttpStatelessServerTransportProvider (volatile allowedOrigins, updateAllowedOrigins method)
- Implement ConfigurationChangeListener pattern in SecurityConfigurationManager for component notification
- Add HTTP timeout configuration (connectTimeout/readTimeout) to AuthenticationCodeFlowHandler from MCP config
- All configuration stored in open_metadata_settings table with SecurityConfigurationManager as single source of truth

* fix: Add volatile config fields, CopyOnWriteArrayList, null checks, and correct HTTP timeout properties

* Remove hardcoded OAuth credentials and unrelated Snowflake migration

* Fix HTTP timeout system properties and session regeneration null check

* Implement cluster polling, DB-first loading, listener pattern, and fix race conditions

* added unit tests

* removed connector OAuth code

* updated readme

* fix: MCP OAuth cleanup — security fixes, migration move, and code quality

- Move OAuth SQL migrations from 1.12.0 to 1.12.1 (release target)
- Fix XSS in auth error page (no longer reflects exception messages into HTML)
- Fix CSRF bypass in state validation (throw instead of return-after-write)
- Fix token expiration check in BearerAuthenticator (millis vs seconds mismatch)
- Require S256 code_challenge_method explicitly (reject null/plain)
- Fix GetLineageTool: use VIEW_BASIC auth, add input validation, use singleton LineageRepository
- Rename SESSION_GOOGLE_CALLBACK_URL to SESSION_SSO_CALLBACK_URL (provider-agnostic)
- Remove 10-second config polling from SecurityConfigurationManager (use SettingsCache TTL)
- Remove unnecessary synchronized on volatile field getters
- Downgrade verbose LOG.info calls to LOG.debug (session state, admin principals, tokens)
- Fix FQN imports in AuthenticationCodeFlowHandler (MCPConfiguration, Role)
- URL-encode redirect parameters (id_token, email, name)
- Remove invalid "default": null from defaultOAuthRole JSON schema
- Add error logging in AuthorizationHandler.exceptionally() block
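
The S256 requirement in the list above can be sketched as follows. This is a minimal Python illustration of the check described in RFC 7636, not the server's Java implementation; `verify_pkce` is a hypothetical name.

```python
import base64
import hashlib
import hmac
from typing import Optional


def verify_pkce(code_verifier: str, code_challenge: str, method: Optional[str]) -> bool:
    """Accept only S256 challenges; a missing or 'plain' method is rejected."""
    if method != "S256":
        return False
    digest = hashlib.sha256(code_verifier.encode("utf-8")).digest()
    # RFC 7636: BASE64URL-encode the SHA-256 digest without '=' padding.
    computed = base64.urlsafe_b64encode(digest).rstrip(b"=")
    return hmac.compare_digest(computed, code_challenge.encode("utf-8"))
```

Rejecting `None` explicitly matters because a server that treats a missing method as `plain` lets a client skip the hash entirely.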

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add TODOs for unfixed security review findings

* fixed critical review issues: added client_secret validation, registration rate limiting, session regeneration bug, exact path matching, dead code removal

* fixed auth filter 500→401 for invalid tokens, exact path matching in transport provider

* added revocation client auth, redirect URI scheme validation, ID token validation in SSO flow, rate limiter race fix, downgraded PII logging to DEBUG

* fix MCP config loading to use getSettingOrDefault, cache IdTokenValidator

* google sso login working here

* add basic auth login flow for MCP OAuth, fix web UI redirect_uri_mismatch

* revert cosmetic UI formatting changes accidentally introduced in merge

* fix CodeQL info exposure and GitarBot security findings: redirect_uri validation, pac4j race condition

* harden MCP OAuth: fix error handling, remove dead code, prevent info leaks

* remove dead code and harden MCP OAuth: delete 5 unused files, inline metadata handlers, add PKCE validation, fix error handling

* fix GitarBot findings: restrict HTTP redirects to loopback, add token rate limiting, restore GET 405, deny-all CORS fallback, reduce JWK cache TTL

* fix Azure SSO: always register callback servlet, use baseUrl for token exchange, show success page

* security hardening: early user check, ID token audience validation, token rotation, shorter JWT TTL

* LDAP support, allow native app redirect schemes, tolerate unknown registration fields

* fix open redirect in MCP callback detection, check auth code expiry before consumption, warn on fallback baseUrl

* null safety for PKCE, grant_type, and refresh_token params in token endpoint

* fix RevocationHandler test exception type mismatch

* add registration metadata length validation, fix loopback host check

* fix MCP OAuth SSO callback for Okta: use registered redirect_uri, fix pac4j session attribute names, forward /callback to /mcp/callback

* fix missing return in MCP callback error path, skip SSO registration for basic/ldap, improve comment

* MCP OAuth security hardening: bcrypt secrets, atomic CAS rotation, XFF rate limiting, review fixes

* fix XFF rate-limit bypass: validate IP format, cap map size to prevent heap exhaustion
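
A rough Python sketch of the two measures named in this bullet, under stated assumptions: the client key is taken from `X-Forwarded-For` only when it parses as an IP address, and the per-client map is bounded. `client_key`, `BoundedCounter`, and `MAX_TRACKED_CLIENTS` are hypothetical names, and the eviction policy shown (drop the oldest entry) is an assumption rather than the project's choice.

```python
import ipaddress
from collections import OrderedDict
from typing import Optional

MAX_TRACKED_CLIENTS = 10_000  # assumed cap, not taken from the commit


def client_key(xff_header: Optional[str], remote_addr: str) -> str:
    """Use the first XFF entry only if it is a valid IP; else use the socket address."""
    if xff_header:
        candidate = xff_header.split(",")[0].strip()
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            pass  # malformed value: ignore the header
    return remote_addr


class BoundedCounter:
    """Per-client hit counter whose size never exceeds max_entries."""

    def __init__(self, max_entries: int = MAX_TRACKED_CLIENTS) -> None:
        self._max_entries = max_entries
        self._counts: "OrderedDict[str, int]" = OrderedDict()

    def hit(self, key: str) -> int:
        count = self._counts.pop(key, 0) + 1
        self._counts[key] = count  # re-insert as the most recent entry
        while len(self._counts) > self._max_entries:
            self._counts.popitem(last=False)  # evict the oldest entry
        return count

    def __len__(self) -> int:
        return len(self._counts)
```

Without the format check, arbitrary header strings create unlimited distinct keys; without the cap, those keys accumulate on the heap.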

* move MCP OAuth migrations from 1.12.2 to 1.12.3, remove unused oauth_audit_log table, simplify

* fix client_secret_basic removal, MySQL index idempotency, token auto-delete on decrypt failure

* Update generated TypeScript types

* Update generated TypeScript types

* fix impersonation compatibility after McpAuthFilter deletion

* hash authorization codes with SHA-256 before storing in DB
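
As an illustration of what hashing authorization codes before storage means, here is a minimal Python sketch. A `dict` stands in for the database table, and the function names are hypothetical; the actual schema and Java code are not shown in this log.

```python
import hashlib
import secrets
from typing import Dict, Optional


def _digest(code: str) -> str:
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def issue_authorization_code(store: Dict[str, str], client_id: str) -> str:
    """Return the raw code to the client; persist only its SHA-256 digest."""
    code = secrets.token_urlsafe(32)
    store[_digest(code)] = client_id
    return code


def consume_authorization_code(store: Dict[str, str], code: str) -> Optional[str]:
    """Look the code up by digest and remove it so it cannot be replayed."""
    return store.pop(_digest(code), None)
```

With this layout, a leaked table row does not reveal a usable authorization code.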

---------

Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2026-03-23 16:01:21 +05:30
Ram Narayan Balaji
df458dceb4 Revert "Feat# Include Fields Filter in EventBased Workflows and CheckChangeDescription Node (#26230)"
This reverts commit ddac3661d9.
2026-03-11 23:04:21 +05:30
Ram Narayan Balaji
ddac3661d9 Feat# Include Fields Filter in EventBased Workflows and CheckChangeDescription Node (#26230)
* Include Fields in EventBased Workflows - Initial Commit

* Update generated TypeScript types

* Fix Include fields to be a map of arrays, Introduce checkChangeDescriptionTask as a separate node

* Update generated TypeScript types

* Extract common code into field value extractor

* chore: apply changes

Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>

* java checkstyle

* Fix Compilation errors

* Fix NPE bug

* Test fixes and improvements

* chore: apply changes

Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>

* Schema Changes for include fields and check change description

* Update generated TypeScript types

* Fixed 4 valid code review issues: migration idempotency bug (preventing false failures on re-runs), empty pattern string vulnerability (preventing unintended filter bypasses),
  removed unused dead code method, and corrected Javadoc inconsistency from {} to [] notation.

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: yan-3005 <yan-3005@users.noreply.github.com>
Co-authored-by: Anujkumar Yadav <anujf0510@gmail.com>
(cherry picked from commit bb6a99b953)
2026-03-11 07:15:18 +00:00
Trang Nguyen [INT-DE]
1646ef0b01 Fixes #26225: Add index and FORCE INDEX for listLastTestCaseResultsForTestSuite (MySQL) (#26235)
* ISSUE-26225: add index idx_entity_timestamp_desc for data_quality_data_time_series

* ISSUE-26225: add index idx_entity_timestamp_desc for data_quality_data_time_series

* Update bootstrap/sql/migrations/native/1.12.2/mysql/schemaChanges.sql

* ISSUE-26225: fix the suggestion

---------

Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
(cherry picked from commit de2e703fdd)
2026-03-09 07:03:15 -07:00
Pere Miquel Brull
51c9b6af4a Revert "Rename app 'preview' property to 'enabled' (#26170)"
This reverts commit e6e15a1120.
2026-03-06 07:24:38 +01:00
Pere Miquel Brull
e6e15a1120 Rename app 'preview' property to 'enabled' (#26170)
* Rename app 'preview' property to 'enabled' with inverted semantics

The 'preview' property was confusing: preview=false meant the app CAN
be used. Replace with 'enabled' where enabled=true means usable, which
is much more intuitive.

Changes across the full stack:
- JSON schemas: preview (default false) → enabled (default true)
- Java backend: isPreview/raisePreviewMessage → isEnabled/raiseNotEnabledMessage
- TypeScript types: preview → enabled
- Frontend component: isPreviewApp → isAppDisabled (checks enabled===false)
- SQL migrations for 1.11.12: rename + invert boolean in apps_marketplace
  and installed_apps tables (MySQL and PostgreSQL)
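
The "rename + invert" step can be sketched on a single JSON document as follows. This is a Python illustration only; the real change is a SQL migration whose statements are not reproduced here, and `migrate_app_definition` is a hypothetical name.

```python
def migrate_app_definition(app: dict) -> dict:
    """Replace 'preview' with 'enabled', inverting the boolean."""
    migrated = dict(app)  # shallow copy so the input is not modified
    if "preview" in migrated:
        preview = bool(migrated.pop("preview"))
        migrated["enabled"] = not preview
    # Documents without 'preview' are left alone: the schema default
    # (enabled=true) matches the old default (preview=false).
    return migrated
```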

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update generated TypeScript types

* format

* improve deletion process for disabled apps

* improve deletion process for disabled apps

* improve deletion process for disabled apps

* improve deletion process for disabled apps

* format

* fix tests

* migration

* migration

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-05 10:41:08 +01:00
Sid
1f4e24d953 fix glossary status frontend filtering logic to move to backend (#25428)
* fix glossary status

* add glossaryTerm spec

* fix: improve ListFilter implementation in list filtering logic

Co-authored-by: siddhant1 <siddhant1@users.noreply.github.com>

* reset main backend

* reset backend

* fix be

* revert

* spotless

* Fix GlossaryTerm search API endpoint

* status enum validation

* fix spec

* Replace quotes, validate enum

* bind param queries

* Move migrations to 1.12.0

* fix api docs

* optimize performance of fallback, refactoring

* fix ListFilter

* GlossaryTermService.java cleanup

* address gitar-bot feedback

* add entityStatus param in list api

* add entityStatus param in list api

* Send entityStatus param with both search and list glossary term APIs

- Pass entityStatus to searchGlossaryTermsPaginated and
  getFirstLevelGlossaryTermsPaginated when a specific status filter
  is active (not 'all')
- Keep 'All' option in status dropdown with default selection of
  Approved, Draft, InReview
- Show appropriate empty state message when status filter returns
  no results

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* update list API path (ListFilter.getEntityStatusCondition) to validate against the enum, in case an invalid value like "Bogus" is passed
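
A minimal Python sketch of validating a status filter against an enum before it reaches a query. The enum members below are illustrative (taken from the status names mentioned elsewhere in this commit message), and `parse_entity_status` is a hypothetical helper, not `ListFilter.getEntityStatusCondition` itself.

```python
from enum import Enum


class EntityStatus(Enum):
    # Illustrative members only; the real enum may contain other values.
    APPROVED = "Approved"
    DRAFT = "Draft"
    IN_REVIEW = "InReview"
    REJECTED = "Rejected"


def parse_entity_status(raw: str) -> EntityStatus:
    """Return the matching enum member or raise ValueError for unknown input."""
    try:
        return EntityStatus(raw)
    except ValueError:
        allowed = ", ".join(status.value for status in EntityStatus)
        raise ValueError(
            f"Invalid entityStatus '{raw}'. Allowed values: {allowed}"
        ) from None
```

Only the validated enum value would then be bound as a query parameter, so a value like "Bogus" is rejected before any SQL is built.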

* fix playwright

* Fix rejected glossary term staying visible in listing

Remove rejected terms from visible list when status filter excludes
them, and fix reused waitForResponse promise in Playwright test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add initial load

* Fix Expand All ignoring active status filter and add E2E tests

Pass entityStatus parameter in fetchExpadedTree so Expand All respects
the active status filter. Add E2E test suite to verify the behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Rewrite Glossary Expand All E2E tests to follow Playwright handbook patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix flaky GlossaryPagination test by scoping locators to glossary table

Scoped unscoped `tbody .ant-table-row` locators to `glossary-terms-table`
testid, and replaced unreliable row count assertion in empty state test
with visibility checks on `no-data-placeholder`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Siddhant <siddhant@MacBook-Pro.local>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: siddhant1 <siddhant1@users.noreply.github.com>
Co-authored-by: Ram Narayan Balaji <ramnarayanb3005@gmail.com>
Co-authored-by: Ram Narayan Balaji <81347100+yan-3005@users.noreply.github.com>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
Co-authored-by: Siddhant <siddhant@MacBook-Pro-3.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Siddhant <siddhant@MacBook-Pro-4.local>
(cherry picked from commit 12d85f310f)
2026-03-05 10:37:35 +05:30
Mayur Singal
6db17aa2dd Fix #26178: Add support for IAM auth for redshift (#26179)
* Fix #26178: Add support for IAM auth for redshift

* Missing files for the implementation

* Update generated TypeScript types

* address gitar comments

* address comments

* fix python tests

* fix redshift playwright

* fix checkstyle

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-04 10:51:58 +05:30
harshsoni2024
7f10c5d135 Fix-20713: Add support for metadata ingestion using local file in REST connector (#26036) 2026-03-02 11:02:11 +05:30
Himanshu Khairajani
7b16e74e32 Openlineage: Added Kinesis Support #24752 (#26050)
* Openlineage Kinesis Support

* Update generated TypeScript types

* marking field as required

* test-connection name improvement

* pagination improvement

* test-connection name improvement

* Update generated TypeScript types

* nested broker-config migration file

* newline added to yaml

* Migration to 1.11.2

* Migration to 1.11.12*

* fix: add throttle mechanism to kinesis get_records loop

Co-authored-by: Khairajani <Khairajani@users.noreply.github.com>

* fix: prevent timeout reset on sequential shard polling

Co-authored-by: Khairajani <Khairajani@users.noreply.github.com>

* Kinesis test-case

* Kinesis test-case

* setting lineageInformation object model and not raw dict

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: Khairajani <Khairajani@users.noreply.github.com>
(cherry picked from commit cf0fa0a519)
2026-02-26 08:52:03 +00:00
Mohit Yadav
d5783dd021 Optimize indexing Processing to EsDoc (#26079)
* Optimize Reads with Keyset

* Optimize Search Index Processing stage

* Fix KeySet Cursor

* revert keyset for time series

* Fix Review Comments

* Move to 1.12.2

* Fix Review Comment

* Remove IF NOT EXISTS from mysql and update common method

(cherry picked from commit 82b9d34806)
2026-02-26 09:56:48 +05:30
Pere Miquel Brull
f056f9edd4 MINOR - Allow app definition to pass the impersonation rules for bots (#25909)
* MINOR - Streamline bot impersonation from apps

* MINOR - Streamline bot impersonation from apps

* MINOR - Streamline bot impersonation from apps

* MINOR - Streamline bot impersonation from apps

* Update generated TypeScript types

* policy flag

* policy flag

* policy flag

* policy flag

* fix feedback

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-17 19:53:26 +01:00
Ram Narayan Balaji
91239164f5
delete workflow instance entries if status is null in migration (#25867) 2026-02-13 16:00:35 +05:30
Ram Narayan Balaji
f418203338
Fix: Resolve v1.12.0 migration failure due to NULL workflow status (#25834)
* Fix: Resolve v1.12.0 migration failure due to NULL workflow status

  ## Root Cause Analysis
  - Migration failed when modifying entityLink column in workflow_instance_time_series
  - MySQL's ALTER TABLE MODIFY COLUMN re-validates ALL generated columns for ALL rows
  - Found 184+ workflow instances created between Dec 2024 - Jan 2025 with NULL status
  - These were created with pre-v1.7.0 code that didn't set status field in JSON
  - v1.7.0 added status column as GENERATED NOT NULL but old instances had NULL values
  - v1.12.0 migration triggered constraint validation, causing "Column 'status' cannot be null"

  ## Solution
  - Add UPDATE statements before ALTER TABLE in v1.12.0 migration
  - Set status='FINISHED' for workflows with endedAt (completed)
  - Set status='FAILED' for workflows without endedAt (incomplete)
  - Use two separate queries for better performance vs CASE statements
  - Handle both workflow_instance_time_series and workflow_instance_state_time_series
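
  The backfill described in the Solution list can be sketched with an in-memory SQLite table. This is a simplified Python illustration: the real tables store the status inside a JSON document behind a generated column, whereas `workflow_instance` here is a hypothetical table with plain columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflow_instance (id TEXT PRIMARY KEY, status TEXT, endedAt INTEGER)"
)
conn.executemany(
    "INSERT INTO workflow_instance VALUES (?, ?, ?)",
    [
        ("a", None, 1735689600000),  # finished, but status was never written
        ("b", None, None),           # never finished
        ("c", "RUNNING", None),      # already has a status: must stay untouched
    ],
)

# Two separate UPDATEs, as in the list above, run before any ALTER TABLE.
conn.execute(
    "UPDATE workflow_instance SET status = 'FINISHED' "
    "WHERE status IS NULL AND endedAt IS NOT NULL"
)
conn.execute(
    "UPDATE workflow_instance SET status = 'FAILED' "
    "WHERE status IS NULL AND endedAt IS NULL"
)

rows = dict(conn.execute("SELECT id, status FROM workflow_instance"))
```

  Once no row has a NULL status, a later column modification no longer trips the NOT NULL re-validation described in the root-cause analysis.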

* failed to FAILURE status
2026-02-12 19:32:57 +05:30
Sriharsha Chintalapani
b244798f22
Add bulk apis for pipeline status (#25731)
* Add bulk apis for pipeline status

* Update generated TypeScript types

* Fix gitar comments

* Update generated TypeScript types

* Fix pycheck

* Address comments

* Fix databricks test

* Move schema changes to 1.11.9

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
2026-02-10 18:14:06 +05:30
Sriharsha Chintalapani
6f577656c1
Fix integration tests (#25753)
* Fix - disk space in github workflows

* Fix - disk space in github workflows

* Fix - disk space in github workflows

* Fix running tests with bulk apis

* Fix running tests with bulk apis

* Address comments; make awaitability for tests

* Address comments
2026-02-08 21:16:28 -08:00
sonika-shah
30a4d32720
Fix entity version history of dataProducts after removing inputPorts/ field (#25702) 2026-02-05 11:59:24 +05:30
Aleksei Sviridkin
b2ac6f70d9
Fixes #24546: Add sobjectNames field for multi-object selection in Salesforce connector (#24547)
* feat(salesforce): add sobjectNames field for multi-object selection

Add support for specifying multiple Salesforce objects to ingest
instead of just one or all. The new `sobjectNames` array field
allows users to select specific objects (e.g., Contact, Account,
Lead) without having to ingest all objects and filter them.

Priority order:
1. sobjectNames (array) - if specified, use only these
2. sobjectName (string) - if specified and sobjectNames empty
3. All objects from describe() - if neither specified

tableFilterPattern applies in all cases as a final filter.
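
The priority order above can be sketched in Python as follows. `select_sobjects` is a hypothetical helper, `described` stands in for the object names returned by `describe()`, and the single `include_pattern` regex is a simplification of `tableFilterPattern`.

```python
import re
from typing import Iterable, List, Optional


def select_sobjects(
    sobject_names: Optional[Iterable[str]],
    sobject_name: Optional[str],
    described: Iterable[str],
    include_pattern: Optional[str] = None,
) -> List[str]:
    """Pick the candidate objects by priority, then apply the final filter."""
    if sobject_names:
        candidates = list(sobject_names)   # 1. explicit list wins
    elif sobject_name:
        candidates = [sobject_name]        # 2. single object
    else:
        candidates = list(described)       # 3. everything from describe()
    if include_pattern is None:
        return candidates
    regex = re.compile(include_pattern)
    return [name for name in candidates if regex.search(name)]
```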

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>

* refactor: removed sobjectName field and added a migration for 1.11.8 to migrate sobjectName values to sobjectNames

* fix: sobjectNames priority comment

* refactor: sobjectNames changes in ts files

* fix: yaml structure in test_salesforce

* fix: test_salesforce.py - metadata as OpenMetadata object

* fix: added new line in sql migrations

* fix: sql migration serviceType

---------

Signed-off-by: Aleksei Sviridkin <f@lex.la>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Keshav Mohta <keshavmohta09@gmail.com>
Co-authored-by: Keshav Mohta <68001229+keshavmohta09@users.noreply.github.com>
Co-authored-by: Mohit Yadav <105265192+mohityadav766@users.noreply.github.com>
2026-02-02 16:05:59 +01:00
Ajith Prasad
f1fe02daff
Moved AI Application and LLM Model entities migrations to 1.12.0 (#25659) 2026-02-02 08:50:37 +01:00
Himanshu Khairajani
e86a0201ab
Fix #25645: MySQL timestamp precision for tag_usage.appliedAt (#25643)
* Fix MySQL timestamp precision for tag_usage.appliedAt

MySQL's TIMESTAMP type defaults to second precision, while PostgreSQL
returns microsecond precision. This causes _normalize_datetime_strings
in the Python ingestion client to produce spurious appliedAt diffs in
JSON patches, which then fail with "Failed to convert JsonValue to
target class" during deserialization in JsonUtils.applyPatch().

Upgrade appliedAt to TIMESTAMP(6) to match PostgreSQL behavior and
eliminate the spurious patch diffs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add 1.11.8 migration for MySQL appliedAt timestamp precision

Backport the TIMESTAMP(6) fix to the 1.11.x release line so existing
deployments on 1.11.x pick up the fix without requiring a 1.12.0 upgrade.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 12:46:19 +01:00
sonika-shah
cec1829645
Fix DataProduct inputPorts/outputPorts orphaned fields migration issue after migration from 1.10.x to 1.12.x (#25634)
* Fix DataProduct inputPorts/outputPorts orphaned fields migration issue after migration from 1.10.x to 1.12.x

* escape ? as ?? for JDBI
2026-01-30 15:26:48 +05:30
mohitdeuex
fcc0c1d944 Drop constraint from postgres 2026-01-29 22:24:45 +05:30
Mohit Yadav
21750aaa90
Feature/search indexing issues (#25594)
* Add design doc for search indexing stats redesign

Covers:
- Simplified 4-stage pipeline model (Reader, Process, Sink, Vector)
- Per-entity index promotion instead of batch promotion
- Alias management from indexMapping.json
- Payload-aware vector bulk processor

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add Support for Per Entity Index Promotion

* Add UI Bit

* Add Lang

* Add AppLog View Test coverage

* Add Batched Vector index querying

* Add Improvements for Vector to be async and also stats to be better handled

* Use Virtual Thread

* Use Virtual Thread

* Fix Tests

* Make reading stats easier

* Fixed Stats to be accurate

* Fix Stats getting null

* Fix partition worker stats

* Fix Reader Stats - final

* Update generated TypeScript types

* Make updates in 1.12.0

* Revert "Use Virtual Thread"

This reverts commit 4eb23374d1.

* Revert "Use Virtual Thread"

This reverts commit efe8d03b5d.

* Reapply "Use Virtual Thread"

This reverts commit d59cde18b2.

* Reapply "Use Virtual Thread"

This reverts commit 769e5710c3.

* Fix Final Update on stat

* - Add atomic alias swap
- remove unnecessary migration

* Fix Sonar test jest

* Fix Final Update on stat

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-29 18:50:39 +05:30
Mohit Yadav
0129f274ed
ReApply changes Fix Stats Issue and Add Tests (#25521)
* Fix Issue and Add Tests

* Update generated TypeScript types

* Fix CI jest failure

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-26 21:10:23 +05:30
Teddy
9e77872972
ISSUE #25482 - rule library validator implementation (#25497)
* feat(rule library): extend safe token

* feat(rule library): added validator class to testDefinition

* chore

* feat(rule library): implement validator logic

* feat(rule library): fix runtime errors

* feat(rule library): implement table level rule library

* feat(rule library): implement integration test for rule library

* feat(rule library): ran python linting

* feat(rule library): fix wrong import

* feat(rule library): added logic to catch template error

* feat(rule library): fix test to handle new validator class behavior

* feat(rule library): fix test to handle new validator class behavior
2026-01-25 16:58:38 +01:00
Sriharsha Chintalapani
b09f4828c4
Learning Resources (#25005)
* Add Learning Resources with-in product

* Translations

* Add Learning Resources in-line with-in product

* Add Learning Resources in-line with-in product

* Potential fix for code scanning alert no. 1844: Incomplete URL substring sanitization

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Update generated TypeScript types

* Update the design

* Update the design

* Add learning resources

* Update generated TypeScript types

* Add learning resources

* Update generated TypeScript types

* Address comments

* Address comments

* fixed build issue

* fix java checkstyle

* fixed initial bugs

* fixed less file name

* resolve conflict

* fixed failing unit test

* Address update issues, add more playwright tests

* Address update issues, add more playwright tests

* fixed code quality and updated all the missed pages with learning icon

* fixed invalid translation

* Added icon for rules library

* fixed unit tests

* replaced string with constants

* addressed comments

* resolved backend merge conflict

* removed plural label

* fixed header actions position

* fixed git-r comment

* added fixme to a test

* fixed label

* fixed flaky test

* Update generated TypeScript types

* removed playwright config file

* hide column view

* playwright fixes

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Parmar <83108871+dhruvjsx@users.noreply.github.com>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
2026-01-25 07:20:14 -08:00
Eugenio
ce007263ef
Improve TagLabel with rich metadata (#25472)
* Ensure columns are retrieved in the right order

This is needed because, since ordering was introduced for `getTableColumnsByFQN`, the patches created in `removeTagFromEntity` could point to the wrong columns when the default order did not match the order in which the columns were persisted in the database

* Allow exception list to be updated on all feedback

* Apply gitar comments

* Add `metadata` to `tag_usage` table

* Update JSON schema object to include `TagLabel.metadata`

* Apply feedback to selected recognizer

* Add backend integration tests

* Update `ingestion` to return `TagLabel.metadata.recognizer`

* Update generated TypeScript types

* Update generated TypeScript types

* Send recognizer result metadata in feedback approval task (#25485)

* Send `TagLabelRecognizerMetadata` in `TaskDetails`

This is so we can show an explanation behind the classification in the feedback approval card

* Update typescript types

* Run Spotless

* Ensure `applyTagsBatchInternal` works equally for pg and mysql

* Tag metadata fixes

* Fix CI test

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Rohit Jain <60229265+Rohit0301@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2026-01-24 10:09:06 +01:00
mohitdeuex
c006bdb2b0 Revert "Fix stats and Improve Search with Insights (#25495)"
This reverts commit 19725a7130.
2026-01-24 11:53:51 +05:30
Mohit Yadav
19725a7130
Fix stats and Improve Search with Insights (#25495)
* Fix Stats

* Add Warning logs and reindex failure analysis

* Add Search Insights in Preferences

* Add Label

* Fix Full Error not available

* Add check for reindex run
2026-01-24 10:27:46 +05:30
Pere Miquel Brull
6aa5a7f033
FIX #24374 - Data Contract at Data Product level (#25314)
* FIX #24374 - Data Contract at Data Product level

* Update generated TypeScript types

* FIX #24374 - Data Contract at Data Product level

* fix DP page

* fix: preserve termsOfUse object format in filtered contract

The termsOfUse field was being converted to a string during filtering,
but the form components expect it to be an object with {content: string}.
This was causing test failures where form elements were not visible.

- Keep termsOfUse as object format when not inherited
- Convert old string format to new object format for consistency
- Fixes 21 test failures in DataContracts.spec.ts and DataContractInheritance.spec.ts
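
The string-to-object conversion described above can be sketched as follows. The actual change is in the TypeScript frontend; this Python version only illustrates the shape handling, and `normalize_terms_of_use` is a hypothetical name.

```python
from typing import Optional, Union


def normalize_terms_of_use(value: Union[str, dict, None]) -> Optional[dict]:
    """Return termsOfUse in the {content: string} shape the form expects."""
    if value is None:
        return None
    if isinstance(value, str):
        return {"content": value}  # legacy string format
    return value                   # already in object format: keep as-is
```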

* fix: address code review findings - state sync and immutability

Frontend changes:
- Add useEffect to sync formValues with filteredContract changes
- Ensures edit form updates when contract prop changes

Backend changes:
- Create deep copy at start of mergeContracts() to avoid mutating input
- Prevents side effects if contract object is reused elsewhere
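
A small Python sketch of why the deep copy matters. The merge rule shown (non-null fields of the entity's own contract override inherited ones) is an assumption made for illustration; only the copy-before-mutate step reflects the bullet above, and `merge_contracts` here is not the Java `mergeContracts()`.

```python
import copy


def merge_contracts(inherited: dict, own: dict) -> dict:
    """Merge two contract dicts without mutating either input."""
    merged = copy.deepcopy(inherited)  # work on a private copy
    for field, value in own.items():
        if value is not None:
            merged[field] = copy.deepcopy(value)
    return merged
```

With a shallow copy, editing a nested object in the result would also change the inherited contract that other callers may still hold.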

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

* Addressing feedback

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

* fix tests

* fix inherited contract delete and status

* fix inherited contract delete and status

* fix inherited contract execution in app

* fix test

* fix: resolve playwright postgresql ci test failure

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

* ci: fix yaml validation and checkstyle failures

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

* fix: correct JSON/YAML validation errors

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

* fix: resolve maven-collate and ui-coverage test failures

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

* gitar feedback

* fix ci

* fix ci

* fix ci

* fix ci

* include .claude

* validate

* fix playwright

* playwright

* fix playwright

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gitar <gitar@collate.io>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: karanh37 <karanh37@gmail.com>
2026-01-23 07:01:53 +01:00
Sriharsha Chintalapani
89f627da81
Distributed Search Indexing with Push Notifications (#24939)
* Add Distributed Indexing in Multi-Server scenarios

* Add Distributed Indexing in Multi-Server scenarios

* Update generated TypeScript types

* Handle Servers leaving and joining

* Update generated TypeScript types

* spotless fix

* Refactor Code for Single Server and Multiple Server

* Add Metrics and Search Index Orphaned Cleanup

* Add Language

* Add Test settings

* Add Test data

* Add Test data

* Update generated TypeScript types

* Add Load Test for more entities

* Add Stats fix

* Add server information

* Fix Staging Index unavailable to DistributedJobParticipant

* Fix Stats issue

* Align Tests

* Fix Stats and Error Handling

* participant stat fix

* Fix coordinator stats

* Add E2E failure tests

* Fix Stats for Reader and Sink

* Added flush for sinking stats

* Add language label

* Fix Entity Build Errors

* Missing commit

* Update generated TypeScript types

* Change runId to serverId

* Fix test failures

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mohit Yadav <105265192+mohityadav766@users.noreply.github.com>
Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
2026-01-23 06:12:05 +05:30
Teddy
83143d5748
ISSUE #2032-CLT: Entity History Endpoint (#25410)
* feat: added repository logic to list all versions (including latest) for a specific entity type

* feat: added list all versions for all the entity resources

* feat: moved endpoint to EntityResource

* feat: renamed endpoint to /history and methods to EntityHistory

* feat: ran java linting

* feat: remove v1 implementation left over code

* feat: fix failing tests

* feat: ran klinting

* feat: fix psql query

* feat: address PR comments

* feat: ran klinting

* feat: increase cache duration

* feat: address query edge cases
2026-01-21 06:52:23 +01:00
harshsoni2024
44740ad5c5
Fix: remove overrideLineage config from database service metadata pipeline (#25379)
* remove overrideLineage from db metadata pipeline

* Update generated TypeScript types

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2026-01-20 09:08:26 +05:30
Teddy
2aac0b29ad
ISSUE #2652 - Freshness TZ (#25261)
* feat: add freshness tz support

* feat: added localization to handle DST

* style: fix code formatting and variable names

* style: ran python linting

* style: ran python linting

* style: fix linting errors

* style: fix linting for GX based on version

* Fix: pass a string array to psql migration
2026-01-19 11:46:18 +01:00
Teddy
6cc7c24278
ISSUE #2681 - Add Missing test parameters in PSQL (#25323)
* fix(dq): psql migration for row insert test parameters

* fix(dq): use name and add trailing new line

* Fix description formatting in postDataMigrationSQLScript.sql

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-16 12:09:15 +01:00
Sriharsha Chintalapani
69ef1371bc
Rules library (#24748)
* Add DQ Rules Library

* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list

* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list

* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list

* Update generated TypeScript types

* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list

* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list

* Update generated TypeScript types

* Refactor tests to use toStrictEqual for string comparisons and improve consistency

- Updated various test files to replace `toBe` with `toStrictEqual` for string assertions in ImportStatus, SummaryCard, TabsLabel, and others.
- Enhanced regex tests to ensure accurate validation of entity names and tags.
- Added new translations for test platform warnings in en-us.json.
- Improved utility tests for alerts, authentication, CSV handling, and task messages to use `toEqual` for better clarity.

* Refactor TestDefinitionForm and TestDefinitionList components to use updated API methods and improve SQL expression handling

* Enhance TestDefinitionList component with permission checks for edit and delete actions, and update tests to reflect changes in permission handling

* Remove debug log from handleSubmit in TestDefinitionForm component

* Add permission loading state and enhance permission handling in TestDefinitionList component

* Update generated TypeScript types

* Update generated TypeScript types

* Update generated TypeScript types

* fix build failure

* Revert "Update generated TypeScript types"

This reverts commit 67b062216f.

* Enhance TestDefinitionForm and TestDefinitionList components with improved UI and pagination handling

* fix: update RulesLibrary tests and enhance TestDefinitionForm styling

* fix: Enhance TestDefinitionForm with error handling and improved UX

* fix: Update test definition handling and improve rendering in TestDefinitionList

* fix: Refactor TestDefinitionPermissions tests for improved permission checks and API context handling

* fix: Update system test definition retrieval to use findLast for improved accuracy

* feat: Add end-to-end tests for Rules Library and Test Definition Permissions

* fix: Update edit button visibility check to use beDisabled for better clarity

* fix: Refactor response handling in TestDefinitionPermissions tests for improved reliability

* move migrations execution order

* fix: remove existing columns

* style: remove migration extra line break

* chore: fix migration

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
Co-authored-by: TeddyCr <teddy.crepineau@gmail.com>
2026-01-14 08:12:30 +01:00
Pere Miquel Brull
1099379616
AI #200 - Add TRIGGER permission to application bots (#25113)
* AI #200 - Add TRIGGER permission to application bots

* Addressing feedback

Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>

---------

Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
2026-01-14 06:50:48 +01:00
Eugenio
c66d9eebf6
Tagging explanation (#24817)
* Added `appliedAt` field to `TagLabel`s

This is to track insertions to `tag_usage` with timestamps

* Capture and format recognition explanations in `TagAnalyzer`

This creates a function that builds an explanation of why something was scored the way it was.

* Refactor `TagProcessor`

* Capture results for the old-style `PIIProcessor`

* Move strings to constants

* Add `TagLabel.appliedBy` field

This change also patches the user's name into the tags declared in `JsonPatch` objects to populate the field

* Update typescript types

* Fix python tests

* Fix java tests

* Simplify setting tag's `appliedBy` using `EntityUpdater.updatingUser`

* Remove unnecessary f-string

* Moar fixes

* Move migrations to 1.11.5
2026-01-08 17:02:40 +01:00
Sriharsha Chintalapani
4c3f6dd1e3
Fix audit logs (#25127)
* Fix Audit Logs Migration; Add Improved UX for audit logs; Fix export async option

* Fix Audit Logs Migration; Add Improved UX for audit logs; Fix export async option

* Change UUID fields to type UUID from String in AuditLogs (#25119)

* Change UUID fields to type UUID from String

* Fix Row Mapper

* fix tests

* Reverted migrations to create and alter

* Revert "Reverted migrations to create and alter"

This reverts commit af71a454d7.

---------

Co-authored-by: Ram Narayan Balaji <81347100+yan-3005@users.noreply.github.com>
Co-authored-by: Ram Narayan Balaji <ramnarayanb3005@gmail.com>
2026-01-08 07:42:30 -08:00
Ajith Prasad
9dd364e207
Saml redirect Uri logic corrected (#24861)
* Saml redirect Uri logic corrected

* Added TCs for Saml AuthHandler

* Sidebar documentation improvement

* remove legacy SAML authenticator and merged it with generic authenticator

* remove saml_callback check

* Removed authority url from saml configuration

* Update generated TypeScript types

* Remove authority url from doc

* Added migration to remove saml authority url

* Added postgres migration fix

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-01-08 10:04:52 +05:30
Sriharsha Chintalapani
dca6256588
Audit logs (#23733)
* Add Audit Logs UI page

* Add Audit Logs UI page

* Update generated TypeScript types

* Address comments; Add more test coverage

* Update generated TypeScript types

* Fix gitar comments

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
2026-01-05 19:58:53 -08:00
Sriharsha Chintalapani
c62395b955
Fix #24578: Datamodels not visible if . in service name (#24779)
* Fix #24578: Datamodels not visible if . in service name

* Add migrations and tests

* Move migrations to 1.12.0
2025-12-27 10:00:26 -08:00
Bhanu Agrawal
71b23f1d24
Fix search percentile rank scoring (#24859)
* Fix search percentile rank scoring

* Added support for generic methods to merge search settings properties

* Added tests for search settings merge util

* Fixed Playwright test for Restore default search settings
2025-12-23 18:06:27 +00:00