OpenMetadata/bootstrap/sql/migrations/native/2.0.0/mysql/schemaChanges.sql

358 lines
17 KiB
MySQL

Task redesign (#25894) * Task Redesign: Add Task entity & tests * Task Redesign: Add Task entity & tests * Task Redesign: Add Permissions checks for Task APIs * Task UI changed to the new APIs * Migrate UI and APIs to new tasks system inlcuding suggestions * Add Suggestions integration * Activity Feed Refactor * ActivityFeed -> ActivityStream publisher * Activity Feed redesign * Activity Feed redesign, adding tests * Incident Manager update * Migrate Incidents to new tasks * Migrate Incidents to new tasks * Update generated TypeScript types * Update generated TypeScript types * feat(tasks): add domain-aware task cutover and workflow v2 migration * test(tasks): cover domain filters and task feed visibility flows * Address comments * Fix workflow tests to use new Task entity API and fix UserApprovalTaskV2 candidate transformation Migrated 9 WorkflowDefinitionResourceIT tests from legacy Feed/Thread API to the new Task entity API (UserApprovalTaskV2 creates Task entities, not Thread entities). Fixed a bug in UserApprovalTaskV2 where candidates were passed as raw EntityReferences instead of being transformed into users/teams FQN arrays for SetApprovalAssigneesImpl. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix tests * refactor: stabilize task entity workflows * refactor: finish task entity cutover and activity migration * refactor: migrate legacy thread feed during cutover * refactor: split legacy thread rename and archive migrations * Merge main; fix tests * Update generated TypeScript types * feat: advance task redesign through phase 2 * Merge main; fix tests * Update generated TypeScript types * Fix failing tests * Update generated TypeScript types * fininsh phase 6 of the design, configurable task forms * Update generated TypeScript types * Update generated TypeScript types * Fix linting * Address gitar comments * Address gitar comments * Fix build * Address giar comments * fix build * Add task custom forms * Fix tests * Address tests * Apply UI lint autofixes * Fix tess * Fix linter * Fix task patching * Fix tests * Fix playwright tests * fix java checkstyle * Add python sdk support for tasks, annoucements * Fix playwright tests * Fix playwright tests * Fix playwright tests * Fix python tests * Fix python tests * Fix linting workflows * fix pycheck * fix pycheck * Fix tests * Fix build * Address deviations from main and fix tests * Fix integration tests * Fix integration tests * Fix integration tests * Update generated TypeScript types * Fix Playwright tests * Fix Playwright tests * feat(incident): wire incident manager to task-first architecture (#27369) * feat(incident): wire incident manager to task-first architecture Connect the incident manager to the task redesign so it works end-to-end: resolve data persistence, backward transitions, reopen from resolved, and incident discovery via TCRS. * Update generated TypeScript types * refactor: single-query incident task lookup with parameterized statuses Replace two sequential queries (Open, InProgress) in getOrCreateIncident with one findByAboutAndTypeAndStatuses query using @BindList for status IN (...). 
--------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Fix Playwright tests * Update generated TypeScript types * Fix linter * Fix tests * Fix tests * Fix checkstyle * Fix tests * Fix checkstyle * Update FeedResourceIT.java * Update TableRepository.java * fix tests * Update ActivityFeedProvider.tsx * fix tests * fix tests * Address Task comments * Fix unit test * Fix the feed summary panel showing on landing page * Fix comment functionality * Fix pytests * Fix failing playwright tests * Fix test flakiness * Fix ui-checkstyle * Fix advanced search spec failure * Fix playwright tests Co-authored-by: Copilot <copilot@github.com> * Fix checkstyle * Fix the flaky tests Co-authored-by: Copilot <copilot@github.com> * fix checkstyle * Reduce the workflow polling * Update generated TypeScript types * skip failing tests Co-authored-by: Copilot <copilot@github.com> * Fix ui-checkstyle --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com> Co-authored-by: IceS2 <pablo.takara@getcollate.io> Co-authored-by: karanh37 <karanh37@gmail.com> Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com> Co-authored-by: Copilot <copilot@github.com>
2026-04-23 13:52:30 +00:00
-- Task System Redesign - OpenMetadata 2.0.0
-- This migration creates the new Task entity tables and related infrastructure
CREATE TABLE IF NOT EXISTS task_entity (
id varchar(36) NOT NULL,
json json NOT NULL,
fqnHash varchar(768) NOT NULL,
taskId varchar(20) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.taskId'))) STORED NOT NULL,
name varchar(256) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.name'))) STORED NOT NULL,
category varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.category'))) STORED NOT NULL,
type varchar(64) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.type'))) STORED NOT NULL,
status varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.status'))) STORED NOT NULL,
priority varchar(16) GENERATED ALWAYS AS (COALESCE(json_unquote(json_extract(`json`,_utf8mb4'$.priority')), 'Medium')) STORED,
createdAt bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.createdAt'))) STORED NOT NULL,
updatedAt bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.updatedAt'))) STORED NOT NULL,
deleted tinyint(1) GENERATED ALWAYS AS (json_extract(`json`,_utf8mb4'$.deleted')) STORED,
aboutFqnHash varchar(256) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.aboutFqnHash'))) STORED,
createdById varchar(36) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.createdById'))) STORED,
-- approvedById: added by "feat(dar): Granted lifecycle, filters, sort, and
-- self-service create policy" (#28044, 2026-05-16 14:35:15 +00:00).
approvedById varchar(36) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.approvedById'))) STORED,
PRIMARY KEY (id),
UNIQUE KEY uk_fqn_hash (fqnHash),
KEY idx_task_id (taskId),
KEY idx_status (status),
KEY idx_category (category),
KEY idx_type (type),
KEY idx_priority (priority),
KEY idx_created_at (createdAt),
KEY idx_updated_at (updatedAt),
KEY idx_deleted (deleted),
KEY idx_status_category (status, category),
KEY idx_about_fqn_hash (aboutFqnHash),
KEY idx_status_about (status, aboutFqnHash),
KEY idx_created_by_id (createdById),
KEY idx_created_by_category (createdById, category),
KEY idx_approved_by_id (approvedById)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
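-- Illustrative only (not executed by this migration): a lookup shaped to use
-- the composite idx_status_category key defined above. The literal values
-- 'Open' and 'DataAccess' are assumptions taken from the surrounding change
-- descriptions, not from this file.
--   SELECT id, taskId, name
--     FROM task_entity
--    WHERE status = 'Open'
--      AND category = 'DataAccess';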
-- For 2.0.0 environments that ran the CREATE TABLE above before the
-- approvedById generated column was added inline, attach it now. CREATE TABLE
-- IF NOT EXISTS is a no-op on those environments, so the column would never
-- appear otherwise. MySQL 8.0 supports neither `ADD COLUMN IF NOT EXISTS` nor
-- `ADD KEY IF NOT EXISTS` in ALTER TABLE (both are MariaDB extensions), so
-- guard both via information_schema.
SET @ddl = (
SELECT IF(
EXISTS (
SELECT 1
FROM information_schema.columns
WHERE table_schema = DATABASE()
AND table_name = 'task_entity'
AND column_name = 'approvedById'
),
'SELECT 1',
'ALTER TABLE task_entity ADD COLUMN approvedById varchar(36) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4''$.approvedById''))) STORED'
)
);
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SET @ddl = (
SELECT IF(
EXISTS (
SELECT 1
FROM information_schema.statistics
WHERE table_schema = DATABASE()
AND table_name = 'task_entity'
AND index_name = 'idx_approved_by_id'
),
'SELECT 1',
'ALTER TABLE task_entity ADD KEY idx_approved_by_id (approvedById)'
)
);
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
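-- Illustrative only (not executed by this migration): one way to confirm that
-- the two guarded statements above took effect. Both queries read standard
-- information_schema views and should each return one row on a migrated schema.
--   SELECT column_name, extra
--     FROM information_schema.columns
--    WHERE table_schema = DATABASE()
--      AND table_name = 'task_entity'
--      AND column_name = 'approvedById';
--   SELECT index_name, column_name
--     FROM information_schema.statistics
--    WHERE table_schema = DATABASE()
--      AND table_name = 'task_entity'
--      AND index_name = 'idx_approved_by_id';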
CREATE TABLE IF NOT EXISTS new_task_sequence (
id bigint NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO new_task_sequence (id) SELECT 0 WHERE NOT EXISTS (SELECT 1 FROM new_task_sequence);
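-- Illustrative only (not executed by this migration): a minimal sketch of how a
-- caller could allocate the next task number atomically from the single-row
-- counter above. That the application uses this exact pattern is an assumption.
-- LAST_INSERT_ID(expr) is standard MySQL and makes the incremented value
-- readable on the same connection without a second locking read.
--   UPDATE new_task_sequence SET id = LAST_INSERT_ID(id + 1);
--   SELECT LAST_INSERT_ID();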
-- =====================================================
-- ACTIVITY STREAM TABLE (Partitioned by time)
-- Lightweight, ephemeral activity notifications
-- NOT for audit/compliance - use entity version history
-- Partitions are managed dynamically by ActivityStreamPartitionManager
-- =====================================================
CREATE TABLE IF NOT EXISTS activity_stream (
id varchar(36) NOT NULL,
eventType varchar(64) NOT NULL,
entityType varchar(64) NOT NULL,
entityId varchar(36) NOT NULL,
entityFqnHash varchar(768) CHARACTER SET ascii COLLATE ascii_bin,
about varchar(2048),
aboutFqnHash varchar(768) CHARACTER SET ascii COLLATE ascii_bin,
actorId varchar(36) NOT NULL,
actorName varchar(256),
timestamp bigint NOT NULL,
summary varchar(500),
fieldName varchar(256),
oldValue text,
newValue text,
domains json,
json json NOT NULL,
PRIMARY KEY (id, timestamp),
KEY idx_activity_timestamp (timestamp),
KEY idx_activity_entity (entityType, entityId, timestamp),
KEY idx_activity_actor (actorId, timestamp),
KEY idx_activity_event_type (eventType, timestamp),
KEY idx_activity_entity_fqn (entityFqnHash, timestamp),
KEY idx_activity_about (aboutFqnHash, timestamp)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
PARTITION BY RANGE (timestamp) (
-- Catch-all partition - ActivityStreamPartitionManager will reorganize this
-- by splitting it into monthly partitions as needed
PARTITION p_max VALUES LESS THAN MAXVALUE
);
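-- Illustrative only (not executed by this migration): a sketch of the kind of
-- statement that splits the catch-all partition into a monthly one. The
-- partition name p_2026_05 is hypothetical, and the boundary assumes that
-- `timestamp` holds epoch milliseconds (1780272000000 is 2026-06-01 00:00:00
-- UTC). The statements ActivityStreamPartitionManager actually issues may differ.
--   ALTER TABLE activity_stream REORGANIZE PARTITION p_max INTO (
--     PARTITION p_2026_05 VALUES LESS THAN (1780272000000),
--     PARTITION p_max VALUES LESS THAN MAXVALUE
--   );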
-- Activity stream configuration per domain
CREATE TABLE IF NOT EXISTS activity_stream_config (
id varchar(36) NOT NULL,
json json NOT NULL,
scope varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.scope'))) STORED NOT NULL,
domainId varchar(36) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.scopeReference.id'))) STORED,
enabled tinyint(1) GENERATED ALWAYS AS (json_extract(`json`,_utf8mb4'$.enabled')) STORED,
retentionDays int GENERATED ALWAYS AS (json_extract(`json`,_utf8mb4'$.retentionDays')) STORED,
PRIMARY KEY (id),
UNIQUE KEY uk_domain_config (domainId),
KEY idx_scope (scope),
KEY idx_enabled (enabled)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- =====================================================
-- ANNOUNCEMENT ENTITY TABLE
-- Standalone entity for asset announcements (migrated from thread_entity)
-- =====================================================
CREATE TABLE IF NOT EXISTS announcement_entity (
id varchar(36) NOT NULL,
json json NOT NULL,
fqnHash varchar(768) NOT NULL,
name varchar(256) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.name'))) STORED NOT NULL,
entityLink varchar(512) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.entityLink'))) STORED,
status varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.status'))) STORED,
startTime bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.startTime'))) STORED,
endTime bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.endTime'))) STORED,
createdBy varchar(256) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.createdBy'))) STORED,
createdAt bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.createdAt'))) STORED,
updatedAt bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.updatedAt'))) STORED,
deleted tinyint(1) GENERATED ALWAYS AS (json_extract(`json`,_utf8mb4'$.deleted')) STORED,
PRIMARY KEY (id),
UNIQUE KEY uk_announcement_fqn_hash (fqnHash),
KEY idx_announcement_status (status),
KEY idx_announcement_entity_link (entityLink),
KEY idx_announcement_start_time (startTime),
KEY idx_announcement_end_time (endTime),
KEY idx_announcement_deleted (deleted)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- =====================================================
-- TASK FORM SCHEMA ENTITY TABLE
-- Stores form schemas for different task types
-- =====================================================
CREATE TABLE IF NOT EXISTS task_form_schema_entity (
id varchar(36) NOT NULL,
json json NOT NULL,
fqnHash varchar(768) NOT NULL,
name varchar(256) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.name'))) STORED NOT NULL,
taskType varchar(64) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.taskType'))) STORED,
taskCategory varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.taskCategory'))) STORED,
updatedAt bigint GENERATED ALWAYS AS (json_unquote(json_extract(`json`,_utf8mb4'$.updatedAt'))) STORED,
deleted tinyint(1) GENERATED ALWAYS AS (json_extract(`json`,_utf8mb4'$.deleted')) STORED,
PRIMARY KEY (id),
UNIQUE KEY uk_task_form_schema_fqn_hash (fqnHash),
KEY idx_task_form_schema_name (name),
KEY idx_task_form_schema_task_type (taskType),
KEY idx_task_form_schema_deleted (deleted)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Context center (#27558)
* Add Context Center: Migrate Knowledge Center , Images/ PDFs document support
* Add Context Center: Migrate Knowledge Center , Images/ PDFs document support
* Address PR #27558 review comments
- KnowledgePageRepository: null-safe pageType in getHierarchyWithSearch and getHierarchyWithSearchForActivePage so the /search/hierarchy endpoint no longer NPEs when the pageType query param is omitted. The ES/OS client helpers already skip the pageType term when the value is null or empty, so this is a pure null-guard.
- ContextFileResource.uploadFile: when a failure happens after the ContextFileContent row is created (e.g. inside extractionService.submit), the cleanup path now hard-deletes that content row so the DB is not left with an orphaned record.
- ContextFileResource: replace the raw Content-Disposition string with a buildContentDisposition helper that emits both the legacy quoted filename= and the RFC 5987 filename*=UTF-8'' parameter with percent-encoded bytes, so international filenames round-trip while staying header-injection safe. sanitizeFileName also falls back to "download" on null/blank input.
- ContextFileResourceTest: new cases for sanitizeFileName null/blank fallbacks and for buildContentDisposition ASCII/unicode/space/injection behaviour (18 tests, all passing).
* Address copilot review comments on PR #27558
- AssetRepository.getByFqnPrefix: swap arguments so (assetType, fqnPrefix) matches the DAO signature; previous ordering always missed the index.
- FolderResource / ContextFileResource getEntitySpecificOperations: return List.of() instead of null so callers iterating the returned list cannot NPE.
- SearchUtils.getPageHierarchy: replace UUID.fromString with a parseUuid helper that returns null for missing/malformed values and logs a warning instead of failing the whole hierarchy response.
- DaoListFilter: qualify the pageType column with the caller-provided tableName, rename getArticleCondition to getPageTypeCondition (legacy no-arg method kept as @Deprecated wrapper for compatibility).
- Elastic/OpenSearch client processPageHierarchyHits: replace the per-hit getChildrenCountForPage search (N+1) with a single pass over the batch that derives childrenCount from pages whose parent is in the same result set. Also drops the now-unused helper and its throws clause.
- openmetadata-sdk/pom.xml: mark JWT, JAX-RS client, Apache HttpClient, jakarta.json, parsson, and JUnit Jupiter as <optional>true</optional> so they don't leak into SDK consumers that only use the core client.
- InMemoryAssetService: use the shared AsyncService executor for upload /read/delete instead of the JVM common ForkJoinPool.
- sample-pricing.xlsx: replace the plain-text placeholder with a real minimal XLSX workbook so detection-based and extraction-based code paths see a valid Microsoft Excel 2007+ file.
* Use one filters aggregation for page hierarchy childrenCount
Follow-up to b8458e2868. The previous fix derived childrenCount from pages whose parent appeared in the same batch; that worked for listPageHierarchyForActivePage (which fetches all depths) but always returned 0 on the plain listPageHierarchy path (which only fetches one depth), so top-level listings lost the count semantically.
Replace with a single `filters` aggregation keyed by page id: each named bucket matches descendants via a fullyQualifiedName prefix query against the page's FQN. That gives accurate direct-descendant counts for every returned page in one aggregation round-trip, still O(1) additional search requests regardless of batch size.
* Add allowedFields entries for contextFile, folder, page
Fixes SearchSettingsHandlerTest.testEveryAssetTypeHasCorrespondingAllowedFields.
searchSettings.json already had assetTypeConfigurations for contextFile, folder, and page but no matching allowedFields entries, so the test that asserts every assetType has a corresponding allowedFields block failed with 'Asset type contextFile has no corresponding allowedFields entry'.
Adds the three missing blocks with the fields that each index actually exposes: name / displayName (with .keyword and .ngram variants), description, fqn, fqnParts, tags/tier/domains/dataProducts, plus entity-specific fields (fileType/contentType/extractedText for contextFile, parent.displayName for folder/page, pageType for page).
* Fix ui checkstyle
* Fix Java checkstyle
* Address PR #27558 copilot review round 2
- ES/OS populateChildrenCounts: add fqnDepth == parentDepth + 1 to the per-page filter so childrenCount is direct children only, matching the field name and the UI's isLeaf check semantics. Previously matched all descendants.
- ES/OS buildPageNestedSearchHierarchy: filter out hits with a null id before Collectors.toMap, which would otherwise NPE when SearchUtils drops a malformed UUID.
- SearchUtils.getPageHierarchy: wrap PageType.fromValue in a parsePageType helper that logs and returns null on unknown values, so a single bad hit can no longer break the whole hierarchy response.
- TestSuiteBootstrap.setupMinIO: pin minio/minio to RELEASE.2024-01-16T16-07-38Z instead of :latest so a newly-published image cannot break integration tests without a code change.
- createContextFile.json: rewrite the assetId description to be provider agnostic (S3 / Azure Blob / in-memory / no-op) and flag it as the legacy path, preferring headContentId / ContextFileContent.
* Update generated TypeScript types
* Address PR #27558 copilot review round 3
- bootstrap/sql/migrations/native/2.0.0/mysql/schemaChanges.sql:
  - asset_entity: add PRIMARY KEY (id); mark all generated columns STORED for consistency with the other drive/knowledge tables in the same migration; compute deleted as a real boolean via IFNULL(JSON_EXTRACT(json, '$.deleted'), FALSE) so the boolean index behaves correctly.
  - knowledge_center: mark name, updatedAt, updatedBy, pageType as STORED and apply the same deleted expression so the existing indexes on name and (fqnHash, deleted) are reliable on fresh installs.
  - drive_folder / context_file / context_file_content: update the deleted generated column to use the same boolean-safe expression.
- ElasticSearch/OpenSearch hierarchy search: add an explicit sort on fullyQualifiedName ASC with _id ASC as tiebreaker so from/size pagination is deterministic and cannot skip/duplicate pages between requests.
* Fix UI checkstyle
* Address PR #27558 copilot review round 4
- createPage.json: rewrite the field descriptions for name, displayName, owners, reviewers, and entityStatus. They were copy/pasted from other schemas ('query', 'tag') and were misleading in generated docs and clients.
- NoOpAssetService.generateDownloadUrlWithExpiry: return asset.getUrl() instead of a synthetic 'https://cdn.example.com/...' URL. The old behaviour let clients attempt downloads that would never resolve when object storage was disabled; returning the asset's own (empty) URL surfaces the misconfiguration cleanly.
- AzureAssetService: normalize the prefix path the same way S3 does. Previously a null/blank prefix produced the literal 'null/' prefix, writing blobs under the wrong key. New formatPrefix returns "" for null/blank and ensures exactly one trailing '/' for a real prefix.
- AssetRepository.getByFQN: treat null *or* empty list as 'not found', matching getByFqnPrefix.
Callers previously received an empty list silently when the DAO returned [] instead of a 404.
* Update generated TypeScript types
* Fix UI checkstyle
* Address PR #27558 copilot review round 5
- AssetDAO.update / AssetRepository.update: switch the UPDATE target from fqnHash to id. Two assets can share the same fullyQualifiedName (e.g. successive revisions of the same context file), so the old SQL could silently update sibling rows.
- ContextFileExtractionService: run the extraction pipeline on a dedicated fixed thread pool instead of AsyncService.getExecutorService. process() blocks on assetService.read(...).join(), and S3/Azure reads are themselves scheduled on AsyncService; sharing the same bounded pool risks starving those reads (and deadlocking) once every thread is busy running extractions.
- postgres/schemaChanges.sql: wrap the generated deleted column in COALESCE((json ->> 'deleted')::boolean, false) (and the asset_entity CAST variant) so an absent 'deleted' key is stored as FALSE, not NULL. Otherwise "non-deleted" filters based on the boolean index drop rows silently. Matches the MySQL IFNULL(..., FALSE) side of the migration.
- ContextFileUploadSupport.sanitizeEntityName: treat null/blank input as 'file' instead of NPE-ing on replaceAll. Multipart uploads can arrive without filename metadata; the upload should still succeed with a stable generated name.
* Remove macOS-only @rollup/rollup-darwin-arm64 dev dep
I pinned this during local troubleshooting to get a Vite dev server running on macOS (rollup's optional native binary was missing). CI runs on Linux, where yarn install --frozen-lockfile refuses the package ('The platform \"linux\" is incompatible with this module'), which broke license-header, lint-src, lint-playwright, i18n-sync, app-docs, and ui-coverage-tests for PR #27558. rollup re-resolves its native binary per platform; there's no reason to pin the darwin one.
Remove it from package.json and drop the matching '@rollup/rollup-darwin-arm64@^4.60.2' block from yarn.lock.
* Re-declare optional SDK test deps on integration-tests classpath
KnowledgeCenterIT failed in CI with 'java.lang.NoClassDefFoundError: org/glassfish/jersey/apache/connector/ApacheConnectorProvider' after I marked the JAX-RS client stack in openmetadata-sdk as <optional>true</optional> during review round 2. That change stops the deps from leaking to every SDK consumer, but integration-tests actually uses org.openmetadata.sdk.test.util.RestClient, so the optional deps must be re-declared on its own classpath.
Adds jakarta.ws.rs-api, jersey-client, jersey-apache-connector, httpclient, jakarta.json-api, and parsson to openmetadata-integration-tests/pom.xml as <scope>test</scope>.
* Fix IT failures from CI integration-tests-mysql-elasticsearch
1. MySQL deleted column: revert the IFNULL wrapper to plain (json -> '$.deleted'). My earlier IFNULL(JSON_EXTRACT(json, '$.deleted'), FALSE) hit 'Incorrect integer value: false for column deleted' on fresh installs because MySQL cannot coerce the resulting JSON scalar into TINYINT(1) when the column is STORED. The bare '(json -> '$.deleted')' form is what other OM tables already use, and MySQL converts JSON true/false to 1/0 directly for the BOOLEAN column. STORED + PRIMARY KEY stay in place.
2. DriveFileUploadIT: raise the four short atMost(5s) awaits to 20s with explicit pollDelay(ZERO) + pollInterval(200ms). K8sOMJobOperatorIT sets a global Awaitility pollInterval of 5s at class setup; any subsequent test with atMost <= 5s hits 'Timeout must be greater than the poll delay'. Overriding the per-call poll settings insulates these asserts from the global leak.
* Document SDK test-utility optional deps
In review round 2 we marked jersey-client, jersey-apache-connector, jakarta.ws.rs-api, httpclient, jakarta.json-api, parsson, java-jwt, and junit-jupiter-api as <optional>true</optional> on openmetadata-sdk so that core SDK consumers don't inherit a heavy JAX-RS + JUnit stack. openmetadata-integration-tests hit this immediately with NoClassDefFoundError from RestClient; its own pom now re-declares the deps.
Add a "Test utilities" section to the SDK README that lists the optional deps downstream test-utility consumers must re-declare (with the concrete <scope>test</scope> XML snippet) and explains the error they'd otherwise see.
* NoOpAssetService: never return null from generateDownloadUrlWithExpiry
In review round 4 I changed this method to return asset.getUrl() when the asset is non-null. But Asset.url is optional in the schema, so asset.getUrl() itself can be null, which breaks the implied "never returns null" contract downstream callers rely on (AttachmentResource only null-checks defensively).
Normalize null and blank URLs to an empty string so the method's non-null, non-blank contract holds even when storage is disabled and the asset was never populated with a URL.
* AssetServiceFactory: swap to NoOp when re-initialized with storage off
init(...) previously only assigned NoOpAssetService when instance was null. On a re-init with object storage toggled off (config reload, test teardown, etc.), the previously wired S3/Azure/InMemory provider stayed live and kept serving real IO against a backend the operator thought was disabled.
Replace the instance with a fresh NoOp when storage is disabled unless the instance is already a NoOp (idempotent on repeated disabled inits).
* Type create-request domains arrays as fullyQualifiedEntityName
The three new KC/Drive create schemas (createFolder, createContextFile, createPage) had domains as an array of unconstrained strings.
The rest of the OM API models domain references as FQNs, and the shared basic.json#/definitions/fullyQualifiedEntityName is the convention for this. Point all three items refs at fullyQualifiedEntityName so generated clients see a consistent FQN type and requests get validated for non-empty length/format rather than any string.
* Update generated TypeScript types
* Address PR #27558 copilot review 4144965142
- ContextFileExtractionService: switch the default thread pool to a static final DEFAULT_EXECUTOR, so every production instance of the service reuses the same pool instead of leaking a fresh fixed pool per construction (tests especially create multiple instances). Threads remain daemons, so the pool never blocks JVM shutdown.
- ObjectDeleteQueueService: when queueCapacity is 0, use a SynchronousQueue so "reject-if-all-workers-busy, no buffering" holds. Previous Math.max(1, queueCapacity) silently allocated a 1-slot ArrayBlockingQueue, contradicting the caller's stated capacity and potentially buffering one task past the semaphore's accounting.
Not fixing:
- SearchUtils @Slf4j 'LOG' vs 'log'. OM's openmetadata-service/lombok.config sets 'lombok.log.fieldName = LOG', so @Slf4j correctly generates 'LOG' for every class in this module. The reviewer's concern only applies to projects without that directive. Verified clean compile.
* Address PR #27558 copilot review 4144917449
- knowledgeCenterTags.json: change mutuallyExclusive from the string "false" to the JSON boolean false. The Classification schema declares this as `"type": "boolean"`; jackson's lenient string->boolean coercion masked it until now, but strict validators would reject and the other OM bootstrap tag files that use the correct boolean (piiTagsWithRecognizers.json) model what this should look like.
- ContextFileExtractionService.process: guard the updateContent updater with the same head-content check already used in updateFile.
Previously, if headContentId flipped between the initial check and the status writes, updateFile would no-op while updateContent still marked the now-stale content "Analyzing", leaving it stuck once the later early-return fires.
- AzureAssetService.upload: stream the InputStream straight to the blob using the known asset.getSize() instead of reading the whole payload into a byte[] via IOUtils.toByteArray. Matches the S3 streaming behaviour and avoids full-file heap pressure / OOM risk on larger files. Buffered fallback retained when size is unknown.
- Size fields modeled as integer: flip fileSize / size on createContextFile.json, contextFile.json, asset.json, createAsset.json, and contextFileContent.json from "type": "number" to "type": "integer" with "format": "int64" and "minimum": 0. Byte counts are inherently whole numbers; floating point loses precision above 2^53 and makes validation murky. Update the (double) call sites in ContextFileResource, ContextFileUploadSupport, and AttachmentResource to match.
Not fixing:
- ContextEntityPromptService "unused Authorizer import": false positive, the class uses it in the constructor.
- NoOpAssetService.generateDownloadUrlWithExpiry null return: already fixed earlier in commit a4a2dcc91d (returns "" when url is null/blank).
* AssetService.read: run inline instead of hopping through AsyncService
Every caller of AssetService.read(...) immediately .join()s on the returned future:
- ContextFileExtractionService.process reads + extracts
- ContextFileResource.downloadFile reads + streams back
- AttachmentResource.serveAsset reads + streams back
- QueuedDeleteAssetService just delegates
None of them exploit the async nature, but the S3/Azure/InMemory implementations all wrapped the blocking fetch in AsyncService.executeAsync or CompletableFuture.supplyAsync on a bounded pool.
That created a starvation path when any caller thread was already running on AsyncService (or could monopolize it under load): join() would block the caller while the submitted read task fought for a free worker.
Switch S3, Azure, and InMemory read() to execute on the caller's thread and return CompletableFuture.completedFuture(...). Interface is unchanged so existing .join() callers keep working; the extra thread hop and the potential for AsyncService starvation are both gone. Combined with the dedicated context-file-extraction pool, the extraction pipeline no longer touches AsyncService for any asset-read step.
* Address PR #27558 copilot review 4151211562
- FolderIndex / ContextFileIndex: stop re-setting entityType, deleted, owners, totalVotes inside buildSearchIndexDocInternal. Those common fields are populated by populateCommonFields in the SearchIndex template method (Phase 1) before Phase 3 calls the entity-specific internal builder, so the explicit puts were redundant and silently overrode the template output. Aligns with PageIndex convention and updates the unit tests to assert the internal builder sets only entity-specific fields.
- ContextFileTextExtractor: bound the Tika BodyContentHandler at MAX_CANONICAL_TEXT_LENGTH instead of passing -1 (unbounded) so a pathological image cannot drive OCR to accumulate arbitrary output on the heap.
- ContextFileExtractionService: replace the unbounded Executors.newFixedThreadPool backing queue with a ThreadPoolExecutor using an ArrayBlockingQueue + AbortPolicy. Without a bounded queue the RejectedExecutionException handling in submit(...) was dead code; with it, an overloaded server surfaces a "retry later" failure status instead of silently accumulating work.
- S3AssetService / AssetService / AssetServiceFactory / QueuedDeleteAssetService: make AssetService extend AutoCloseable with a default no-op, override close() in S3AssetService to release the S3Client and S3Presigner connection pools, and register a shutdown hook in AssetServiceFactory that closes the current provider on JVM exit (and on re-init when the provider changes).
- bootstrap 2.0.0 MySQL schemaChanges: change the deleted generated column from (json -> '$.deleted') to (JSON_EXTRACT(json, '$.deleted') IS TRUE) so rows where the JSON key is absent resolve to FALSE instead of NULL. Avoids filter misses on the composite (fqnHash, deleted) index.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix Java checkstyle
* Fix integration test compile + S3 generateDownloadURL
ContextFileIT / DriveFileUploadIT compile failures came from the fileSize schema switch to integer/int64; the generated setter/getter is now Integer. Replace the double literals with ints and the assertEquals(double, ...) sites with intValue() so the (int, int) overload resolves unambiguously.
Also override S3AssetService.generateDownloadURL to return a short-lived presigned URL (mirroring AzureAssetService) instead of inheriting the default, which would return the raw S3 key from asset.url. Addresses review 4151282021.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Revert MySQL deleted column back to bare json -> expression
The JSON_EXTRACT(...) IS TRUE form broke integration tests: GET after create started returning 404, consistent with MySQL evaluating the IS TRUE predicate against the JSON scalar in a way that stored 1 instead of 0 for freshly-created rows (deleted=false). Restoring the bare (json -> '$.deleted') expression used pre-review. Rows with the key missing will store NULL on the generated column, which is a theoretical concern the review flagged but does not affect current code paths (all inserts write json.deleted explicitly).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix Transi18next import path in KnowledgeCenter components
Two KnowledgeCenter files imported Transi18next from 'utils/CommonUtils', which is where Collate's UI re-exports it from. OpenMetadata core exports Transi18next from 'utils/i18next/LocalUtil' (same path every other core file uses). The Collate-style import broke the production Vite/Rollup build.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Harden ContextFileIT.testFileAppearsInSearch against async indexing
The test used a fixed Thread.sleep(2000) then a single assertEquals on the status code. That was flaky two ways: ES indexing is async and the 2s window is not always enough, and on a fresh cluster the context_file_search_index itself may not exist yet at first query (yielding 500).
Replace with an await() loop that polls every 200ms for up to 30s and asserts both status==200 AND that the newly-created file's UUID appears in the response. Matches the assertSearchContainsFile helper in DriveFileUploadIT. Also URL-encode the namespaced query string so the uniqueName does not break the query parsing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Make playwright editor shortcuts platform-aware
The SHORTCUTS constant in playwright/constant/KnowledgeCenter.constant.ts hard-coded "Meta+b" / "Meta+z" / etc. On macOS Meta is Cmd and those shortcuts trigger bold / undo / copy as expected, but on the Linux CI runners Meta is the Super (Windows) key, so every ProseMirror formatting and history test just pressed Super+b, which does nothing, and the test then fails waiting for the <strong>…</strong> element (or for the undone text to disappear).
Detect the runner platform and use Meta on macOS, Control everywhere else, matching the same pattern in src/constants/KnowledgeCenter.constant.ts.
Unblocks the 6 KnowledgeCenterTextEditor failures across Admin / Data Consumer / Data Steward roles (Text Formatting + Undo/Redo). Slash commands keep passing because they don't depend on modifier keys.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Run prettier on DateTimeUtils.ts
CI's lint-src job fails because ESLint+Prettier --fix produces a non-empty diff against the committed tree. Local prettier pass trimmed the indentation and added a trailing comma in the imports block. No behavioral change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix Knowledge Page entity-link + DAO filter regressions from the port
Downloaded the failing playwright traces from the PR's postgres e2e run and walked each one. Three distinct bugs, all present because the Collate-side overrides (overrides/EntityUtilClassCollate.ts and the DaoExtension.KnowledgeExtensionDAO custom SQL) were not carried over into OpenMetadata core when KnowledgeCenter was merged up.
1) CollectionDAO.KnowledgePageDAO: override listCount / listBefore / listAfter (plus helper SQL queries) so that `GET /v1/knowledgeCenter?entityId=X&entityType=topic` actually INNER JOINs entity_relationship and returns only pages whose relatedEntities contains the target entity. Without this the base EntityDAO ignored entityId/entityType entirely and returned every page, which is why the "Knowledge Articles" widget on a data asset page showed the 15 fixture articles instead of the one just attached, and why updateDataAsset timed out waiting for the linked article. Uses OWNS relation for user/team filters (same semantics Collate uses) and HAS for every other entity type.
2) EntityUtilClassBase + EntityUtils.getEntityLinkFromType: add EntityType.KNOWLEDGE_PAGE cases that route to getKnowledgePagePath.
Before this, mention notifications for Knowledge Pages fell through to the default `/table/<fqn>` branch (confirmed in the captured page-snapshot: the mention link pointed at `/table/Article_eEqrWeeU`), which 404'd on the Table API and rendered an error page, so the entity-header-display-name textarea never appeared and the User Mentions test timed out. Search results on Explore had the same problem, rendering every Knowledge Page result card with href="/".
3) EntityUtilClassBase.getEntityByFqn / ENTITY_PATCH_API_MAP / getResourceEntityFromEntityType: handle KNOWLEDGE_PAGE end-to-end so the detail-page fetch, patches, and policy lookups all route through the knowledgeCenter REST API rather than falling back to the generic entity utilities (which don't know about the 'page' entity type).
Verified against the real trace artifacts from CI run 24790718035:
- shard 3 Knowledge Center page test: widget shows 10 unrelated "Article_*" fixture items instead of the created one → root cause is the missing DAO JOIN (#1).
- shard 3 User Mentions test: notification link is /table/, not /knowledge-center/ (#2).
- shard 3 Reviewer Workflow: data consumer's knowledge-center goto renders "No data available" because getEntityByFqn fell back to a table fetch for a page FQN (#3).
- shard 5 ExplorePageRightPanel_KnowledgeCenter (22 failures): search result card links are "/explore/" (empty), same root cause as (#2) inside getEntityLinkFromType default branch.
Compiles: mvn -pl openmetadata-service -q -DskipTests compile passes; tsc --noEmit reports no new errors in the touched files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Address remaining PR #27558 review feedback
Seven actionable fixes drawn from the still-open review threads; the rest of the open threads in copilot's bot reviews are either already addressed in earlier commits or stale against the current code and are being resolved on the review UI alongside this commit.
- AssetRepository.getByFQN: the LOG.error message said "asset with id" but was printing the FQN. Relabel to "asset with FQN" for accurate troubleshooting (thread #42).
- KnowledgePageMapper.createToEntity: stop mutating the inbound CreatePage by calling create.withRelatedEntities(...). Build the effective list as a local variable and pass it to copy(...). Prevents the Organization fallback from leaking into the caller's request object, which is surprising when the request is re-used or logged (thread #43).
- FolderIndex: default childrenCount to 0 when the entity hasn't yet had its children recomputed (e.g. a freshly created folder). Prevents the numeric field from being indexed as missing, which broke range and sort queries that assume it is always present (thread #46).
- NoOpAssetService and InMemoryAssetService: override generateDownloadURL to delegate to generateDownloadUrlWithExpiry, matching S3/Azure. Without this, callers using the non-expiry API got asset.getUrl() (often empty for these providers), yielding broken download links (threads #39, #45).
- ObjectDeleteQueueService: register a JVM shutdown hook in the singleton's initializer that calls stop(). The service already implements Dropwizard Managed, but nothing currently wires it into the application lifecycle, so non-daemon delete-worker threads were at risk of keeping the JVM alive after ungraceful termination. The hook is a belt-and-suspenders fallback to the Managed path (threads #52, #53).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add java-checkstyle skill for Claude + Codex agents
CI keeps surfacing "Java checkstyle failed — please run mvn spotless:apply" comments on PRs (including this branch). CLAUDE.md and AGENTS.md already mentioned the command, but a one-line prose note in the middle of each file wasn't enough to make it a reliable habit.
This commit:
- Adds a dedicated invocable skill at .claude/skills/java-checkstyle/SKILL.md (for the Claude Code harness) and a mirror at .agents/skills/java-checkstyle/SKILL.md (for Codex-style agents). Both describe the same procedure: when / why to run spotless, the `-pl <module>` scoping option, the verify-only `spotless:check` form, the expected diff shape, and the rule to never hand-edit formatting around a plugin error.
- Promotes the existing one-liners in CLAUDE.md and AGENTS.md to explicit "run before finishing any Java task" instructions, pointing at the skill so agents have a reusable procedure to invoke rather than improvising.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Harden AttachmentResource upload/download against three regressions
Carried over from the latest AttachmentResource review. Three issues:
1. Content-Disposition header injection (security): downloadAsset() built the header by direct string interpolation of asset.getFileName(). A filename containing double-quotes or CRLF could inject arbitrary HTTP headers. ContextFileResource already has a sanitize + RFC-5987 encode helper; rather than duplicate it, promote ContextFileUploadSupport.sanitizeFileName / buildContentDisposition to public, delete the duplicates from ContextFileResource (now delegators), and reuse the shared helpers from AttachmentResource.
2. Unbounded upload buffering (performance / DoS): createAssetFromUpload read the full multipart body into a byte[] via IOUtils.toByteArray before checking against MAX_FILE_SIZE. An attacker could send an arbitrarily large body and exhaust heap before the validation ran. Replace with ContextFileUploadSupport.bufferUpload(), which streams to a bounded temp file and throws MaxFileSizeExceededException the moment the configured limit is passed; translate that into the same AttachmentException size-validation error the previous code raised.
Promoted BufferedUpload and MaxFileSizeExceededException to public so the attachments package can consume them.
3. Startup NPE when objectStorage is null (bug): initialize() called config.getObjectStorage().getMaxFileSize() without a null guard, so a deployment that doesn't configure object storage would NPE on server start. Added the same guard ContextFileResource.initialize() already uses, gave MAX_FILE_SIZE a safe 5 MiB default, and also null-guarded the S3-configuration branch of the CDN URL lookup so a pure-Azure or pure-NoOp setup doesn't fall off the end of the ternary.
Ran mvn spotless:apply (picks up formatting-only changes in CollectionDAO.java and FolderIndex.java as a side effect of the shared helper additions).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add ui-checkstyle skill + fix residual import-order drift
CI's UI Checkstyle workflow has three per-area jobs (lint-src, lint-playwright, lint-core-components) that reformat the files changed in the PR and fail if the reformat produces a diff. CLAUDE.md and AGENTS.md didn't previously document this flow, so re-running the fix was a guessing game; the two lint-core-components and lint-playwright failures on this branch came from stale import order left over from the main→context_center merge.
This commit:
- Adds a dedicated invocable skill at .claude/skills/ui-checkstyle/SKILL.md (Claude Code harness) and a mirror at .agents/skills/ui-checkstyle/SKILL.md (Codex-style agents). Both describe the exact three-command sequence CI runs (organize-imports-cli → eslint --fix → prettier --write), the per-area file scoping, the `--check` dry-run mode, and the rule that organize-imports must run BEFORE prettier (otherwise the indentation / trailing-comma round-trip leaves a dirty diff).
- Promotes the existing one-liner in CLAUDE.md and AGENTS.md to an explicit "run before finishing any UI task" instruction that points at the skill.
- Fixes two residual import-order drifts (KnowledgePagesHierarchy.tsx, EntityUtilClassBase.ts) surfaced by running the skill's sequence locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix UI checkstyle on EntityUtilClassBase.ts ESLint --fix inserted a blank line between the KNOWLEDGE_PAGE guard and the fallback return in getEntityByFqn. Committing the formatted version. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix ContextFileIT.testFileAppearsInSearch flaky 500 from query_string parsing The previous polling search used the namespaced unique name as a free-text q= argument. The namespace prefix contains '-' which the ES 9.x query_string parser treats as a NOT operator, producing a deterministic 500 across the full 30s polling window even when the document was indexed. Switch to the direct get-by-id endpoint (/v1/search/get/{index}/doc/{id}), which performs a real-time ES GET with no query_string parsing and no analyzer involvement — the most reliable signal that the document was indexed. Bump the timeout to 60s and capture the response body on any non-200 so future regressions surface the real ES error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix knowledge center icon * update knowledge center to context center Co-authored-by: Copilot <copilot@github.com> * Revert "update knowledge center to context center" This reverts commit f0cca5fd65f31285b69ecc9c3afa7624eee9e055. * Fix UI checkstyle: sort tag*-related imports in SearchClassBase Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix Jest coverage failures in KnowledgeCenter Layout and right panel KnowledgeCenterLayout was importing i18n directly from LocalUtil, but the global setupTests mock for that module only exposes t/on. Switch to the useTranslation() hook so it picks up the react-i18next mock that already provides i18n.dir(), matching how LeftSidebar and RichTextEditor use the direction. 
EntityRightPanelClassBase.getKnowLedgeArticlesWidget now returns the KnowledgePages component instead of null. Update the corresponding test case to assert the new return value. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix playwright tests and bugs Co-authored-by: Copilot <copilot@github.com> * Fix checkstyle * Fix /knowledgeCenter/search/hierarchy 500 by removing _id sort ES 9.x and OpenSearch 3.x reject sorts on the _id field by default (indices.id_field_data.enabled is false), causing every call to listPageHierarchy{,ForActivePage} to fail the search_phase_execution_exception "all shards failed" we see in the screenshot. The _id sort was added in 4a75852a7e as a tiebreaker for from/size pagination, but fullyQualifiedName is already a keyword field with doc_values and is unique per page (name is unique within a parent's children) — so no tiebreaker is needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Cascade hard-delete to descendant pages in search index KnowledgeCenter pages are nested via FQN (parent.fqn -> parent.fqn.child), not via a parent.id field on the child doc. The default deleteOrUpdateChildren case for entity type "page" uses page.id field matching, which doesn't exist on child page docs — so a recursive hard-delete on the parent removed the parent from search but left every descendant orphaned in the index. Stale docs only disappeared on a full reindex. This logic was overridden in the collate fork's SearchRepositoryExt; it was lost during the migration when the override class was removed. Fold the override into the base SearchRepository as a Page-specific case that calls deleteEntityByFQNPrefix, which deletes by fullyQualifiedName.keyword prefix match — covering every descendant. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Add page/folder/contextFile/securityService to SearchIndexingApp picker The Search Indexing Application's "Entities" picker shows "No data" when typing "Page" because the enum in src/utils/ApplicationSchemas/SearchIndexingApplication.json does not include the Knowledge Center / Drive entity types added on this branch. The collate fork carried these in SearchIndexingApplication-collate.json (included page); folder, contextFile and securityService are new on this branch and never made it into the picker enum during the migration. Without them in the enum, users cannot select these entity types for targeted reindex, even though every other reindex code path supports them. src/jsons/applicationSchemas/* is generated by parseSchemas.js from src/utils/ApplicationSchemas/* at build time and is gitignored, so only the source schema is updated here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Restore live index settings on per-entity distributed-promote path DefaultRecreateHandler exposes two finalization paths: - finalizeReindex(...) — centralized end-of-job promotion. Calls applyLiveServingSettings + maybeForceMerge before the alias swap, reverting the bulk overrides (refresh_interval=-1, replicas=0, async translog) back to live values (refresh=1s, replicas=1, durable translog). - promoteEntityIndex(ctx, ok) — per-entity promotion. Used by the distributed search-indexer's "promote as soon as all partitions for an entity complete" callback (DistributedSearchIndexExecutor.promoteEntityIndex). Swaps the alias and cleans up old indices — but never restored live settings. When an entity finishes its partitions before the final reconciliation (typically the smallest entities — e.g. knowledge `page` with ~11 rows), its index is promoted via the per-entity path, the alias swap succeeds, and the bulk-build overrides become the new live settings. 
refresh_interval stays at -1 in production, so live writes after the reindex are buffered in the translog and never reach searchable segments until a manual _refresh. Externally this surfaces as "create an article, hierarchy is empty until I re-trigger reindex" — exactly the user-reported bug. Mirror the finalizeReindex sequence by calling applyLiveServingSettings (and maybeForceMerge for parity) at the top of the promote block in promoteEntityIndex, before the alias swap. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Wire jobData into per-entity reindex promotion handler DefaultRecreateHandler.applyLiveServingSettings reads from the handler's jobData field (live + bulk index-settings overrides on the EventPublisherJob). The per-entity distributed-promotion path in DistributedSearchIndexExecutor created its own DefaultRecreateHandler instance and never called withJobData(jobData) on it. With jobData=null, buildRevertJson returns null and applyLiveServingSettings silently no-ops — meaning the previous fix (b272de85f9) never actually re-applied live settings on the per-entity promote path, even though the call was reached. currentJob.getJobConfiguration() is the EventPublisherJob the strategy created. Wire it into the new handler at construction time, mirroring the withJobData call DistributedIndexingStrategy already makes on the strategy's own handler instance. With this change, the per-entity promote path now logs "Applying live serving settings to staged index '...' for entity 'page': {\"number_of_replicas\":1,\"refresh_interval\":\"1s\", ...}" before the alias swap, and post-promotion `_settings` show refresh_interval=1s instead of the stuck -1. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix delete failure * Fix java checkstyle * Fix article deletion issue * refactor(test): streamline Knowledge Center List setup and teardown processes * Fix GlossaryTags * Add missing pieces in knowledge articles * Fix checkstyle * Remove reviewer workflow spec * remove unused util * Fix the localization changes * Fix unit tests * deleted unused svg * added missing svg * improved ux of save button & autofocus on title * lint fixes * Update page index * Make calculateFqnDepth static * fixed the kc imports * import fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com> Co-authored-by: Copilot <copilot@github.com> Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com> Co-authored-by: Rohit0301 <rj03012002@gmail.com> Co-authored-by: Rohit Jain <60229265+Rohit0301@users.noreply.github.com> Co-authored-by: Harsh Vador <harsh.vador@somaiya.edu>
2026-05-08 17:56:04 +00:00
-- =====================================================
-- KNOWLEDGE CENTER + CONTEXT CENTER DRIVE (Collate → OM port)
-- Appended below the Task Redesign tables to preserve main's
-- migration order when merging.
-- =====================================================
-- MCP tables are created in the 1.13.0 migration.
-- Knowledge Center: page entity table (Article, QuickLink).
-- Existing Collate customers already have this table from 1.2.0-collate with
-- subsequent shape changes through 1.6.0-collate (nameHash -> fqnHash VARCHAR(756),
-- pageType generated column, composite deleted index). CREATE TABLE IF NOT EXISTS
-- is a no-op for them and creates the final shape for fresh OpenMetadata installs.
CREATE TABLE IF NOT EXISTS knowledge_center (
    id VARCHAR(36) GENERATED ALWAYS AS (json ->> '$.id') STORED NOT NULL,
    fqnHash VARCHAR(756) NOT NULL COLLATE ascii_bin,
    name VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.name') STORED NOT NULL,
    json JSON NOT NULL,
    updatedAt BIGINT UNSIGNED GENERATED ALWAYS AS (json ->> '$.updatedAt') STORED NOT NULL,
    updatedBy VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.updatedBy') STORED NOT NULL,
    deleted BOOLEAN GENERATED ALWAYS AS (json -> '$.deleted') STORED,
    pageType VARCHAR(16) GENERATED ALWAYS AS (json ->> '$.pageType') STORED NOT NULL,
    PRIMARY KEY (id),
    UNIQUE (fqnHash),
    INDEX knowledge_center_name_index (name),
    INDEX index_knowledge_center_deleted (fqnHash, deleted)
);
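-- Illustrative sketch, not required by the migration: the generated columns above are
-- derived from the JSON document. In MySQL, `col ->> path` is shorthand for
-- JSON_UNQUOTE(JSON_EXTRACT(col, path)) and `col -> path` for JSON_EXTRACT(col, path).
-- The read-only query below applies both forms to a hypothetical page document; the
-- sample values are assumptions for illustration, not data from OpenMetadata.
-- Because the string columns are declared STORED NOT NULL, a document that lacks one
-- of these keys would be expected to fail on insert.
SELECT
    JSON_UNQUOTE(JSON_EXTRACT(doc, '$.id'))       AS id,        -- equivalent of json ->> '$.id'
    JSON_UNQUOTE(JSON_EXTRACT(doc, '$.pageType')) AS pageType,  -- unquoted string: Article
    JSON_EXTRACT(doc, '$.name')                   AS name_json, -- still JSON, quotes kept: "welcome"
    JSON_EXTRACT(doc, '$.deleted')                AS deleted    -- JSON boolean: false
FROM (
    SELECT CAST('{"id": "00000000-0000-0000-0000-000000000001", "name": "welcome", "pageType": "Article", "deleted": false}' AS JSON) AS doc
) AS sample;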
-- Context Center Drive: Folder entity table.
CREATE TABLE IF NOT EXISTS drive_folder (
    id VARCHAR(36) GENERATED ALWAYS AS (json ->> '$.id') STORED NOT NULL,
    name VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.name') STORED NOT NULL,
    nameHash VARCHAR(256) NOT NULL COLLATE ascii_bin,
    json JSON NOT NULL,
    updatedAt BIGINT UNSIGNED GENERATED ALWAYS AS (json ->> '$.updatedAt') STORED NOT NULL,
    updatedBy VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.updatedBy') STORED NOT NULL,
    deleted BOOLEAN GENERATED ALWAYS AS (json -> '$.deleted') STORED,
    PRIMARY KEY (id),
    UNIQUE KEY unique_drive_folder_name (nameHash),
    INDEX idx_drive_folder_updated_at (updatedAt)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- Context Center Drive: File entity table (uploaded PDF/image/spreadsheet/office docs).
CREATE TABLE IF NOT EXISTS context_file (
    id VARCHAR(36) GENERATED ALWAYS AS (json ->> '$.id') STORED NOT NULL,
    name VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.name') STORED NOT NULL,
    nameHash VARCHAR(256) NOT NULL COLLATE ascii_bin,
    json JSON NOT NULL,
    updatedAt BIGINT UNSIGNED GENERATED ALWAYS AS (json ->> '$.updatedAt') STORED NOT NULL,
    updatedBy VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.updatedBy') STORED NOT NULL,
    deleted BOOLEAN GENERATED ALWAYS AS (json -> '$.deleted') STORED,
    PRIMARY KEY (id),
    UNIQUE KEY unique_context_file_name (nameHash),
    INDEX idx_context_file_updated_at (updatedAt)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- Attachments: Asset entity table for uploaded file blobs referenced by ContextFiles, Pages, etc.
-- Existing Collate customers have this from 1.7.0-collate. CREATE TABLE IF NOT EXISTS is a no-op for them.
CREATE TABLE IF NOT EXISTS asset_entity (
    id VARCHAR(36) GENERATED ALWAYS AS (json ->> '$.id') STORED NOT NULL,
    name VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.fileName') STORED NOT NULL,
    url VARCHAR(1024) GENERATED ALWAYS AS (json ->> '$.url') STORED NOT NULL,
    fullyQualifiedName VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.fullyQualifiedName') STORED NOT NULL,
    assetType VARCHAR(100) GENERATED ALWAYS AS (json ->> '$.assetType') STORED NOT NULL,
    json JSON NOT NULL,
    updatedAt BIGINT UNSIGNED GENERATED ALWAYS AS (json ->> '$.updatedAt') STORED NOT NULL,
    updatedBy VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.updatedBy') STORED NOT NULL,
    fqnHash VARCHAR(768) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL,
    deleted BOOLEAN GENERATED ALWAYS AS (json -> '$.deleted') STORED,
    PRIMARY KEY (id),
    INDEX fqnhash_index (fqnHash),
    INDEX asset_type_index (assetType),
    INDEX idx_asset_deleted (deleted)
);
-- Context Center Drive: File content snapshot table (revisions, extracted text).
CREATE TABLE IF NOT EXISTS context_file_content (
    id VARCHAR(36) GENERATED ALWAYS AS (json ->> '$.id') STORED NOT NULL,
    name VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.name') STORED NOT NULL,
    nameHash VARCHAR(256) NOT NULL COLLATE ascii_bin,
    json JSON NOT NULL,
    updatedAt BIGINT UNSIGNED GENERATED ALWAYS AS (json ->> '$.updatedAt') STORED NOT NULL,
    updatedBy VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.updatedBy') STORED NOT NULL,
    deleted BOOLEAN GENERATED ALWAYS AS (json -> '$.deleted') STORED,
    PRIMARY KEY (id),
    UNIQUE KEY unique_context_file_content_name (nameHash),
    INDEX idx_context_file_content_updated_at (updatedAt)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- Add tag_usage.metadata column if missing (newer tag usage payloads carry metadata).
SET @ddl = (
    SELECT IF(
        EXISTS (
            SELECT 1
            FROM information_schema.columns
            WHERE table_schema = DATABASE()
              AND table_name = 'tag_usage'
              AND column_name = 'metadata'
        ),
        'SELECT 1',
        'ALTER TABLE tag_usage ADD COLUMN metadata JSON NULL'
    )
);
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
-- Add audit_log_event.search_text column if missing (searchable audit log text).
SET @ddl = (
    SELECT IF(
        EXISTS (
            SELECT 1
            FROM information_schema.columns
            WHERE table_schema = DATABASE()
              AND table_name = 'audit_log_event'
              AND column_name = 'search_text'
        ),
        'SELECT 1',
        'ALTER TABLE audit_log_event ADD COLUMN search_text LONGTEXT NULL'
    )
);
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
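-- MySQL 8.0 does not accept ADD COLUMN IF NOT EXISTS, which is why the two blocks above
-- build the ALTER statement conditionally (falling back to a harmless 'SELECT 1') and run
-- it through PREPARE / EXECUTE. The read-only check below is a sketch, not required by the
-- migration and safe to run repeatedly; it should list both columns once the statements
-- above have been applied.
SELECT table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND (
        (table_name = 'tag_usage' AND column_name = 'metadata')
     OR (table_name = 'audit_log_event' AND column_name = 'search_text')
  );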
-- Distributed reindex job tracking.
CREATE TABLE IF NOT EXISTS search_index_job (
    id VARCHAR(64) NOT NULL,
    status VARCHAR(64) NOT NULL,
    jobConfiguration JSON NOT NULL,
    targetIndexPrefix VARCHAR(256) NOT NULL,
    stagedIndexMapping JSON DEFAULT NULL,
    totalRecords BIGINT NOT NULL DEFAULT 0,
    processedRecords BIGINT NOT NULL DEFAULT 0,
    successRecords BIGINT NOT NULL DEFAULT 0,
    failedRecords BIGINT NOT NULL DEFAULT 0,
    stats JSON NOT NULL,
    createdBy VARCHAR(256) NOT NULL,
    createdAt BIGINT NOT NULL,
    startedAt BIGINT DEFAULT NULL,
    completedAt BIGINT DEFAULT NULL,
    updatedAt BIGINT NOT NULL,
    errorMessage LONGTEXT DEFAULT NULL,
    registrationDeadline BIGINT DEFAULT NULL,
    registeredServerCount INT DEFAULT NULL,
    PRIMARY KEY (id),
    KEY idx_search_index_job_status_created_at (status, createdAt DESC)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
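-- Sketch of the kind of lookup the (status, createdAt DESC) index is shaped for: the most
-- recent jobs in a given status. The literal 'RUNNING' is an assumption for illustration;
-- the actual status values are defined by the application, not by this migration.
-- Read-only and not required by the migration.
SELECT id, status, processedRecords, totalRecords, createdAt
FROM search_index_job
WHERE status = 'RUNNING'
ORDER BY createdAt DESC
LIMIT 10;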
-- Retry queue for failed search-index writes.
CREATE TABLE IF NOT EXISTS search_index_retry_queue (
    entityId VARCHAR(64) NOT NULL,
    entityFqn VARCHAR(700) NOT NULL,
    failureReason LONGTEXT DEFAULT NULL,
    status VARCHAR(64) NOT NULL,
    entityType VARCHAR(128) NOT NULL,
    retryCount INT NOT NULL DEFAULT 0,
    claimedAt TIMESTAMP NULL DEFAULT NULL,
    PRIMARY KEY (entityId, entityFqn),
    KEY idx_search_index_retry_queue_status (status),
    KEY idx_search_index_retry_queue_claimed_at (claimedAt)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
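-- Sketch of a read-only backlog summary for the retry queue, grouped by status and entity
-- type. It makes no assumption about which status values the application writes; it only
-- reports whatever is present. Not required by the migration.
SELECT status, entityType, COUNT(*) AS entries, MAX(retryCount) AS maxRetryCount
FROM search_index_retry_queue
GROUP BY status, entityType
ORDER BY entries DESC;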
feat: add ContextMemory entity (Context Center memories) (#28224) * feat(spec): add ContextMemory + CreateContextMemory JSON schemas * feat(jdbi3): add ContextMemoryDAO * feat: register contextMemory entity type constant * feat(service): add ContextMemory repository, resource, mapper * feat(bootstrap): add context_memory table DDL * test(service): ContextMemory resource CRUD test * fix(context-memory): address review (relationship types, stable FQN, status msg, test name) - storeRelationships: rootMemory -> Relationship.CONTAINS, parentMemory -> Relationship.HAS so the root-ancestor and direct-parent hierarchies are distinguishable. - setFullyQualifiedName: derive from the immutable name only (drop mutable primaryEntity/owner derivation that destabilized nameHash on update). - validateStatusTransition: separate "no transitions defined" from "disallowed transition". - Rename ContextMemoryResourceTest -> ContextMemoryStatusTransitionTest (pure unit test). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(context-memory): add ContextMemoryIT + SDK ContextMemoryService Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(spec): register contextMemory in EntityLink.g4 ENTITY_TYPE grammar EntityLinkGrammarTest.testAllEntityTypesHaveGrammarOrExclusion enumerates every Entity.java constant and requires each to be in the EntityLink grammar or the test's exclusion list. ContextMemory is a normal EntityRepository-backed top-level entity (like learningResource / contextFile), so it belongs in the ENTITY_TYPE rule. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(context-memory): override owner ITs for creator-as-owner default ContextMemoryMapper.defaultOwners() intentionally assigns the creating user as owner when the create request omits owners. BaseEntityIT's patch_entityUpdateOwner_200 and patch_entityUpdateOwnerFromNull_200 assert "no owner initially" for any supportsOwners entity, so both failed for ContextMemory. 
Override both in ContextMemoryIT: keep the PATCH-replace-owner contract, change only the precondition to expect the creator as the sole initial owner (asserted by count, not a hardcoded principal). Mapper unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Update generated TypeScript types Add the generated ContextMemory TS types (entity/context/contextMemory.ts, api/context/createContextMemory.ts). The schemas were on the branch but their generated types were missing, failing the TypeScript Type Generation check on this fork PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(context-memory): address review (relationship cleanup, owner scope, validations) Copilot review on the ContextMemory entity: - #1 record primaryEntity/relatedEntities/root/parent/source*/machineRepresentation in version history; usageCount/lastUsedAt documented as untracked telemetry - #2 clear stale HAS/RELATED_TO/CONTAINS edges before re-adding in storeRelationships - #4 default creator as owner only on create; PUT without owners no longer silently replaces previously set owners - #5 schema documents that any status is allowed at creation; transitions enforced only on update - #6 setFullyQualifiedName via FullyQualifiedName.build with skip-if-set guard - #7 validate shared principal type is user/team/domain - #8 reject self-reference for parentMemory/rootMemory - #10 inline Entity.CONTEXT_MEMORY, drop redundant constant Regenerate ContextMemory TS types for the schema doc change; add IT coverage for the self-reference and invalid-shared-principal validations. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(context-memory): don't blanket-delete relationships (domain data loss) The #2 cleanup via deleteTo(memory, CONTEXT_MEMORY, HAS, null) also matched the framework's domain --HAS--> memory edge (storeDomains runs before storeRelationships in storeRelationshipsInternal, on every create and update), silently dropping domain assignments. storeRelationships is now add-only (addRelationship upserts, so re-running on update is idempotent). Stale-edge cleanup moved to ContextMemoryUpdater using the framework's updateFromRelationship(s) helpers, which delete only the specific changed refs and record the version change. parentMemory now uses Relationship.PARENT_OF (distinct from primaryEntity's HAS and the framework's domain HAS) so the parent edge can be maintained without collision. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(bootstrap): move context_memory DDL from 2.0.1 to 2.0.0 The context_memory table belongs in the 2.0.0 migration. Relocated the MySQL and Postgres DDL verbatim; the 2.0.1 schemaChanges.sql files are restored to their original task_migration_mapping-only content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(bootstrap): add ENGINE=InnoDB to context_memory MySQL DDL Explicit engine clause, consistent with the task/search-index tables in the same migration and robust to any server default change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(context-memory): preserve sanitized/validated fields; validate relatedEntities Review follow-ups: - ContextMemoryMapper no longer re-sets description/owners/domains/tags/displayName after copy(). copy() sanitizes description (stored-XSS) and validates owners and domains; re-setting the raw request values bypassed both. Only ContextMemory- specific fields are set now. 
- prepare() now assigns the result of EntityUtil.populateEntityReferences back onto relatedEntities so orphaned/invalid refs are filtered instead of persisted. - ContextMemoryIT Javadoc now references ContextMemoryRepository#setCreatorAsDefaultOwner (the defaultOwners mapper method no longer exists). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 16:10:46 +00:00
-- ContextMemory entity - reusable Context Center memory.
CREATE TABLE IF NOT EXISTS context_memory (
    id VARCHAR(36) GENERATED ALWAYS AS (json ->> '$.id') STORED NOT NULL,
    name VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.name') STORED NOT NULL,
    nameHash VARCHAR(256) NOT NULL COLLATE ascii_bin,
    json JSON NOT NULL,
    updatedAt BIGINT UNSIGNED GENERATED ALWAYS AS (json ->> '$.updatedAt') STORED NOT NULL,
    updatedBy VARCHAR(256) GENERATED ALWAYS AS (json ->> '$.updatedBy') STORED NOT NULL,
    deleted BOOLEAN GENERATED ALWAYS AS (json -> '$.deleted') STORED,
    PRIMARY KEY (id),
    UNIQUE KEY unique_context_memory_name (nameHash),
    INDEX idx_context_memory_updated_at (updatedAt)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;