* Add design doc for search indexing stats redesign
Covers:
- Simplified 4-stage pipeline model (Reader, Process, Sink, Vector)
- Per-entity index promotion instead of batch promotion
- Alias management from indexMapping.json
- Payload-aware vector bulk processor
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Add Support for Per Entity Index Promotion
* Add UI Bit
* Add Lang
* Add AppLog View Test coverage
* Add Batched Vector index querying
* Make Vector indexing async and improve stats handling
* Use Virtual Thread
* Use Virtual Thread
* Fix Tests
* Make reading stats easier
* Fixed Stats to be accurate
* Fix Stats getting null
* Fix partition worker stats
* Fix Reader Stats - final
* Update generated TypeScript types
* Make updates in 1.12.0
* Revert "Use Virtual Thread"
This reverts commit 4eb23374d1.
* Revert "Use Virtual Thread"
This reverts commit efe8d03b5d.
* Reapply "Use Virtual Thread"
This reverts commit d59cde18b2.
* Reapply "Use Virtual Thread"
This reverts commit 769e5710c3.
* Fix Final Update on stat
* - Add atomic alias swap
- remove unnecessary migration
* Fix Sonar test jest
* Fix Final Update on stat
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Ensure columns are retrieved in the right order
Since ordering was introduced for `getTableColumnsByFQN`, the patches created in `removeTagFromEntity` could point to different columns whenever the default order didn't match the order in which the columns were persisted in the db
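A minimal sketch of the failure mode, in Python for illustration only (the actual change is not reproduced here). `column_patch_path` and the sample column names are hypothetical; the point is that an index-based patch path is only valid against the ordering it was computed from:

```python
def column_patch_path(columns, column_name):
    """Build an index-based JSON Patch path for a column's tags."""
    index = [c["name"] for c in columns].index(column_name)
    return f"/columns/{index}/tags"


# Hypothetical data: the same columns in persisted order vs. sorted by name.
persisted = [{"name": "zip"}, {"name": "email"}]
sorted_by_name = sorted(persisted, key=lambda c: c["name"])

# The same column resolves to a different index under each ordering, so a
# patch computed against one list touches the wrong column in the other.
path_persisted = column_patch_path(persisted, "email")
path_sorted = column_patch_path(sorted_by_name, "email")
```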
* Allow exception list to be updated on all feedback
* Apply gitar comments
* Add `metadata` to `tag_usage` table
* Update JSON schema object to include `TagLabel.metadata`
* Apply feedback to selected recognizer
* Add backend integration tests
* Update `ingestion` to return `TagLabel.metadata.recognizer`
* Update generated TypeScript types
* Update generated TypeScript types
* Send recognizer result metadata in feedback approval task (#25485)
* Send `TagLabelRecognizerMetadata` in `TaskDetails`
This is so we can show an explanation behind the classification in the feedback approval card
* Update typescript types
* Run Spotless
* Ensure `applyTagsBatchInternal` works equally for pg and mysql
* Tag metadata fixes
* Fix CI test
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Rohit Jain <60229265+Rohit0301@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
* Fix Stats
* Add Warning logs and reindex failure analysis
* Add Search Insights in Preferences
* Add Label
* Fix Full Error not available
* Add check for reindex run
* FIX#24374 - Data Contract at Data Product level
* Update generated TypeScript types
* FIX#24374 - Data Contract at Data Product level
* fix DP page
* fix: preserve termsOfUse object format in filtered contract
The termsOfUse field was being converted to a string during filtering,
but the form components expect it to be an object with {content: string}.
This was causing test failures where form elements were not visible.
- Keep termsOfUse as object format when not inherited
- Convert old string format to new object format for consistency
- Fixes 21 test failures in DataContracts.spec.ts and DataContractInheritance.spec.ts
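The normalization described above could be sketched as follows. This is written in Python for illustration only, since the actual fix lives in the UI code; `normalize_terms_of_use` is a hypothetical helper name, and only the `{content: string}` shape comes from this commit:

```python
def normalize_terms_of_use(value):
    """Return termsOfUse in the object format {"content": str}, or None."""
    if value is None:
        return None
    if isinstance(value, str):
        # Old string format: wrap it so consumers always see an object.
        return {"content": value}
    # Already in the object format: keep it unchanged.
    return value
```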
* fix: address code review findings - state sync and immutability
Frontend changes:
- Add useEffect to sync formValues with filteredContract changes
- Ensures edit form updates when contract prop changes
Backend changes:
- Create deep copy at start of mergeContracts() to avoid mutating input
- Prevents side effects if contract object is reused elsewhere
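A rough sketch of the deep-copy idea, in Python for illustration only (the real `mergeContracts()` is backend code). The merge rule used here, where non-null child values override the parent, is an assumption made for the example:

```python
import copy


def merge_contracts(parent, child):
    """Merge child over parent without mutating either input."""
    # Copy first, so later writes never reach the caller's objects.
    merged = copy.deepcopy(parent)
    for key, value in child.items():
        if value is not None:  # assumed rule: non-null child values win
            merged[key] = copy.deepcopy(value)
    return merged


parent = {"name": "base", "security": {"policies": ["p1"]}}
child = {"name": "derived", "security": None}

merged = merge_contracts(parent, child)
merged["security"]["policies"].append("p2")  # does not leak into parent
```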
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
* Addressing feedback
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
* fix tests
* fix inherited contract delete and status
* fix inherited contract delete and status
* fix inherited contract execution in app
* fix test
* fix: resolve playwright postgresql ci test failure
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
* ci: fix yaml validation and checkstyle failures
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
* fix: correct JSON/YAML validation errors
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
* fix: resolve maven-collate and ui-coverage test failures
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
* gitar feedback
* fix ci
* fix ci
* fix ci
* fix ci
* include .claude
* validate
* fix playwright
* playwright
* fix playwright
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gitar <gitar@collate.io>
Co-authored-by: Gitar <noreply@gitar.ai>
Co-authored-by: pmbrull <pmbrull@users.noreply.github.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: karanh37 <karanh37@gmail.com>
* feat: added repository logic to list all versions (including latest) for a specific entity type
* feat: added list all versions for all the entity resources
* feat: moved endpoint to EntityResource
* feat: renamed endpoint to /history and methods to EntityHistory
* feat: ran java linting
* feat: remove v1 implementation left over code
* feat: fix failing tests
* feat: ran klinting
* feat: fix psql query
* feat: address PR comments
* feat: ran klinting
* feat: increase cache duration
* feat: address query edge cases
* fix(dq): psql migration for row insert test parameters
* fix(dq): use name and add trailing new line
* Fix description formatting in postDataMigrationSQLScript.sql
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add DQ Rules Library
* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list
* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list
* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list
* Update generated TypeScript types
* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list
* Add DQ Rules Library - Add Tests and enable testDefinitions through APIs to list
* Update generated TypeScript types
* Refactor tests to use toStrictEqual for string comparisons and improve consistency
- Updated various test files to replace `toBe` with `toStrictEqual` for string assertions in ImportStatus, SummaryCard, TabsLabel, and others.
- Enhanced regex tests to ensure accurate validation of entity names and tags.
- Added new translations for test platform warnings in en-us.json.
- Improved utility tests for alerts, authentication, CSV handling, and task messages to use `toEqual` for better clarity.
* Refactor TestDefinitionForm and TestDefinitionList components to use updated API methods and improve SQL expression handling
* Enhance TestDefinitionList component with permission checks for edit and delete actions, and update tests to reflect changes in permission handling
* Remove debug log from handleSubmit in TestDefinitionForm component
* Add permission loading state and enhance permission handling in TestDefinitionList component
* Update generated TypeScript types
* Update generated TypeScript types
* Update generated TypeScript types
* fix build failure
* Revert "Update generated TypeScript types"
This reverts commit 67b062216f.
* Enhance TestDefinitionForm and TestDefinitionList components with improved UI and pagination handling
* fix: update RulesLibrary tests and enhance TestDefinitionForm styling
* fix: Enhance TestDefinitionForm with error handling and improved UX
* fix: Update test definition handling and improve rendering in TestDefinitionList
* fix: Refactor TestDefinitionPermissions tests for improved permission checks and API context handling
* fix: Update system test definition retrieval to use findLast for improved accuracy
* feat: Add end-to-end tests for Rules Library and Test Definition Permissions
* fix: Update edit button visibility check to use beDisabled for better clarity
* fix: Refactor response handling in TestDefinitionPermissions tests for improved reliability
* move migrations execution order
* fix: remove existing columns
* style: remove migration extra line break
* chore: fix migration
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
Co-authored-by: TeddyCr <teddy.crepineau@gmail.com>
* Added `appliedAt` field to `TagLabel`s
This is to track insertions to `tag_usage` with timestamps
* Capture and format recognition explanations in `TagAnalyzer`
This creates a function that builds an explanation of why something was scored as it was.
# Conflicts:
# ingestion/src/metadata/pii/algorithms/presidio_utils.py
* Refactor `TagProcessor`
* Capture results for the old-style `PIIProcessor`
* Move strings to constants
* Add `TagLabel.appliedBy` field
This change also patches the user's name into the tags declared in `JsonPatch` objects to fill it up
* Update typescript types
* Fix python tests
* Fix java tests
* Simplify setting tag's `appliedBy` using `EntityUpdater.updatingUser`
* Remove unnecessary f-string
* Moar fixes
* Move migrations to 1.11.5
* Fix#23853: AI Governance and Compliance Framework for AI Applications
* Update generated TypeScript types
* Update generated TypeScript types
* trigger ci
* Fix#23853: AI Governance and Compliance Framework for AI Applications
* Fix test failures
* Merge origin/main into ai_agents - added pipeline execution features and resolved conflicts
* Update generated TypeScript types
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
* Oh boy, factory-boy
Created a bunch of `factory-boy` factories that make it easy to create mock test data
* Update `try_bind` docker utility to ease debugging
* Resolve conflicts between `Classification` tags
* Refactor `TagClassifier` into another entity
This is so:
1. We're not tied to the `ColumnClassifier` interface that forced returning `Mapping[T, float]` (unnecessary since we're returning `List[ScoredTag]`)
2. The tag analyzer uses the same `recognizer_factories` registry we used for `PIIProcessor`
3. Create a separate service that abstracts using `TagScorer` and `TagAnalyzer` to return `TagScore`s (makes testing upstream code easier)
* Interface to retrieve available `Tag`s and `Classification`s
* Refactor `TagProcessor` to support multi-classification
- Depends on `ClassificationManagerInterface` to retrieve `Tag`s and `Classification`s
- Uses a callable dependency to score tags for a column
- Accepts a classification filter parameter
- Leverages `ConflictResolver` to resolve conflicts between tags of the same `Classification`
* Add an integration test for the `TagProcessor`
* Ensure `PII` classification is configured with migrations
# Conflicts:
# bootstrap/sql/migrations/native/1.11.1/mysql/postDataMigrationSQLScript.sql
# bootstrap/sql/migrations/native/1.11.1/postgres/postDataMigrationSQLScript.sql
* Move `FakeClassificationManager` to `_openmetadata_testutils`
This is because importing from `tests` breaks in the CI when running pytests from the root of the repo
* Fix broken mutually exclusive classifications
This is because the implementation did not take into account previous tags when resolving conflicts.
As a result, running the classifier twice for a classification with a mutually exclusive configuration would end up breaking the exclusivity
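A minimal sketch of resolving conflicts with previously applied tags taken into account. The `ScoredTag` fields, the `resolve_mutually_exclusive` name, and the "highest score wins, ties keep the existing tag" policy are all assumptions made for illustration, not the actual `ConflictResolver` logic:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredTag:
    classification: str  # hypothetical field
    name: str            # hypothetical field
    score: float         # hypothetical field


def resolve_mutually_exclusive(existing, candidates):
    """Keep at most one tag per classification, counting existing tags too."""
    best = {}
    # Existing tags go first so that a tie keeps what is already applied.
    for tag in list(existing) + list(candidates):
        current = best.get(tag.classification)
        if current is None or tag.score > current.score:
            best[tag.classification] = tag
    return list(best.values())


existing = [ScoredTag("PII", "Sensitive", 0.9)]
second_run = [ScoredTag("PII", "NonSensitive", 0.7)]
resolved = resolve_mutually_exclusive(existing, second_run)
```

Resolving over `second_run` alone would add `NonSensitive` next to the already applied `Sensitive` tag, which is the broken exclusivity described above.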
* fix: add supportedServices for relevant service DQ display
data diff is not supported by all services. We need to only
display it on supported services
* fix: added query param and create field
* Revert deleting Old Deployments for Periodic batch Workflows
* Revert "Revert deleting Old Deployments for Periodic batch Workflows"
This reverts commit 7bd1be5a81.
* TRUNCATE FLOWABLE history tables in both 1.10.5 and 1.10.7 migrations
* TRUNCATE FLOWABLE history tables in both 1.10.5 and 1.10.7 migrations
* fix: migration
* fix: playwright test DBT -> dbt
* feat: added retention for profile and dq data
* feat: fix failing tests
* feat: address error in postgres delete sql
* feat: fixed missing parameter in psql query
* fix: added the deletion step in test case
* feat: fixed postgres query for deletion before cutoffs
* Fix: Search Slowness when painless scripts aggregate for terms and classifications
* Fix Sql
* Add fields to security service index
---------
Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
* Remove data placed in the wrong directory
* Update `MigrationUtil` to use data from `piiTagsWithRecognizers`
That way we can also remove duplicate json and have a single source of data
* Update migration queries to use prepared statements
* Minor fix in the `piiTagsWithRecognizers` definitions
* Initial implementation for Dimensionality on Data Quality Tests
* Fix ColumnValuesToBeUnique and create TestCaseResult API
* Refactor dimension result
* Initial E2E Implementation without Impact Score
* Dimensionality Thin Slice
* Update generated TypeScript types
* Update generated TypeScript types
* Removed useless method to use the one we already had
* Fix Pandas Dimensionality checks
* Remove useless comments
* Implement PR comments, fix Tests
* Improve the code a bit
* Fix imports
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>