OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column-level lineage, and seamless team collaboration.
Sriharsha Chintalapani a4998bc1c7
Continuous indexing to handle failures (#26111)
* Add Continuous Indexing

* Add continuous Search indexing

* Update to 1.12.3

* Make search index retry queue reliable with stale recovery, health checks, and silent failure coverage

  - Add entityType, retryCount, claimedAt columns to search_index_retry_queue table
  - Implement stale IN_PROGRESS recovery (10min threshold, 60s sweep interval)
  - Replace static isClientAvailable flag with cached ping health check (5s TTL)
  - Narrow catch blocks in resolveById/resolveByFqn to EntityNotFoundException
  - Use entityType hint for O(1) entity resolution instead of scanning all types
  - Switch from status-string-based retry to retryCount-based (< 3 retries → PENDING, ≥ 3 → FAILED)
  - Batch cascade reindex at 200 entities instead of accumulating up to 5000
  - Add retry queue enqueue in catch blocks of createTimeSeriesEntity, updateTimeSeriesEntity,
    deleteTimeSeriesEntityById, bulkIndexPipelineExecutions, reindexAcrossIndices, and
    TestSuiteRepository.postCreate
  - Re-throw exceptions from indexTableColumns/deleteTableColumns to parent catch blocks
  - Add Micrometer counters for enqueued, processed (success/failure), and stale recovered
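
A minimal sketch of the retryCount-based transition and the stale IN_PROGRESS recovery listed above. It is written in Python for brevity and is not the project's Java code. The `RetryRecord` class and the function names are hypothetical. It also assumes the counter is incremented before the comparison, which the commit message does not state.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_RETRIES = 3                # from the commit message: < 3 -> PENDING, >= 3 -> FAILED
STALE_THRESHOLD_SECONDS = 600  # 10-minute stale threshold from the commit message


@dataclass
class RetryRecord:
    """Hypothetical in-memory stand-in for a search_index_retry_queue row."""
    entity_id: str
    entity_type: str
    status: str = "PENDING"
    retry_count: int = 0
    claimed_at: Optional[float] = None  # epoch seconds when a worker claimed the row


def claim(record: RetryRecord, now: float) -> None:
    """Mark a record as being processed and remember when it was claimed."""
    record.status = "IN_PROGRESS"
    record.claimed_at = now


def record_failure(record: RetryRecord) -> None:
    """Count the failed attempt (assumption: increment first), then pick the next status."""
    record.retry_count += 1
    record.status = "PENDING" if record.retry_count < MAX_RETRIES else "FAILED"
    record.claimed_at = None


def recover_stale(records: List[RetryRecord], now: float) -> int:
    """Return IN_PROGRESS records claimed too long ago to PENDING; report how many."""
    recovered = 0
    for record in records:
        if (
            record.status == "IN_PROGRESS"
            and record.claimed_at is not None
            and now - record.claimed_at >= STALE_THRESHOLD_SECONDS
        ):
            record.status = "PENDING"
            record.claimed_at = None
            recovered += 1
    return recovered
```

In this sketch a sweep would call `recover_stale` on a timer (the commit message gives a 60s interval), so a record whose worker died mid-processing is picked up again instead of staying IN_PROGRESS forever.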

* Add missing lineage call site and Add test

* Review comments

* Add resilience to search index retry worker: client availability checks, backoff, and error classification

  - Add exponential backoff when search client is unreachable so the
    worker does not burn retries during cluster outages (5s → 10s → … → 60s cap)
  - Classify errors using HTTP status codes from ES/OS exceptions:
    4xx (except 429) are non-retryable and skip straight to FAILED;
    429, 5xx, and IOException are retryable
  - Preserve first bulk failure detail in RuntimeException so error
    classification works for the bulk indexing path
  - Reorganize SearchIndexRetryWorker into clearly separated sections
    (lifecycle, main loop, record processing, entity resolution,
    reindexing, resilience, suspension, utilities)
  - Add isRetryableStatusCode utility to SearchIndexRetryQueue
  - Add integration tests: status code classification, retry exhaustion
    to FAILED, recovery from PENDING_RETRY_1, error detail preservation
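
The backoff and error classification rules above can be sketched as follows. This is an illustration in Python, not the project's Java code. The doubling factor is an assumption inferred from "5s → 10s", the function names are hypothetical, and treating status codes below 400 as non-retryable is an assumption the commit message does not cover.

```python
from itertools import islice
from typing import Iterator, Optional


def backoff_delays(initial: float = 5.0, cap: float = 60.0) -> Iterator[float]:
    """Yield wait times in seconds: start at `initial`, double each time, never exceed `cap`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)


def is_retryable_status_code(status: int) -> bool:
    """429 and 5xx are retryable; other 4xx are not (rule taken from the commit message)."""
    if status == 429:
        return True
    if 400 <= status < 500:
        return False
    # Assumption: anything below 400 is not an error worth retrying.
    return status >= 500


def is_retryable_error(error: Exception, status: Optional[int] = None) -> bool:
    """Treat I/O failures as retryable; otherwise fall back to the HTTP status code."""
    if isinstance(error, OSError):  # Python's closest analogue to Java's IOException
        return True
    return status is not None and is_retryable_status_code(status)


# First six delays while the search client stays unreachable.
print(list(islice(backoff_delays(), 6)))
```

Keeping the delay calculation separate from the classification means a cluster outage only slows the worker down, while a malformed document (a non-429 4xx) goes straight to FAILED without consuming retries.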

* Address review comments

* Revert fqn size

* Spotless

* Address volatile review comments

* Fix Failing Test

* update review comments

---------

Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
Co-authored-by: Mohit Yadav <105265192+mohityadav766@users.noreply.github.com>
2026-03-18 16:23:04 +05:30
.claude/skills Add eslint-plugin-playwright enforcement with CI check (#26494) 2026-03-16 18:12:12 +05:30
.github Fix MCP tools entity status (#26567) 2026-03-18 11:04:00 +01:00
bin Reindex Work - Perf , Metrics , Benchmarking and More (#26231) 2026-03-10 08:10:46 +05:30
bootstrap Continuous indexing to handle failures (#26111) 2026-03-18 16:23:04 +05:30
common Fix Metrics collection; reduce no.of metrics; improve slow request lo… (#25751) 2026-03-13 13:38:31 -07:00
conf Fix Metrics collection; reduce no.of metrics; improve slow request lo… (#25751) 2026-03-13 13:38:31 -07:00
docker Improve indexing (#26154) 2026-03-03 16:39:27 +05:30
docs Glossary relations (#25886) 2026-03-18 10:51:03 +05:30
examples/python-sdk/data-quality Create documentation resources for Data Quality as Code (closes #23800) (#24169) 2025-11-11 10:25:42 +00:00
ingestion Glossary relations (#25886) 2026-03-18 10:51:03 +05:30
openmetadata-airflow-apis ISSUE #20036 - sqlalchemy 2.0 migration (#26031) 2026-03-02 13:07:47 -08:00
openmetadata-clients Deprecate OpenMetadata Java client in favor of new Java SDK (#26388) 2026-03-10 21:30:39 -07:00
openmetadata-dist Deprecate OpenMetadata Java client in favor of new Java SDK (#26388) 2026-03-10 21:30:39 -07:00
openmetadata-integration-tests Continuous indexing to handle failures (#26111) 2026-03-18 16:23:04 +05:30
openmetadata-k8s-operator MINOR - Add Operator Tests (#25343) 2026-01-19 14:20:37 +01:00
openmetadata-mcp Fix MCP tools entity status (#26567) 2026-03-18 11:04:00 +01:00
openmetadata-sdk Glossary relations (#25886) 2026-03-18 10:51:03 +05:30
openmetadata-service Continuous indexing to handle failures (#26111) 2026-03-18 16:23:04 +05:30
openmetadata-shaded-deps Reduced version to 3.4 (#26017) 2026-02-20 19:28:21 +05:30
openmetadata-spec Glossary relations (#25886) 2026-03-18 10:51:03 +05:30
openmetadata-ui fix: adjust recently viewed item width and box-sizing to prevent overlap (#26564) 2026-03-18 11:55:56 +05:30
openmetadata-ui-core-components update alert props (#26544) 2026-03-17 15:31:13 +05:30
scripts Add skills to build connectors (#26309) 2026-03-08 21:45:10 -07:00
skills Add SSRS connector (#26310) 2026-03-10 11:12:23 +05:30
.git-blame-ignore-revs Minor: update git-blmae-ignore-revs, and uncomment ClassificationResourceTest tests code (#14431) 2023-12-18 19:16:29 -08:00
.gitignore Add SSRS connector (#26310) 2026-03-10 11:12:23 +05:30
.nojekyll
.pre-commit-config.yaml feature/pii-processor-improvement (#21248) 2025-05-19 17:52:17 +00:00
.pylintrc ISSUE #21101 - Implement BQ Partitioned Tests (#21348) 2025-05-22 17:22:05 +02:00
.snyk Ignore _openmetadata_testutils from snyk (#21168) 2025-05-13 18:01:05 +05:30
APPLICATION.md Rename app 'preview' property to 'enabled' (#26170) 2026-03-05 08:29:54 +01:00
CLAUDE.md Docs: Mandate openmetadata-ui-core-components as the UI standard in CLAUDE.md (#26379) 2026-03-11 11:08:15 +05:30
CODE_OF_CONDUCT.md
CONTRIBUTING.md addded more detail on issue creation in contributors page (#16583) 2024-06-09 14:02:36 -07:00
generate_ts.sh Feature: Generate TS From JSON (#19823) 2025-02-25 18:18:02 +05:30
INCIDENT_RESPONSE.md Add threat model and incident response (#23603) 2025-09-28 13:17:23 -07:00
LICENSE
Makefile MINOR: Add OpenAPI version update functionality in Makefile and script (#24604) 2026-01-14 14:11:56 +05:30
NOTICE
package.json chore(ui): bump quicktype to resolve vulnerabilities (#17979) 2024-09-25 15:09:34 +05:30
pom.xml Fix: Upgrade MCP SDK to 1.1.0 (#26489) 2026-03-17 15:01:04 +01:00
README.md Update README.md for column-level consistency (#24670) 2025-12-03 07:59:18 -08:00
SECURITY.md Update vulnerability reporting instructions in SECURITY.md (#25651) 2026-01-30 14:03:09 -08:00
tests.txt Implement Modern Fluent API Pattern for OpenMetadata Java Client (#23239) 2025-09-29 16:07:02 -07:00
THREAT_MODEL.md Add threat model and incident response (#23603) 2025-09-28 13:17:23 -07:00
yarn.lock Chore(deps): Bump minimatch from 3.1.2 to 3.1.5 (#26157) 2026-03-03 06:22:12 +00:00




Empower your Data Journey with OpenMetadata


What is OpenMetadata?

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column-level lineage, and seamless team collaboration. It is one of the fastest-growing open-source projects with a vibrant community and adoption by a diverse set of companies in a variety of industry verticals. Based on Open Metadata Standards and APIs, supporting connectors to a wide range of data services, OpenMetadata enables end-to-end metadata management, giving you the freedom to unlock the value of your data assets.



OpenMetadata Consists of Four Main Components:

  • Metadata Schemas: These are the core definitions and vocabulary for metadata based on common abstractions and types. They also allow for custom extensions and properties to suit different use cases and domains.
  • Metadata Store: This is the central repository for storing and managing the metadata graph, which connects data assets, users, and tool-generated metadata in a unified way.
  • Metadata APIs: These are the interfaces for producing and consuming metadata, built on top of the metadata schemas. They enable seamless integration of user interfaces and tools, systems, and services with the metadata store.
  • Ingestion Framework: This is a pluggable framework for ingesting metadata from various sources and tools into the metadata store. It supports 84+ connectors for data warehouses, databases, dashboard services, messaging services, pipeline services, and more.

Key Features of OpenMetadata

Data Discovery: Find and explore all your data assets in a single place using various strategies, such as keyword search, data associations, and advanced queries. You can search across tables, topics, dashboards, pipelines, and services.



Data Collaboration: Communicate, converse, and cooperate with other users and teams on data assets. You can get event notifications, send alerts, add announcements, create tasks, and use conversation threads.



Data Quality and Profiler: Measure and monitor data quality with no-code tests to build trust in your data. You can define and run data quality tests, group them into test suites, and view the results in an interactive dashboard. With powerful collaboration, make data quality a shared responsibility in your organization.



Data Governance: Enforce data policies and standards across your organization. You can define data domains and data products, assign owners and stakeholders, and classify data assets using tags and terms. Use powerful automation features to auto-classify your data.



Data Insights and KPIs: Use reports and platform analytics to understand how your organization's data is doing. Data Insights provides a single-pane view of the key metrics that best reflect the state of your data. Define Key Performance Indicators (KPIs) and set goals within OpenMetadata to work towards better documentation, ownership, and tiering. Alerts can be set against the KPIs and delivered on a specified schedule.



Data Lineage: Track and visualize the origin and transformation of your data assets end-to-end. You can view column-level lineage, filter queries, and edit lineage manually using a no-code editor.
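
To make "column-level lineage" concrete, here is a small sketch that models lineage as a directed graph of column names and walks it upstream. It is a generic illustration, not OpenMetadata's data model or API. The `ColumnLineage` class and the column names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class ColumnLineage:
    """Hypothetical in-memory lineage graph keyed by fully qualified column name."""

    def __init__(self) -> None:
        # Maps each column to the set of columns it is directly derived from.
        self._upstream: Dict[str, Set[str]] = defaultdict(set)

    def add_edge(self, source: str, target: str) -> None:
        """Record that `target` is derived from `source`."""
        self._upstream[target].add(source)

    def upstream_of(self, column: str) -> Set[str]:
        """Return every column that feeds `column`, directly or transitively."""
        seen: Set[str] = set()
        stack = [column]
        while stack:
            current = stack.pop()
            for parent in self._upstream.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


lineage = ColumnLineage()
lineage.add_edge("raw.orders.amount", "staging.orders.amount_usd")
lineage.add_edge("raw.fx.rate", "staging.orders.amount_usd")
lineage.add_edge("staging.orders.amount_usd", "mart.revenue.total")
print(sorted(lineage.upstream_of("mart.revenue.total")))
```

Walking the graph in the other direction (from a source column to everything derived from it) gives the downstream impact of a change.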

Data Documentation: Document your data assets and metadata entities using rich text, images, and links. You can also add comments and annotations and generate data dictionaries and data catalogs.

Data Observability: Monitor the health and performance of your data assets and pipelines. You can view metrics such as data freshness, data volume, data quality, and data latency. You can also set up alerts and notifications for any anomalies or failures.

Data Security: Secure your data and metadata using various authentication and authorization mechanisms. You can integrate with different identity providers for single sign-on and define roles and policies for access control.

Webhooks: Integrate with external applications and services using webhooks. You can register URLs to receive metadata event notifications and integrate with Slack, Microsoft Teams, and Google Chat.
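
As an illustration of the receiving side of a webhook, the sketch below accepts a JSON POST and logs a one-line summary using only the Python standard library. The payload field names `eventType` and `entityType` are assumptions made for this example, as are the port and the function names. Check the OpenMetadata documentation for the actual event schema.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(body: bytes) -> str:
    """Parse a JSON event body and return a short summary (field names are assumed)."""
    event = json.loads(body)
    event_type = event.get("eventType", "unknown")
    entity_type = event.get("entityType", "unknown")
    return f"{event_type} on {entity_type}"


class WebhookHandler(BaseHTTPRequestHandler):
    """Accept POSTed events; reply 200 for valid JSON objects and 400 otherwise."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print(summarize_event(body))
            self.send_response(200)
        except (ValueError, AttributeError):
            # ValueError: body is not valid JSON; AttributeError: JSON is not an object.
            self.send_response(400)
        self.end_headers()


def build_server(port: int = 8000) -> HTTPServer:
    """Create a local server without starting it; the port is an arbitrary choice."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

Calling `build_server().serve_forever()` starts the receiver. The URL registered as the webhook endpoint would then need to point at this host and port.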

Connectors: Ingest metadata from various sources and tools using connectors. OpenMetadata supports 84+ connectors for data warehouses, databases, dashboard services, messaging services, pipeline services, and more.

Try our Sandbox

Take a look and play with sample data at http://sandbox.open-metadata.org

Install and Run OpenMetadata

Get up and running in a few minutes. See the OpenMetadata documentation for installation instructions.

Documentation and Support

We're here to help and make OpenMetadata even better! Check out OpenMetadata documentation for a complete description of OpenMetadata's features. Join our Slack Community to get in touch with us if you want to chat, need help, or discuss new feature requirements.

Contributors

We ❤️ all contributions, big and small! Check out our CONTRIBUTING guide to get started, and let us know how we can help.

Don't want to miss anything? Give the project a 🚀

A HUGE THANK YOU to all our supporters!


License

OpenMetadata is released under the Apache License, Version 2.0.