Elgato_dark/OpenMetadata: OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

mirror of https://github.com/open-metadata/OpenMetadata synced 2026-05-24 09:39:11 +00:00

Ram Narayan Balaji d2c64d8ac7 fix(workflows): make Flowable schema upgrades idempotent to survive partial migrations (#27234)	2026-04-16 15:46:21 +00:00

* fix(workflows): make Flowable schema upgrades idempotent to survive partial migrations

  Fixes #26048. When the server crashed mid-startup during a Flowable schema upgrade, the DB was left in a partially-migrated state. On restart, Flowable re-ran the same DDL and failed on already-existing objects (indexes, tables, columns), permanently wedging both the server and migrate --force.

  Changes:
  1. WorkflowHandler: webserver now uses DB_SCHEMA_UPDATE_FALSE; it validates the schema but never runs DDL. Only migrate CLI uses DB_SCHEMA_UPDATE_TRUE.
  2. OpenMetadataOperations: explicit WorkflowHandler.initialize(config, true) inside the migrate command so Flowable DDL always runs during migration.
  3. WorkflowHandler: catches FlowableWrongDbException on webserver startup and rethrows with an actionable message directing the operator to run migrate.
  4. IdempotentDdlDataSource + IdempotentDdlStatement: JDBC DataSource wrapper used exclusively in migration context. Intercepts execute(sql) for CREATE INDEX, CREATE TABLE, and ALTER TABLE ADD COLUMN and pre-checks existence via standard DatabaseMetaData (getIndexInfo, getTables, getColumns) before executing. If the object already exists it logs a skip and returns: no SQL state codes, no string matching, works on MySQL and PostgreSQL.

  Unit tests cover schema-update mode selection in both contexts.

* fix(workflows): address review comments on idempotent DDL wrapper
  - Extract shouldSkip() helper; apply idempotency checks to all execute() and executeUpdate() overloads, not just execute(String)
  - Tighten ALTER TABLE regex with negative lookahead to exclude SQL keywords (CONSTRAINT, PRIMARY, UNIQUE, FOREIGN, CHECK, INDEX, KEY) from being matched as column names
  - IdempotentDdlDataSource now wraps a DataSource delegate instead of calling DriverManager directly; uses migrationDataSource() helper in WorkflowHandler to resolve from existing DataSource or JDBC params
  - Fix InvocationTargetException wrapping in Connection proxy: unwrap cause so callers receive the original SQLException
  - Wrap all createStatement() variants in the proxy, not just the no-arg form
  - Contextual error message in WorkflowHandler: distinguish between server startup and migration context
  - Add IdempotentDdlStatementTest: 11 tests covering skip/execute for CREATE INDEX, CREATE UNIQUE INDEX, CREATE TABLE, ALTER TABLE ADD COLUMN, keyword-guarded ALTER TABLE, executeUpdate overload, and pass-through

* fix(workflows): include DB/library versions in FlowableWrongDbException message

* test(workflows): add IdempotentDdlDataSourceTest for proxy wrapping and exception surfacing

* test(workflows): assert exception identity in proxy exception-surfacing tests

* fix(workflows): catalog-aware identifier normalization in IdempotentDdlStatement

  On MySQL with lower_case_table_names=0 (default on Linux), table names are stored as-is and catalog=null metadata lookups can miss existing objects.
  - Use connection.getCatalog() for all getIndexInfo/getTables/getColumns calls
  - Normalize identifiers via DatabaseMetaData.storesLowerCaseIdentifiers() / storesUpperCaseIdentifiers() instead of unconditional toLowerCase()
  - stripIdentifierQuotes() handles backtick, double-quote and bracket quoting
  - extractObjectName() handles schema-qualified names (schema.table)
  - columnExists now iterates and normalizes COLUMN_NAME from ResultSet
  - Test: added MySQL uppercase storage case to IdempotentDdlStatementTest

* fix(workflows): null guard in shouldSkip, drop-create Flowable init, robust test indexing
  - shouldSkip() returns false immediately for null SQL, preserving JDBC contract (delegate handles null and throws the driver's own error)
  - drop-create command now calls WorkflowHandler.initialize(config, true) after native migrations so it produces a fully startable DB including Flowable tables
  - WorkflowHandlerSchemaUpdateTest: replace brittle get(1) with getLast() so the test is not sensitive to how many StandaloneProcessEngineConfiguration instances are constructed before initializeNewProcessEngine runs
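
The pre-check idea in the commit message above (look the object up in the database catalog before running the DDL, and skip it if it already exists) can be illustrated with a small sketch. This is not the project's Java/JDBC code: it is a Python analogue that assumes SQLite, using `sqlite_master` and `PRAGMA table_info` in place of `DatabaseMetaData`. The function names, the regular expressions, and the `act_demo` table are hypothetical and chosen only for illustration.

```python
import re
import sqlite3

# Hypothetical patterns for the three DDL forms named in the commit message.
CREATE_TABLE = re.compile(r"^\s*CREATE\s+TABLE\s+(\w+)", re.IGNORECASE)
CREATE_INDEX = re.compile(r"^\s*CREATE\s+(?:UNIQUE\s+)?INDEX\s+(\w+)", re.IGNORECASE)
# The negative lookahead keeps constraint keywords from being read as column names.
ADD_COLUMN = re.compile(
    r"^\s*ALTER\s+TABLE\s+(\w+)\s+ADD\s+(?:COLUMN\s+)?"
    r"(?!(?:CONSTRAINT|PRIMARY|UNIQUE|FOREIGN|CHECK|INDEX|KEY)\b)(\w+)",
    re.IGNORECASE,
)


def object_exists(conn, kind, name):
    """Look up a table or index by name in SQLite's catalog."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = ? AND lower(name) = lower(?)",
        (kind, name),
    ).fetchone()
    return row is not None


def column_exists(conn, table, column):
    """PRAGMA table_info yields one row per column; index 1 is the column name."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return any(row[1].lower() == column.lower() for row in rows)


def should_skip(conn, sql):
    """Return True when the DDL targets an object that already exists."""
    if sql is None:
        return False  # let the driver report the null statement itself
    match = CREATE_TABLE.match(sql)
    if match:
        return object_exists(conn, "table", match.group(1))
    match = CREATE_INDEX.match(sql)
    if match:
        return object_exists(conn, "index", match.group(1))
    match = ADD_COLUMN.match(sql)
    if match:
        return column_exists(conn, match.group(1), match.group(2))
    return False  # anything else passes through unchanged


def execute_idempotent(conn, sql):
    """Run the statement unless it would recreate an existing object.

    Returns True if the statement was executed, False if it was skipped.
    """
    if should_skip(conn, sql):
        return False
    conn.execute(sql)
    return True
```

With an in-memory database, calling `execute_idempotent(conn, "CREATE TABLE act_demo (id INTEGER)")` twice executes the statement the first time and skips it the second, which is the behaviour a re-run of a partially applied migration needs. The sketch does not handle quoted or schema-qualified identifiers, which the commit addresses separately.
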
.claude	feat(ingestion): add connector-audit skill for reliability audits (#26992)	2026-04-05 13:00:43 +02:00
.devcontainer	MINOR - DevContainer Setup for contribution (#26623)	2026-03-20 08:27:30 +01:00
.github	Chore(UI): consolidated UI checkstyle fix commands and modify workflow comment (#27402)	2026-04-16 17:18:22 +05:30
bin	Set Indexing related executor threads priority to LOW (#27153)	2026-04-15 11:28:47 -07:00
bootstrap	fix(migration): revert webhook authType back to secretKey in v1126 and remove broken v1125 migration (#27427)	2026-04-16 14:03:08 +00:00
common	ISSUE #20212 - TestCase DP Propagation + Search Index Propagation Refactor & Issue (#26901)	2026-04-03 17:32:53 +00:00
conf	fix(ci): validate yaml workflow failing (#27391)	2026-04-15 11:24:52 +00:00
docker	RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex (#26902)	2026-04-14 13:24:41 -07:00
docs	RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex (#26902)	2026-04-14 13:24:41 -07:00
examples/python-sdk/data-quality	Create documentation resources for Data Quality as Code (closes #23800) (#24169)	2025-11-11 10:25:42 +00:00
ingestion	Fixes #24636: use test_metadata.kwargs['model'] to identify primary table for dbt test entity links (#27366)	2026-04-16 17:08:30 +05:30
openmetadata-airflow-apis	Fixes #25345: Fix Airflow 3.x CSRF exempt no-op in all route files (#27056)	2026-04-06 16:56:04 +05:30
openmetadata-clients	Deprecate OpenMetadata Java client in favor of new Java SDK (#26388)	2026-03-10 21:30:39 -07:00
openmetadata-dist	Deprecate OpenMetadata Java client in favor of new Java SDK (#26388)	2026-03-10 21:30:39 -07:00
openmetadata-integration-tests	Fixes #26945: Refactor SearchIndex classes (#26947)	2026-04-14 10:19:06 -07:00
openmetadata-k8s-operator	Fix omjob pod/label naming length constraints (#27143)	2026-04-16 16:17:25 +02:00
openmetadata-mcp	RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex (#26902)	2026-04-14 13:24:41 -07:00
openmetadata-sdk	Fix column filtering on Lineage (#25353)	2026-04-06 09:01:15 -07:00
openmetadata-service	fix(workflows): make Flowable schema upgrades idempotent to survive partial migrations (#27234)	2026-04-16 15:46:21 +00:00
openmetadata-shaded-deps	Reduced version to 3.4 (#26017)	2026-02-20 19:28:21 +05:30
openmetadata-spec	Fix payload size issue (#27388)	2026-04-15 13:29:26 +02:00
openmetadata-ui	test:Added missing test for ontology (#27423)	2026-04-16 15:43:51 +00:00
openmetadata-ui-core-components	feat(ui): add font-size support to SelectItem via SelectContext (#27379)	2026-04-15 16:43:01 +05:30
perf-tests	Fix column filtering on Lineage (#25353)	2026-04-06 09:01:15 -07:00
scripts	Add Unit Tests coverage (#26360)	2026-03-23 16:17:15 +01:00
skills	feat(ingestion): add connector-audit skill for reliability audits (#26992)	2026-04-05 13:00:43 +02:00
.dockerignore	RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex (#26902)	2026-04-14 13:24:41 -07:00
.git-blame-ignore-revs	Minor: update git-blmae-ignore-revs, and uncomment ClassificationResourceTest tests code (#14431)	2023-12-18 19:16:29 -08:00
.gitignore	Add changeSummary API endpoint and UI components (#26533)	2026-04-09 23:05:52 -07:00
.nojekyll	shahsank3t published a site update	2021-08-04 06:23:29 +00:00
.pre-commit-config.yaml	ci: ui checkstyle workflow in favour to remove pre-commit (#26445)	2026-03-20 19:02:12 +05:30
.pylintrc	ISSUE #21101 - Implement BQ Partitioned Tests (#21348)	2025-05-22 17:22:05 +02:00
.snyk	Ignore _openmetadata_testutils from snyk (#21168)	2025-05-13 18:01:05 +05:30
APPLICATION.md	Rename app 'preview' property to 'enabled' (#26170)	2026-03-05 08:29:54 +01:00
CLAUDE.md	MINOR: Handle case sensitivity for table constraints (#27244)	2026-04-10 17:30:40 +02:00
CODE_OF_CONDUCT.md	Fix #412 - Add code of conduct for OpenMetadata community	2021-09-06 18:57:17 -07:00
CONTRIBUTING.md	addded more detail on issue creation in contributors page (#16583)	2024-06-09 14:02:36 -07:00
DEVELOPER.md	Add developer skills for OpenMetadata (#26836)	2026-03-31 16:15:27 -07:00
generate_ts.sh	Feature: Generate TS From JSON (#19823)	2025-02-25 18:18:02 +05:30
INCIDENT_RESPONSE.md	Add threat model and incident response (#23603)	2025-09-28 13:17:23 -07:00
LICENSE	OpenMetadata snapshot release 0.3	2021-08-01 14:27:44 -07:00
Makefile	Chore(UI): consolidated UI checkstyle fix commands and modify workflow comment (#27402)	2026-04-16 17:18:22 +05:30
NOTICE	OpenMetadata snapshot release 0.3	2021-08-01 14:27:44 -07:00
package.json	fix: Resolve frontend security vulnerabilities in lodash and lodash-es (#27105)	2026-04-07 07:55:25 +00:00
pom.xml	Fix(Security): Upgrade netty-bom to 4.1.132.Final to address CVE-2026-33870 and CVE-2026-33871 (#26938)	2026-04-01 16:13:14 +00:00
README.md	Update README.md for column-level consistency (#24670)	2025-12-03 07:59:18 -08:00
SECURITY.md	Update vulnerability reporting instructions in SECURITY.md (#25651)	2026-01-30 14:03:09 -08:00
THREAT_MODEL.md	Add threat model and incident response (#23603)	2025-09-28 13:17:23 -07:00
yarn.lock	fix: Resolve frontend security vulnerabilities in lodash and lodash-es (#27105)	2026-04-07 07:55:25 +00:00

README.md

Empower your Data Journey with OpenMetadata

What is OpenMetadata?

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column-level lineage, and seamless team collaboration. It is one of the fastest-growing open-source projects with a vibrant community and adoption by a diverse set of companies in a variety of industry verticals. Based on Open Metadata Standards and APIs, supporting connectors to a wide range of data services, OpenMetadata enables end-to-end metadata management, giving you the freedom to unlock the value of your data assets.

Contents:

Features
Try our Sandbox
Install & Run
Roadmap
Documentation and Support
Contributors

OpenMetadata Consists of Four Main Components:

Metadata Schemas: These are the core definitions and vocabulary for metadata based on common abstractions and types. They also allow for custom extensions and properties to suit different use cases and domains.
Metadata Store: This is the central repository for storing and managing the metadata graph, which connects data assets, users, and tool-generated metadata in a unified way.
Metadata APIs: These are the interfaces for producing and consuming metadata, built on top of the metadata schemas. They enable seamless integration of user interfaces and tools, systems, and services with the metadata store.
Ingestion Framework: This is a pluggable framework for ingesting metadata from various sources and tools into the metadata store. It supports more than 84 connectors for data warehouses, databases, dashboard services, messaging services, pipeline services, and more.

Key Features of OpenMetadata

Data Discovery: Find and explore all your data assets in a single place using various strategies, such as keyword search, data associations, and advanced queries. You can search across tables, topics, dashboards, pipelines, and services.

Data Collaboration: Communicate, converse, and cooperate with other users and teams on data assets. You can get event notifications, send alerts, add announcements, create tasks, and use conversation threads.

Data Quality and Profiler: Measure and monitor data quality with no-code tooling to build trust in your data. You can define and run data quality tests, group them into test suites, and view the results in an interactive dashboard. With powerful collaboration, make data quality a shared responsibility in your organization.

Data Governance: Enforce data policies and standards across your organization. You can define data domains and data products, assign owners and stakeholders, and classify data assets using tags and terms. Use powerful automation features to auto-classify your data.

Data Insights and KPIs: Use reports and platform analytics to understand how your organization's data is doing. Data Insights provides a single-pane view of the key metrics that best reflect the state of your data. Define Key Performance Indicators (KPIs) and set goals within OpenMetadata to work towards better documentation, ownership, and tiering. Alerts can be set against the KPIs and delivered on a specified schedule.

Data Lineage: Track and visualize the origin and transformation of your data assets end-to-end. You can view column-level lineage, filter queries, and edit lineage manually using a no-code editor.
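
To make the idea of column-level lineage concrete, the sketch below treats lineage as a graph whose edges point from a column to the columns it was derived from, and walks that graph to collect everything upstream. The column names, the `UPSTREAM` mapping, and the `upstream_columns` helper are hypothetical; this is not OpenMetadata's data model or API.

```python
from collections import deque

# Hypothetical column-level edges: each column maps to the columns it derives from.
UPSTREAM = {
    "mart.revenue.total": ["staging.orders.amount", "staging.orders.tax"],
    "staging.orders.amount": ["raw.orders.amount_cents"],
    "staging.orders.tax": ["raw.orders.tax_cents"],
}


def upstream_columns(column, edges):
    """Breadth-first walk that returns every column feeding into `column`."""
    seen = set()
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:  # guard against revisiting shared ancestors
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)
```

Calling `upstream_columns("mart.revenue.total", UPSTREAM)` returns the two staging columns and the two raw columns, while a source column such as `raw.orders.tax_cents` returns an empty list.
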

Data Documentation: Document your data assets and metadata entities using rich text, images, and links. You can also add comments and annotations and generate data dictionaries and data catalogs.

Data Observability: Monitor the health and performance of your data assets and pipelines. You can view metrics such as data freshness, data volume, data quality, and data latency. You can also set up alerts and notifications for any anomalies or failures.

Data Security: Secure your data and metadata using various authentication and authorization mechanisms. You can integrate with different identity providers for single sign-on and define roles and policies for access control.

Webhooks: Integrate with external applications and services using webhooks. You can register URLs to receive metadata event notifications and integrate with Slack, Microsoft Teams, and Google Chat.
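
A registered webhook URL needs something listening behind it. The sketch below is a minimal receiver built only on the Python standard library. The payload field names `eventType` and `entityType`, the `summarize_event` and `make_server` helpers, and port 8099 are assumptions made for illustration; check the actual event payload before relying on any field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(body: bytes) -> str:
    """Turn a JSON event payload into a one-line summary.

    The field names used here are assumed, not taken from the source.
    """
    event = json.loads(body)
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    event_type = event.get("eventType", "unknown")
    entity_type = event.get("entityType", "unknown")
    return f"{event_type} on {entity_type}"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print(summarize_event(body))
            self.send_response(200)
        except ValueError:  # covers malformed JSON and non-object payloads
            self.send_response(400)
        self.end_headers()


def make_server(port: int = 8099) -> HTTPServer:
    """Build a local server; call serve_forever() on the result to start it."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

Running `make_server().serve_forever()` starts the listener on localhost. A production receiver would also verify the sender, for example with a shared secret, which this sketch omits.
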

Connectors: Ingest metadata from various sources and tools using connectors. OpenMetadata supports more than 84 connectors for data warehouses, databases, dashboard services, messaging services, pipeline services, and more.

Try our Sandbox

Take a look and play with sample data at http://sandbox.open-metadata.org

Install and Run OpenMetadata

Get up and running in a few minutes. See the OpenMetadata documentation for installation instructions.

Documentation and Support

We're here to help and make OpenMetadata even better! Check out OpenMetadata documentation for a complete description of OpenMetadata's features. Join our Slack Community to get in touch with us if you want to chat, need help, or discuss new feature requirements.

Contributors

We ❤️ all contributions, big and small! Check out our CONTRIBUTING guide to get started, and let us know how we can help.

Don't want to miss anything? Give the project a ⭐ 🚀

A HUGE THANK YOU to all our supporters!

Stargazers

License

OpenMetadata is released under the Apache License, Version 2.0.