OpenMetadata/ingestion/tests/cli_e2e/test_cli_postgres.py

# Copyright 2022 Collate
# Licensed under the Collate Community License, Version 1.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/LICENSE
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Test Postgres connector with CLI
"""
from typing import List  # noqa: UP035
from .common.test_cli_db import CliCommonDB  # noqa: TID252
from .common_e2e_sqa_mixins import SQACommonMethods  # noqa: TID252


class PostgresCliTest(CliCommonDB.TestSuite, SQACommonMethods):
    create_table_query: str = """
        CREATE TABLE IF NOT EXISTS public.all_datatypes (
            column1 bigint,
            column2 bigserial,
            column5 boolean,
            column6 character(10),
            column7 character varying(10),
            column8 date,
            column9 double precision,
            column10 integer,
            column11 interval,
            column12 json,
            column13 jsonb,
            column14 numeric(10,2),
            column15 real,
            column16 smallint,
            column17 smallserial,
            column28 serial,
            column29 text,
            column20 time without time zone,
            column21 time with time zone,
            column22 timestamp without time zone,
            column23 timestamp with time zone,
            column24 uuid
        );
    """

    create_view_query: str = """
        CREATE OR REPLACE VIEW public.view_all_datatypes AS
            SELECT *
            FROM public.all_datatypes;
    """
    insert_data_queries: List[str] = [  # noqa: RUF012, UP006
        """
        INSERT INTO public.all_datatypes VALUES (
            1,
            2,
            true,
            'abcdefghij',
            'abcdefghij',
            '2022-08-08',
            1234.5678,
            1234567890,
            '1 day 2 hours 3 minutes 4 seconds'::interval,
            '{"a":1,"b":2}',
            '{"a":1,"b":2}',
            1234.56,
            1234.5678::real,
            32767::smallint,
            32767,
            2147483647,
            'abcdefghij',
            '12:34:56'::time without time zone ,
            '12:34:56+02'::time with time zone ,
            '2022-08-08 12:34:56'::timestamp without time zone ,
            '2022-08-08 12:34:56+02'::timestamp with time zone ,
            'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid
        )""",
    ]

    drop_table_query: str = """
        DROP TABLE IF EXISTS public.all_datatypes;
    """

    drop_view_query: str = """
        DROP VIEW IF EXISTS public.view_all_datatypes;
    """

    @staticmethod
    def get_connector_name() -> str:
        return "postgres"

    def create_table_and_view(self) -> None:
        SQACommonMethods.create_table_and_view(self)

    def delete_table_and_view(self) -> None:
        SQACommonMethods.delete_table_and_view(self)

    @staticmethod
    def expected_tables() -> int:
        return 2

    def expected_sample_size(self) -> int:
        return len(self.insert_data_queries)

    def view_column_lineage_count(self) -> int:
        return 22

    def expected_lineage_node(self) -> str:
        return "local_postgres.E2EDB.public.view_all_datatypes"

    @staticmethod
    def fqn_created_table() -> str:
        return "local_postgres.E2EDB.public.all_datatypes"

    @staticmethod
    def get_includes_schemas() -> List[str]:  # noqa: UP006
        return ["public"]

    @staticmethod
    def get_includes_tables() -> List[str]:  # noqa: UP006
        return [".*all_datatypes.*"]

    @staticmethod
    def get_excludes_tables() -> List[str]:  # noqa: UP006
        return [".*test_empty.*"]

    @staticmethod
    def expected_filtered_schema_includes() -> int:
        return 1

    @staticmethod
    def expected_filtered_schema_excludes() -> int:
        return 1

    @staticmethod
    def expected_filtered_table_includes() -> int:
        return 66

    @staticmethod
    def expected_filtered_table_excludes() -> int:
        return 0

    @staticmethod
    def expected_filtered_mix() -> int:
        return 2