mirror of
https://github.com/open-metadata/OpenMetadata
synced 2026-05-24 09:39:11 +00:00
* chore(ingestion): enable basedpyright across the codebase via baseline
Removes the ~25 paths from `[tool.basedpyright] ignore` (which excluded
roughly 90% of the codebase from type checking) and grandfathers the
existing violations into a baseline file. New violations in any
previously-ignored file now fail CI.
Changes:
- ingestion/pyproject.toml: drop the entire `ignore = [...]` block
- ingestion/setup.py: bump `basedpyright~=1.14` to `~=1.39.0`
- ingestion/.basedpyright/baseline.json (new, ~13MB): captures the
starting violation set (~18.8K errors + ~37.4K warnings) so the
migration is behavior-preserving. Regenerate with
`cd ingestion && basedpyright -p pyproject.toml --baselinefile
.basedpyright/baseline.json --writebaseline`. basedpyright analysis
has minor non-determinism (similar to ruff's), so re-running
--writebaseline a few times converges the baseline.
- ingestion/noxfile.py: pass `--baselinefile .basedpyright/baseline.json`
to the basedpyright invocation in the `static-checks` session so CI
honors the grandfathering. CI already runs the session via
`cd ingestion && nox --no-venv -s static-checks` (py-tests.yml).
- ingestion/Makefile: `make static-checks` now delegates to
`nox -s static-checks` so local invocations match CI exactly. Also
drops the dead Python 3.9 / OM_SKIP_SDK_PY39 branch (we require
Python >=3.10 since the previous modernization PR).
- .gitignore: add `.serena/` (local language-server cache)
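The command line described in the noxfile and regeneration bullets above can be pictured as a small argument builder. This is a sketch only, not the contents of ingestion/noxfile.py: the helper name `basedpyright_args` and its `write_baseline` parameter are hypothetical, and it uses only the flags quoted in this commit message (`-p`, `--baselinefile`, `--writebaseline`).

```python
from __future__ import annotations

# Path quoted in the commit message, relative to the ingestion/ directory.
BASELINE_FILE = ".basedpyright/baseline.json"


def basedpyright_args(write_baseline: bool = False) -> list[str]:
    """Build the basedpyright command line (hypothetical helper, for illustration)."""
    args = ["basedpyright", "-p", "pyproject.toml", "--baselinefile", BASELINE_FILE]
    if write_baseline:
        # Regenerates the baseline instead of only checking against it.
        args.append("--writebaseline")
    return args


if __name__ == "__main__":
    print(" ".join(basedpyright_args()))
    print(" ".join(basedpyright_args(write_baseline=True)))
```

Keeping the check and the regeneration on one shared argument list is one way to make sure both always point at the same baseline file.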
* chore(ingestion): add nox to the dev dependency set
The static-checks Makefile target and the py-tests CI job both delegate
to `nox -s static-checks`, but nox was being installed as a separate
side step (`pip install nox` in `install_dev_env`, `uv pip install nox`
in the test-environment composite action). Listing it in dev extras
means a plain `pip install ingestion[dev]` brings it in.
* chore(ingestion): pin basedpyright analysis to py3.10; CI runs once
Following the basedpyright + multi-Python-version research:
- ingestion/pyproject.toml: add `pythonVersion = "3.10"` to
[tool.basedpyright] so type-checking always analyzes for the lowest
supported Python version. Forward-incompatible code (tomllib usage,
PEP 695 generics, etc.) is caught at type-check time regardless of
which Python interpreter runs the checker.
- .github/workflows/py-tests.yml: gate the "Run Static Checks" step on
`matrix.py-version == '3.10'`. With pythonVersion pinned, results are
identical across the matrix; running once avoids redundant work and
keeps the baseline file deterministic. Unit tests still run on the
full 3.10/3.11/3.12 matrix to verify runtime compatibility.
- ingestion/.basedpyright/baseline.json: regenerated cleanly with the
new pythonVersion config (~18.8K errors / ~37.3K warnings, similar
scale to the previous baseline). Aligns with the canonical
type-check-on-floor / test-on-matrix pattern used by Pydantic, CPython,
and other major Python projects.
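To illustrate what analyzing for the lowest supported version catches: `tomllib` joined the standard library in Python 3.11, so an unguarded import is an error when checking for 3.10, whichever interpreter runs the checker. Below is a minimal sketch of the guarded form, which type checkers understand through the `sys.version_info` comparison. The function `parse_toml_if_available` is hypothetical and not part of the ingestion package.

```python
from __future__ import annotations

import sys
from typing import Any, Optional


def parse_toml_if_available(text: str) -> Optional[dict[str, Any]]:
    """Parse TOML with the stdlib on 3.11+, return None on 3.10 (hypothetical helper)."""
    if sys.version_info >= (3, 11):
        import tomllib  # standard library since Python 3.11

        return tomllib.loads(text)
    # Python 3.10 has no stdlib TOML parser. A real project would fall back to a
    # third-party package here; this sketch only signals "unavailable".
    return None


if __name__ == "__main__":
    print(parse_toml_if_available('name = "ingestion"'))
```

Syntax-level features such as PEP 695 generics cannot be guarded this way, which is one reason to run the checker against the version floor.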
* chore(ingestion): pin basedpyright pythonPlatform to Linux + regen baseline
CI's previous run still surfaced ~9 issues (2 errors + 7 warnings) that
weren't in the baseline. Root cause: my local environment differs from
CI's in three ways that affect type inference — Python interpreter
(3.11 vs 3.10), platform (Darwin vs Linux), and pip-resolved package
versions (couchbase, avro, trino, sqlalchemy stubs all differ slightly).
This commit closes the platform gap and regenerates the baseline from a
fresh CI-equivalent environment:
- ingestion/pyproject.toml: add `pythonPlatform = "Linux"` to
[tool.basedpyright] so type-checking uses the Linux subset of stdlib /
third-party stubs regardless of where the analyzer runs.
- ingestion/.basedpyright/baseline.json: regenerated against a fresh
Python 3.10 venv installed via `uv pip install ingestion[test]` (the
same install path CI's setup-openmetadata-test-environment composite
action uses). New scale: ~18.7K errors / ~37.5K warnings — same
ballpark as the previous baseline, with column positions now matching
CI's environment.
Local-developer note: when running `make static-checks` from a venv
that doesn't mirror CI exactly (e.g. macOS, Python 3.11, different
package versions), you may see drift errors. The supported workflow for
regenerating the baseline is to mirror CI:
    python3.10 -m venv /tmp/ci-mirror
    source /tmp/ci-mirror/bin/activate
    uv pip install --upgrade pip "setuptools<81"
    uv pip install --no-build-isolation "cx_Oracle>=8.3.0,<9"
    uv pip install -e "ingestion[test]"
    uv pip install "basedpyright~=1.39.0" nox
    cd ingestion && basedpyright -p pyproject.toml \
        --baselinefile .basedpyright/baseline.json --writebaseline
* chore(ingestion): drop pythonPlatform pin and regen baseline from CI-mirror
The previous attempt added `pythonPlatform = "Linux"` thinking it would
make the local-generated baseline match CI. It did the opposite — Linux
platform stubs activate additional conditional code paths that weren't
analyzed before, so CI saw 101 errors instead of the prior 2 errors.
Reverting:
- Drop `pythonPlatform = "Linux"` from [tool.basedpyright]. Without it,
basedpyright analyzes for the host platform; on CI's ubuntu-latest
runner that's Linux automatically, but type-stub coverage stays the
same as before (matching the d9196dff6b baseline).
- Regenerate ingestion/.basedpyright/baseline.json against a fresh
Python 3.10 venv installed via `uv pip install ingestion[test]`
(mirroring CI's setup-openmetadata-test-environment composite action).
~18.8K errors / 37.7K warnings captured — same scale as the working
d9196dff6b version.
Local-developer note: any baseline regeneration done on macOS will drift
from CI's Linux env (different transitive package versions, different
stubs). The supported local mirror procedure:
    python3.10 -m venv /tmp/ci-mirror
    source /tmp/ci-mirror/bin/activate
    uv pip install --upgrade pip "setuptools<81"
    uv pip install --no-build-isolation "cx_Oracle>=8.3.0,<9"
    uv pip install -e "ingestion[test]"
    uv pip install "basedpyright~=1.39.0" nox
    cd ingestion && basedpyright -p pyproject.toml \
        --baselinefile .basedpyright/baseline.json --writebaseline
* chore(ingestion): regen baseline from full CI install (mac arm64 mirror)
Prior CI-mirror only installed [test], skipping [all] and the four
--no-deps SA pins (sqlalchemy-redshift/databricks/ibmi, pydoris-custom).
That left ~75 connector packages out of the analysis env, so basedpyright
couldn't resolve types from databricks.sqlalchemy, GE 0.18 Batch,
sklearn BaseEstimator, airflow SQLAlchemy models, pandas/numpy stubs,
etc. CI saw 129 errors absent from the baseline.
Regenerated against a fresh py3.10 venv that mirrors
.github/actions/setup-openmetadata-test-environment exactly:
    uv pip install ./ingestion[dev]
    make generate
    uv pip install "setuptools<81"
    uv pip install --no-build-isolation "cx_Oracle>=8.3.0,<9"
    uv pip install --no-deps sqlalchemy-redshift==0.8.14 \
        sqlalchemy-databricks==0.2.0 \
        sqlalchemy-ibmi==0.9.3 \
        pydoris-custom==1.1.0
    uv pip install ./ingestion[all]
    uv pip install ./ingestion[test]
    uv pip install nox
First run: 128 errors, 272 warnings — within 1 error of CI's 129/272.
Wrote baseline with 56,100 entries across 1,035 files. A verification run with
the new baseline reports 0/0/0.
macOS arm64 vs Linux x86_64 wheel resolution may leave a small residual
(~3-7 errors per the d9196dff6b precedent). Re-run --writebaseline 2-3x
if any show up in CI.
* chore(ingestion): silence avro.py:95 basedpyright residual
CI's Linux fastavro stub returns Schema as `str | List[Any]`, while
the macOS arm64 wheel narrows to `str` — the only error not absorbed
by the regenerated baseline. Add a targeted pyright: ignore on the
parse_avro_schema call instead of broadening behavior.
* chore(ingestion): tolerate cross-platform pyright ignore drift
CI's `--baselinemode=lock` (default) requires the baseline to match
exactly — neither up nor down. Two related issues:
1. The avro.py `pyright: ignore` comment silenced not just the surfaced error but 10
cascading entries at line 95 (sub-errors propagating from the
unresolved `schema` arg type). Baseline went `down by 10` → lock
violated → exit 3 even with `0 errors` reported. Regenerate baseline
so the 10 stale entries are dropped.
2. The macOS arm64 fastavro stub doesn't surface that error in the
first place, so basedpyright flags the ignore comment with
`reportUnnecessaryTypeIgnoreComment` locally, causing the opposite
lock mismatch on CI (a warning entry that doesn't exist there).
Disable the rule so platform-specific residuals can land without
flapping between local and CI.
* chore(ingestion): use --baselinemode=discard for cross-platform tolerance
CI's implicit default is `lock`, which fails on any baseline change in
either direction (errors going up *or* down) via console.error → exit 3.
That cannot accommodate macOS arm64 vs Linux x86_64 stub drift: a
baseline regenerated locally always carries some entries that don't fire
on CI (and vice versa).
`auto` would tolerate the drift but silently overwrites the baseline
file — unacceptable in CI, where unreviewed changes never get committed
back.
`discard` is the right balance:
- New errors not in the baseline still fail the run (early-return path
in BaselineHandler.write before the lock/discard branch).
- Stale baseline entries (errors that no longer fire on the current
platform) print an info message and exit 0.
- The baseline file is never modified.
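The three modes described above can be summarized as a small decision function. This is a model of the behavior as this commit message describes it, not basedpyright's implementation: the names `Outcome` and `baseline_outcome` are hypothetical, and the exit code 1 for new errors is an assumption (the text only says the run fails). The exit code 3 for a lock violation is taken from the message.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    exit_code: int
    baseline_rewritten: bool


def baseline_outcome(mode: str, new_errors: int, stale_entries: int) -> Outcome:
    """Model the described lock/auto/discard behavior (hypothetical helper)."""
    if mode not in {"lock", "auto", "discard"}:
        raise ValueError(f"unknown baseline mode: {mode}")
    if new_errors > 0:
        # New violations fail the run before the mode-specific branch is reached.
        # Exit code 1 is an assumption; the source does not state it.
        return Outcome(exit_code=1, baseline_rewritten=False)
    if stale_entries == 0:
        # Baseline matches exactly: nothing to report in any mode.
        return Outcome(exit_code=0, baseline_rewritten=False)
    if mode == "lock":
        # Baseline shrank: lock treats that as a violation (exit 3 per the message).
        return Outcome(exit_code=3, baseline_rewritten=False)
    if mode == "auto":
        # Tolerates the drift but overwrites the baseline file.
        return Outcome(exit_code=0, baseline_rewritten=True)
    # discard: stale entries are reported as info only and the file is untouched.
    return Outcome(exit_code=0, baseline_rewritten=False)


if __name__ == "__main__":
    for mode in ("lock", "auto", "discard"):
        print(mode, baseline_outcome(mode, new_errors=0, stale_entries=10))
```

Read this way, `discard` is the only mode that both passes on stale entries and leaves the committed baseline untouched.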
313 lines
11 KiB
YAML
# Copyright 2021 Collate
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: py-tests
on:
  merge_group:
  workflow_dispatch:
  pull_request_target:
    types: [labeled, opened, synchronize, reopened, ready_for_review]

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

env:
  # matrix can't use 'env'. When updating it, update it for both jobs.
  MAIN_PYTHON_VERSION: "3.10"
  SONAR_OPTS: >-
    -Dsonar.pullrequest.key=${{ github.event.pull_request.number }}
    -Dsonar.pullrequest.branch=${{ github.event.pull_request.head.ref }}
    -Dsonar.pullrequest.github.repository=OpenMetadata
    -Dsonar.scm.revision=${{ github.event.pull_request.head.sha }}
    -Dsonar.pullrequest.provider=github

jobs:
  # Detect whether relevant paths changed. When no Python/service/schema files
  # are modified the downstream jobs are skipped via their `if` condition.
  # A job skipped by `if` reports as "Success", so required checks still pass.
  # This replaces the old py-tests-skip.yml companion workflow.
  changes:
    name: Detect Changes
    runs-on: ubuntu-latest
    if: ${{ !github.event.pull_request.draft }}
    outputs:
      python: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.filter.outputs.python }}
    steps:
      - uses: dorny/paths-filter@v3
        id: filter
        if: ${{ github.event_name != 'workflow_dispatch' }}
        with:
          filters: |
            python:
              - 'ingestion/**'
              - 'openmetadata-service/**'
              - 'openmetadata-spec/src/main/resources/json/schema/**'
              - 'pom.xml'
              - 'Makefile'

  py-unit-tests:
    name: Unit Tests & Static Checks
    needs: changes
    if: ${{ needs.changes.outputs.python == 'true' }}
    timeout-minutes: 60
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        py-version: ["3.10", "3.11", "3.12"]
    steps:

      - name: Wait for the labeler
        uses: lewagon/wait-on-check-action@v1.3.4
        if: ${{ github.event_name == 'pull_request_target' }}
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          check-name: Team Label
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          wait-interval: 30

      - name: Verify PR labels
        uses: jesusvasquez333/verify-pr-label-action@v1.4.0
        if: ${{ github.event_name == 'pull_request_target' }}
        with:
          github-token: "${{ secrets.GITHUB_TOKEN }}"
          valid-labels: "safe to test"
          pull-request-number: "${{ github.event.pull_request.number }}"
          disable-reviews: true

      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event_name == 'merge_group' && github.sha || github.event.pull_request.head.sha }}

      - name: Setup Openmetadata Test Environment
        uses: ./.github/actions/setup-openmetadata-test-environment
        with:
          python-version: ${{ matrix.py-version }}
          install-server: 'false'

      - name: Run Static Checks
        # basedpyright is configured with `pythonVersion = "3.10"` (the lowest
        # supported version) so type-checking results are identical across the
        # 3.10/3.11/3.12 matrix. Run on the lowest version only to avoid
        # redundant work and keep the baseline file deterministic.
        if: matrix.py-version == '3.10'
        run: |
          source env/bin/activate
          cd ingestion
          nox --no-venv -s static-checks
        shell: bash

      - name: Run Unit Tests
        run: |
          source env/bin/activate
          cd ingestion
          nox --no-venv -s unit-tests
        shell: bash

      - name: Upload coverage artifact
        if: ${{ matrix.py-version == env.MAIN_PYTHON_VERSION && !cancelled() }}
        uses: actions/upload-artifact@v4
        with:
          name: coverage-unit
          path: ingestion/.coverage
          include-hidden-files: true

  py-integration-tests:
    name: "Integration Tests (${{ matrix.shard.name }}, ${{ matrix.py-version }})"
    needs: changes
    if: ${{ needs.changes.outputs.python == 'true' }}
    timeout-minutes: 180
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        py-version: ["3.10", "3.11", "3.12"]
        shard:
          - name: "shard-1"
            nox-args: >-
              tests/integration/ometa
              tests/integration/postgres
              tests/integration/mysql
              tests/integration/profiler
              tests/integration/data_quality
          - name: "shard-2"
            nox-args: >-
              --ignore=tests/integration/ometa
              --ignore=tests/integration/postgres
              --ignore=tests/integration/mysql
              --ignore=tests/integration/profiler
              --ignore=tests/integration/data_quality
    steps:
      - name: Free Disk Space (Ubuntu)
        uses: jlumbroso/free-disk-space@main
        with:
          tool-cache: false
          android: true
          dotnet: true
          haskell: true
          large-packages: false
          swap-storage: true
          docker-images: false

      - name: Wait for the labeler
        uses: lewagon/wait-on-check-action@v1.3.4
        if: ${{ github.event_name == 'pull_request_target' }}
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          check-name: Team Label
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          wait-interval: 90

      - name: Verify PR labels
        uses: jesusvasquez333/verify-pr-label-action@v1.4.0
        if: ${{ github.event_name == 'pull_request_target' }}
        with:
          github-token: "${{ secrets.GITHUB_TOKEN }}"
          valid-labels: "safe to test"
          pull-request-number: "${{ github.event.pull_request.number }}"
          disable-reviews: true # To not auto approve changes

      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event_name == 'merge_group' && github.sha || github.event.pull_request.head.sha }}

      - name: Setup Openmetadata Test Environment
        uses: ./.github/actions/setup-openmetadata-test-environment
        with:
          python-version: ${{ matrix.py-version}}
          args: "-m no-ui"
          ingestion_dependency: "mysql,elasticsearch,sample-data"

      - name: Run Integration Tests
        run: |
          source env/bin/activate
          cd ingestion
          nox --no-venv -s integration-tests -- --standalone --durations=5 ${{ matrix.shard.nox-args }}
        env:
          TESTCONTAINERS_RYUK_DISABLED: true
        shell: bash

      - name: Upload coverage artifact
        if: ${{ matrix.py-version == env.MAIN_PYTHON_VERSION && !cancelled() }}
        uses: actions/upload-artifact@v4
        with:
          name: coverage-integration-${{ matrix.shard.name }}
          path: ingestion/.coverage
          include-hidden-files: true

      - name: Clean Up
        run: |
          cd ./docker/development
          docker compose down --remove-orphans
          sudo rm -rf ${PWD}/docker-volume

  # Single required-check gate for branch protection.
  # Skipped (= "Success") when all test jobs pass or are legitimately skipped.
  # Runs and exits 1 only when a test job fails or is cancelled.
  # Set "py-tests / py-tests-status" as the sole required check for this workflow.
  py-tests-status:
    name: py-tests-status
    needs: [changes, py-unit-tests, py-integration-tests]
    if: ${{ failure() || cancelled() }}
    runs-on: ubuntu-latest
    steps:
      - run: exit 1

  py-combine-coverage:
    needs: [changes, py-unit-tests, py-integration-tests]
    if: ${{ needs.changes.outputs.python == 'true' && !cancelled() }}
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event_name == 'merge_group' && github.sha || github.event.pull_request.head.sha }}
          fetch-depth: 0
          filter: blob:none

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.MAIN_PYTHON_VERSION }}

      - name: Install uv
        run: pip install uv
        shell: bash

      - name: Install coverage
        run: |
          python3 -m venv env
          source env/bin/activate
          uv pip install "coverage[toml]" nox
        shell: bash

      - name: Download coverage artifacts
        uses: actions/download-artifact@v4
        with:
          pattern: coverage-*
          path: ingestion/coverage-data/

      - name: Prepare coverage files
        run: |
          cd ingestion
          [ -f coverage-data/coverage-unit/.coverage ] && mv coverage-data/coverage-unit/.coverage .coverage.unit
          for dir in coverage-data/coverage-integration-*/; do
            shard=$(basename "$dir" | sed 's/coverage-integration-//')
            [ -f "$dir/.coverage" ] && mv "$dir/.coverage" ".coverage.integration-$shard"
          done
        shell: bash

      - name: Combine coverage
        run: |
          source env/bin/activate
          cd ingestion
          nox --no-venv -s combine-coverage
        shell: bash

      - name: Remove pom.xml
        run: rm pom.xml
        shell: bash

      # we have to pass these args values since we are working with the 'pull_request_target' trigger
      - name: Push Results in PR to Sonar
        id: push-to-sonar
        if: ${{ github.event_name == 'pull_request_target'}}
        continue-on-error: true
        uses: SonarSource/sonarqube-scan-action@v7
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.INGESTION_SONAR_SECRET }}
        with:
          projectBaseDir: ingestion/
          args: ${{ env.SONAR_OPTS }}

      # next two steps are for retrying "Push Results in PR to Sonar" step in case it fails
      - name: Wait to retry 'Push Results in PR to Sonar'
        if: ${{ github.event_name == 'pull_request_target' && steps.push-to-sonar.outcome != 'success' }}
        run: sleep 20s
        shell: bash

      - name: Retry 'Push Results in PR to Sonar'
        uses: SonarSource/sonarqube-scan-action@v7
        if: ${{ github.event_name == 'pull_request_target' && steps.push-to-sonar.outcome != 'success' }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.INGESTION_SONAR_SECRET }}
        with:
          projectBaseDir: ingestion/
          args: ${{ env.SONAR_OPTS }}