OpenMetadata/docker/development
Pere Miquel Brull 852b42cc5d Fix k8s operator exit handler pod loop and TTL cleanup, add tolerations (#26971)
* Fix k8s operator exit handler pod loop and TTL cleanup, add tolerations support (#26772)

Fix two bugs in the OMJob operator:
- Exit handler pods were recreated indefinitely because findExitHandlerPod()
  lacked the name-based fallback that findMainPod() already had, so label
  propagation delays triggered repeated pod creation events
- Terminal phase handler never rescheduled for TTL-based cleanup, so pods
  were never cleaned up after ttlSecondsAfterFinished expired
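
The two fixes above can be illustrated with a minimal, self-contained sketch. It is not the operator's actual code: the Pod record, the findPod() and untilTtlExpiry() helpers, and the label key and value used in the usage note are hypothetical stand-ins. It only shows the general shape of a label lookup with a name-based fallback, and of computing how long to wait before TTL cleanup is due.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class OMJobSketch {
  /** Hypothetical stand-in for a Kubernetes pod: only a name and its labels. */
  public record Pod(String name, Map<String, String> labels) {}

  /**
   * Look the pod up by label first. If labels have not propagated yet,
   * fall back to the expected pod name so the caller does not conclude
   * the pod is missing and create it again.
   */
  public static Optional<Pod> findPod(
      List<Pod> pods, String labelKey, String labelValue, String expectedName) {
    Optional<Pod> byLabel =
        pods.stream().filter(p -> labelValue.equals(p.labels().get(labelKey))).findFirst();
    if (byLabel.isPresent()) {
      return byLabel;
    }
    return pods.stream().filter(p -> expectedName.equals(p.name())).findFirst();
  }

  /**
   * Time remaining until the TTL expires. A terminal-phase handler would
   * reschedule itself after this delay; zero means cleanup is due now.
   */
  public static Duration untilTtlExpiry(Instant finishedAt, long ttlSeconds, Instant now) {
    Duration remaining = Duration.between(now, finishedAt.plusSeconds(ttlSeconds));
    return remaining.isNegative() ? Duration.ZERO : remaining;
  }
}
```

With a pod named "job-a-exit" that carries no labels yet, findPod(pods, "pod-type", "exit-handler", "job-a-exit") still returns it through the name fallback, which is the case that previously led to repeated creation.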

Add tolerations support for ingestion pod scheduling across the full stack:
- Operator: OMJobPodSpec field, PodManager.buildPod(), CRD schema
- Server: OMJob model, K8sPipelineClientConfig parsing, K8sPipelineClient
  builder, K8sJobUtils serialization

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add K8S_TOLERATIONS env var mapping in openmetadata.yaml

Adds the tolerations config binding so the server picks up the
K8S_TOLERATIONS env var set by the Helm chart secret.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add tolerations to k8s test values for local validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix cleanup

* Address PR review: remove redundant pod lookup and guard null items

- Remove redundant server-created pod selector fallback in findMainPod()
  since buildPodSelector() now matches all pods by omjob-name alone
- Add null guard for getItems() in deletePods() to prevent NPE
- Update local test values for namespace and image config
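
As a rough illustration of the getItems() null guard mentioned above, assuming a list response whose items may be null: PodList and this deletePods() are hypothetical stand-ins, not the operator's real types or signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class DeletePodsSketch {
  /** Hypothetical stand-in for a client list response; items may be null. */
  public record PodList(List<String> items) {
    public List<String> getItems() {
      return items;
    }
  }

  /** Returns the names that would be deleted; a null list or null items yields none. */
  public static List<String> deletePods(PodList list) {
    List<String> deleted = new ArrayList<>();
    if (list == null || list.getItems() == null) {
      return deleted; // nothing to delete, and no NullPointerException
    }
    for (String name : list.getItems()) {
      deleted.add(name); // a real implementation would issue the delete call here
    }
    return deleted;
  }
}
```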

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit cfd71e8bd3)
2026-04-14 07:44:17 +00:00
distributed-test                    Improve indexing (#26154)  2026-03-03 16:41:16 +05:30
helm                                Fix k8s operator exit handler pod loop and TTL cleanup, add tolerations (#26971)  2026-04-14 07:44:17 +00:00
mock-oidc-provider                  MSAL Token Renewal Fix — Safari Session Loss (#27214)  2026-04-10 10:13:39 +05:30
.env.sso-test                       MSAL Token Renewal Fix — Safari Session Loss (#27214)  2026-04-10 10:13:39 +05:30
docker-compose-fuseki.yml           MINOR - Prepare extra validations for system repository health (#24846)  2025-12-18 07:37:37 +01:00
docker-compose-gcp.yml              [Search] Upgrade Clients (#25719)  2026-02-07 18:54:13 +05:30
docker-compose-postgres-fuseki.yml  [Search] Upgrade Clients (#25719)  2026-02-07 18:54:13 +05:30
docker-compose-postgres.yml         [Search] Upgrade Clients (#25719)  2026-02-07 18:54:13 +05:30
docker-compose.yml                  MSAL Token Renewal Fix — Safari Session Loss (#27214)  2026-04-10 10:13:39 +05:30
Dockerfile                          Fix #21506: Upgrade to Java 21 (#21507)  2025-06-11 22:06:08 -07:00