GitHub Actions

Fleet uses GitHub Actions for continuous integration (CI). This document describes best practices and patterns for writing and maintaining Fleet's GitHub Actions workflows.

Bash

By default, GitHub Actions sets the shell to bash -e for Linux and macOS runners. To help write safer bash scripts in run steps and avoid common issues, override the default by adding the following to the workflow file:

defaults:
  run:
    # fail-fast using bash -eo pipefail. See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#exit-codes-and-error-action-preference
    shell: bash

Explicitly setting the default shell to bash adds extra flags: the script runs with bash -eo pipefail instead of bash -e. The pipefail option changes the behavior of the pipe operator | so that if any command in a pipeline fails, that command's return code is used as the return code for the whole pipeline (if several commands fail, the rightmost failure wins). Consider the following example in test-go.yaml:

    - name: Run Go Tests
      run: |
        # omitted ...
          make test-go 2>&1 | tee /tmp/gotest.log

If the pipefail option were not set, this step would always succeed, because the pipeline's return code would be that of its last command, tee, which returns success regardless of what it reads. This is not the intended behavior. Instead, we want the step to fail if make test-go fails.
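The difference can be reproduced locally with a minimal sketch like the one below. The failing_tests function is a hypothetical stand-in for a failing make test-go, and the exit code 2 is an arbitrary choice for illustration; neither comes from Fleet's workflows.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a failing `make test-go` (exit code 2 is arbitrary).
failing_tests() {
  echo "FAIL: some test"
  return 2
}

# Without pipefail, the pipeline's status is that of its last command (tee).
set +o pipefail
without=0
failing_tests | tee /dev/null > /dev/null || without=$?

# With pipefail, the pipeline's status is that of the rightmost failing command.
set -o pipefail
with=0
failing_tests | tee /dev/null > /dev/null || with=$?

echo "without pipefail: $without"
echo "with pipefail: $with"
```

In the first pipeline the failure is masked by tee; in the second it is propagated, which is what lets bash -e stop the step.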

Concurrency

GitHub Actions runners are a limited resource. If many workflows are queued, they remain pending until a runner becomes available. This has caused issues in the past, where workflows took an excessively long time to start. To help with this, use the following in workflows:

# This allows a subsequently queued workflow run to interrupt previous runs
concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

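When a workflow is triggered by a pull request, it cancels any previous in-progress run of the same workflow for that pull request. This works because github.head_ref is only set for pull request events, so every run for the same source branch lands in the same concurrency group. This is especially useful when changes are pushed to a pull request frequently. Manually triggered workflows, workflows that run on a schedule, and workflows triggered by pushes to main are unaffected: for them github.head_ref is empty, so the group falls back to github.run_id, which is unique to each run.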
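To make the grouping concrete, here is a rough bash sketch of how the group name resolves. It only imitates the expression with parameter expansion and is not how GitHub evaluates it; the concurrency_group function, the workflow name, the branch name, and the run IDs are all made-up values for illustration.

```shell
#!/usr/bin/env bash
# Imitates "${{ github.workflow }}-${{ github.head_ref || github.run_id }}".
# ${head_ref:-$run_id} falls back to run_id when head_ref is empty or unset,
# which mirrors how || treats an empty string in this expression.
concurrency_group() {
  local workflow=$1 head_ref=$2 run_id=$3
  printf '%s-%s\n' "$workflow" "${head_ref:-$run_id}"
}

# Two pull request runs from the same (hypothetical) branch share a group.
pr_first=$(concurrency_group "Go tests" "my-feature" 101)
pr_second=$(concurrency_group "Go tests" "my-feature" 102)

# Two push or scheduled runs have no head_ref, so each gets its own group.
push_first=$(concurrency_group "Go tests" "" 103)
push_second=$(concurrency_group "Go tests" "" 104)

echo "$pr_first / $pr_second"
echo "$push_first / $push_second"
```

Runs that share a group name are the ones cancel-in-progress applies to, so only the two pull request runs would interrupt each other.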