Fixes part of #29720.. Includes promoting some bolded to subheadings (in
wireframe docs).
---------
Co-authored-by: Sam Pfluger <108141731+Sampfluger88@users.noreply.github.com>
for #25267
This was missed in the TF config for the cron-monitoring Lambda updates,
leading to a failure when trying to publish to the `#help-p2` topic.
for #19930
# Checklist for submitter
- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
- [X] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)
- [X] Added/updated tests
- [X] If database migrations are included, checked table schema to
confirm autoupdate
- [X] Manual QA for all new/changed functionality
# Details
This PR adds a new feature to the existing monitoring add-on. The add-on
will now send an SNS alert whenever a scheduled job like
"vulnerabilities" or "apple_mdm_apns_pusher" exits early due to errors.
The alert contains the job type and the set of errors (there can be
multiple, since jobs can have multiple sub-jobs). By default the SNS
topic for this new alert is the same as the one for the existing cron
system alerts, but it can be configured to use a separate topic (e.g.
dogfood instance will post to a separate slack channel).
The actual changes are:
**On the server side:**
- Add errors field to cron_stats table (json DEFAULT NULL)
- Added errors var to `Schedule` struct to collect errors from jobs
- In `RunAllJobs`, collect err from job into new errors var
- Update `Schedule.updateStats`and `CronStats.UpdateCronStats`to accept
errors argument
- If provided, update errors field of cron_stats table
**On the monitor side:**
- Add new SQL query to look for all completed schedules since last run
with non-null errors
- send SNS with job ID, name, errors
# Testing
New automated testing was added for the functional code that gathers and
stores errors from cron runs in the database. To test the actual Lambda,
I added a row in my `cron_stats` table with errors, then compiled and
ran the Lambda executable locally, pointing it to my local mysql and
localstack instances:
```
2024/12/03 14:43:54 main.go:258: Lambda execution environment not found. Falling back to local execution.
2024/12/03 14:43:54 main.go:133: Connected to database!
2024/12/03 14:43:54 main.go:161: Row vulnerabilities last updated at 2024-11-27 03:30:03 +0000 UTC
2024/12/03 14:43:54 main.go:163: *** 1h hasn't updated in more than vulnerabilities, alerting! (status completed)
2024/12/03 14:43:54 main.go:70: Sending SNS Message
2024/12/03 14:43:54 main.go:74: Sending 'Environment: dev
Message: Fleet cron 'vulnerabilities' hasn't updated in more than 1h. Last status was 'completed' at 2024-11-27 03:30:03 +0000 UTC.' to 'arn:aws:sns:us-east-1:000000000000:topic1'
2024/12/03 14:43:54 main.go:82: {
MessageId: "260864ff-4cc9-4951-acea-cef883b2de5f"
}
2024/12/03 14:43:54 main.go:198: *** mdm_apple_profile_manager job had errors, alerting! (errors {"something": "wrong"})
2024/12/03 14:43:54 main.go:70: Sending SNS Message
2024/12/03 14:43:54 main.go:74: Sending 'Environment: dev
Message: Fleet cron 'mdm_apple_profile_manager' (last updated 2024-12-03 20:34:14 +0000 UTC) raised errors during its run:
{"something": "wrong"}.' to 'arn:aws:sns:us-east-1:000000000000:topic1'
2024/12/03 14:43:54 main.go:82: {
MessageId: "5cd085ef-89f6-42c1-8470-d80a22b295f8"
Kept the Debian-based image at Bullseye.
For #24517.
# Checklist for submitter
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality: Validated builds
(Docker and make) all work after this change, except for BitLocker
(requires Windows).
Add an example module to the readme
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [ ] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [ ] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)
- [ ] Added support on fleet's osquery simulator `cmd/osquery-perf` for
new osquery data ingestion features.
- [ ] Added/updated tests
- [ ] If paths of existing endpoints are modified without backwards
compatibility, checked the frontend/CLI for any necessary changes
- [ ] If database migrations are included, checked table schema to
confirm autoupdate
- For database migrations:
- [ ] Checked schema for all modified table for columns that will
auto-update timestamps during migration.
- [ ] Confirmed that updating the timestamps is acceptable, and will not
cause unwanted side effects.
- [ ] Ensured the correct collation is explicitly set for character
columns (`COLLATE utf8mb4_unicode_ci`).
- [ ] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [ ] Orbit runs on macOS, Linux and Windows. Check if the orbit
feature/bugfix should only apply to one platform (`runtime.GOOS`).
- [ ] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [ ] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
for #21440
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
for #21198
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
for #19844
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality