This module provides a basic Fleet setup and assumes that you bring nothing to the installation. If you want to bring your own VPC, database, cache nodes, or ECS cluster, use one of the submodules provided instead.
To quickly list all available module versions you can run:
git tag | grep '^tf'
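Once you have picked a tag, a root module call could look roughly like the sketch below. The module label `fleet`, the `source` path, the `<tag>` placeholder, and the certificate ARN are all assumptions for illustration. The only thing taken from this README is that `certificate_arn` is the single input marked as required.

```hcl
# Hypothetical root configuration; label, source path, tag, and ARN are placeholders.
module "fleet" {
  # Assumed source layout. Replace <tag> with one of the tags listed by the command above.
  source = "github.com/fleetdm/fleet//terraform?ref=<tag>"

  # certificate_arn is the only input marked "yes" under Required in the Inputs table.
  certificate_arn = "arn:aws:acm:us-east-1:000000000000:certificate/example"
}
```

Pinning `ref` to a tag rather than a branch keeps `terraform init` from picking up module changes you have not reviewed.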
The following is the module layout, so you can navigate to the module that you want:
- Root module (use this to get a Fleet instance ASAP with minimal setup)
  - BYO-VPC (use this if you want to install Fleet inside an existing VPC)
    - BYO-database (use this if you want to use an existing database and cache node)
      - BYO-ECS (use this if you want to bring your own everything but Fleet ECS services)
Migrating from existing Dogfood code
The following moved blocks show how to migrate from the existing Dogfood code:
moved {
  from = module.vpc
  to   = module.main.module.vpc
}
moved {
  from = module.aurora_mysql
  to   = module.main.module.byo-vpc.module.rds
}
moved {
  from = aws_elasticache_replication_group.default
  to   = module.main.module.byo-vpc.module.redis.aws_elasticache_replication_group.default
}
These blocks cover only the resources that are "heavy" or store data. Note that the ALB cannot be moved like this because Dogfood uses the aws_alb resource and the module uses the aws_lb resource. The two resources are aliases of each other, but Terraform can't recognize that.
How to improve this module
If this module doesn't fit your needs, contact us by opening a ticket or by reaching out to your contact at Fleet. Our goal is for this module to cover all needs within AWS, so we will try to find a solution that works for you.
If you want to make the changes yourself, open a PR against main with your additions. Please define variables as null when no default makes sense, and make sure that variable changes are reflected all the way up the stack.
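As a rough illustration of that convention, the sketch below declares an input whose fields default to null when no sensible default exists, then passes it down from a parent module. The names `example_config`, `bucket_name`, `retention_days`, and the `./child` path are hypothetical and do not exist in this module. The `optional(type, default)` syntax needs Terraform 1.3 or later, which the version requirement listed under Requirements satisfies.

```hcl
# Hypothetical input; names and fields are illustrative only.
variable "example_config" {
  type = object({
    # No sensible default exists, so the field defaults to null.
    bucket_name = optional(string, null)
    # A sensible default exists, so it is declared directly.
    retention_days = optional(number, 30)
  })
  default  = {}
  nullable = false
}

# A parent module declares the same variable and forwards it unchanged,
# so the new field is reachable from every level of the stack.
module "child" {
  source         = "./child" # hypothetical submodule path
  example_config = var.example_config
}
```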
How to update this readme
Edit .header.md and run terraform-docs markdown . > README.md
Requirements
| Name | Version |
|---|---|
| terraform | >= 1.3.8 |
Providers
No providers.
Modules
| Name | Source | Version |
|---|---|---|
| byo-vpc | ./byo-vpc | n/a |
| vpc | terraform-aws-modules/vpc/aws | 5.1.2 |
Resources
No resources.
Inputs
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| alb_config | n/a | object({ … }) | {} | no |
| certificate_arn | n/a | string | n/a | yes |
| ecs_cluster | The config for the terraform-aws-modules/ecs/aws module | object({ … }) | { … } | no |
| fleet_config | The configuration object for Fleet itself. Fields that default to null will have their respective resources created if not specified. | object({ … }) | { … } | no |
| migration_config | The configuration object for Fleet's migration task. | object({ … }) | { … } | no |
| rds_config | The config for the terraform-aws-modules/rds-aurora/aws module | object({ … }) | { … } | no |
| redis_config | n/a | object({ … }) | { … } | no |
| vpc | n/a | object({ … }) | { … } | no |
Outputs
| Name | Description |
|---|---|
| byo-vpc | n/a |
| vpc | n/a |