<!-- Add the related story/sub-task/bug number, like Resolves#123, or
remove if NA -->
**Related issue:** Resolves#32331
This PR allows us to run loadtest with SigNoz OTEL backend by adding
`-var=enable_otel=true`
SigNoz is deployed via Helm chart.
Enhancements needed (in future PR):
- put SigNoz UI behind VPN
- combine the new eks-vpc with shared fleet-vpc
- make SigNoz shared, so multiple loadtests use the same instance? (But
what about updating to it to latest version?)
Next steps:
- Enable SigNoz in Dogfood environment
- SigNoz by default [keeps 15 days of logs and
traces](https://signoz.io/docs/userguide/retention-period), which is
quite a bit. How much would that cost us and should we reduce it?
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- New Features
- Optional OpenTelemetry tracing with SigNoz via a new enable_otel flag.
- Conditional deployment of a SigNoz stack (managed EKS, storage,
Helm-based apps) with internal OTLP collector endpoint.
- New outputs to retrieve OTLP endpoint, cluster name, and a kubectl
configuration command.
- Documentation
- Added guidance for deploying and using SigNoz with load testing.
- Updated examples to include -var=enable_otel=true.
- Chores
- Introduced required providers to support Helm and Kubernetes
resources.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
# Github Actions (New)
- New workflow to deploy/destroy loadtest infrastructure with one-click
(Needs to be tested)
- Common inputs drive configuration and deployment of loadtest
infrastructure
- tag
- fleet_task_count
- fleet_task_memory
- fleet_task_cpu
- fleet_database_instance_size
- fleet_database_instance_count
- fleet_redis_instance_size
- fleet_redis_instance_count
- terraform_workspace
- terraform_action
- New workflow to deploy/destroy osquery-perf to loadtest infrastructure
with one-click (Needs to be tested)
- Common inputs drive configuration and deployment of osquery-perf
resources
- tag
- git_branch
- loadtest_containers
- extra_flags
- terraform_workspace
- terraform_action
- New workflow to deploy shared loadtest resources with one-click (Needs
to be tested)
# Loadtest Infrastructure (New)
- New directory (`infrastructure/loadtesting/terraform/infra`) for
one-click deployment
- Loadtest environment updated to use [fleet-terraform
modules](https://github.com/fleetdm/fleet-terraform)
- [Deployment documentation
updated](0c254bca40/infrastructure/loadtesting/terraform/infra/README.md)
to reflect new steps
# Osquery-perf deployment (New)
- New directory (`infrastructure/loadtesting/terraform/osquery-perf`)
for the deployment of osquery-perf
- osquery-perf updated to use [fleet-terraform
modules](https://github.com/fleetdm/fleet-terraform)
- [Deployment documentation
updated](0c254bca40/infrastructure/loadtesting/terraform/osquery_perf)
to reflect new steps
Bumping memory and cpu on aws load test containers Creating multiple ecs
services with a single task. This allows us to specify different
settings per osquery perf container/task.
**Related issue:** No issue.
Ran
```
make update-go version=1.24.6
```
And then updated the `sha256`s manually in the Dockerfiles.
Fixes https://nvd.nist.gov/vuln/detail/CVE-2025-47907
```
Cancelling a query (e.g. by cancelling the context passed to one of the query methods) during a call
to the Scan method of the returned Rows can result in unexpected results if other queries are being
made in parallel. This can result in a race condition that may overwrite the expected results with those
of another query, causing the call to Scan to return either unexpected results from the other
query or an error.
```
# Added
- Added kms.tf to support encrypting keys, specifically cloudfront keys.
- Added template/cloudfront.tf.disabled for use in enabling cloudfront.-
Modified ecs-iam.tf to support log-alb.tf, cloudfront.tf policies that
are injected into `local.extra_execution_iam_policies` and `local.iam`.
- Added log-alb.tf to enable logging alb, required by cloudfront.tf.
# Changed
- Modified ecs.tf to support adding of additional secrets from
`local.secrets`.
- Modified firehose.tf to support provider required updates for
deprecated resource configurations.
- Modified init.tf to support `> v5.0` of `hashicorp/aws` provider.
- Modified locals.tf to add `extra_execution_iam_policies`, `iam`,
`software_installers_kms_policy`, `extra_secrets`, secrets, and
`cloudfront_key_basename`, to support cloudfront.
- Modified readme.md with instructions on how to enable cloudfront.tf
- Modified redis.tf to support provider required updates for deprecated
resource configurations
- Modified s3.tf to support kms keys and add kms iam.
- Modified terraform version in .github/workflows/tfvalidate.yml - 1.9.0
-> 1.10.4
## #30730
- Update Go version
- Update the docs for this process
- Confirmed `fleet`, `fleetctl`, and related docker images build
successfully
- Note that failing tests are unrelated: see [Slack
thread](https://fleetdm.slack.com/archives/C019WG4GH0A/p1752175318523689)
---------
Co-authored-by: Jacob Shandling <jacob@fleetdm.com>
- Adding `FLEET_MICROSOFT_COMPLIANCE_PARTNER_PROXY_API_KEY` to dogfood
- Adding creation of secret and secret version for
`FLEET_MICROSOFT_COMPLIANCE_PARTNER_PROXY_API_KEY` value
- Disabling firehose log destination
- Re-enabling webhook log destination
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Enabled webhook logging by activating environment variables for
webhook URLs.
* Webhook log plugin is now conditionally set based on the presence of a
webhook URL.
* **Chores**
* Updated environment variable management by removing firehose-logging
addon variables from the configuration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
- Disabled webhook variables
- Re-enabled firehose variables
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Disabled certain environment variables related to webhook logging.
* Updated environment variable configuration to include additional
logging settings.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Update dogfood deployment to utilize webhooks for the osquery results
logging destination configuration
@BCTBB already added a tines.io webhook URL to the repo secrets
`DOGFOOD_WEBHOOK_URL` where the value was provided by @harrisonravazzolo
Co-authored-by: Harrison Ravazzolo <38767391+harrisonravazzolo@users.noreply.github.com>
Fixed restore policies to be tied to the correct policy names
- restore -> restore vs restore -> backup
- backup -> backup vs backup -> restore
Fixing the typos. Permissions remain unchanged.
- Will create AWS Backup
- Source Vault and KMS key
- Destination Vault and KMS key
- Backup Plan
- Backup Selection
- Required permissions
- Set permissions required for AWS backup on GHA role (pre-added
manually)
- Set `Tag:backup=true` on Dogfood Aurora clusters via `rds_config`