- Configures the internal ALB to log to the same bucket as the public ALB
- Adds support for osquery-perf task size (cpu/memory) configuration
- Updates defaults for osquery-perf extra_flags
- Updates the default `enroll.sh` loop sleep_time from 60s to 300s
- Resolves an issue that prevents some locally pulled docker images from
being pushed to ECR.
**Related issue:** Resolves#41749
# Checklist for submitter
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
- Adds `require_secure_transport` for MySQL connections to the db_cluster
parameter group for dogfood and loadtest environments.
```
db_cluster_parameters = {
require_secure_transport = "ON"
}
```
**Related issue:** Resolves#34677 and #35932
Adds ~450K software entries to the loadtest, including scripts to add more
software in the future.
Software is held in a `software.sql` file, which is used to create a
sqlite DB during osquery perf run/deployment.
# Checklist for submitter
## Testing
- [x] QA'd all new/changed functionality manually
## Summary by CodeRabbit
* **New Features**
* Added support for loading software data from an external SQLite
database via a new `--software_db_path` command-line flag for more
realistic simulation scenarios.
* Added import and SQL generation tools to build and manage custom
software libraries.
* **Documentation**
* Added comprehensive README with setup instructions, tool usage, and
end-to-end workflow guidance for the software library.
**Related issue:** Resolves#34677
Changes the OTEL setup to group exceptions by type, which is an
OTEL/industry best practice.
- Removes timestamp from osquery_perf image
- Adds `default: 0` to the `loadtest_containers_starting_index` variable
in the loadtest osquery_perf workflow
- Adds `variable: sleep_time` to loadtest osquery_perf workflow
- Adds osquery_perf docker repository in ECR
- Adds support for `sleep_time` to `enroll.sh`
- Updates terraform variables to enforce `git_branch` or `git_tag` for
osquery_perf
- Removes `firehose` logging from loadtesting environment
- Sets `filesystem` logging in loadtesting environment
- Updates fleet image to 4.76.0 as the default value
- Updates `migrations` and `logging_alb` modules with latest versions
- Removes `(Coming Soon)` from
`infrastructure/loadtesting/infra/README.md` regarding deployment
via GitHub Actions
- Moves Signoz steps to `.header.md` to preserve steps in generated
`README.md`
- Adds support for `enroll.sh` to deploy osquery_perf in batches
- Merges variables `tag` and `git_branch` into `git_tag_branch`. Only
one tag or git_branch should be specified.
- Still used for osquery_perf to check out the correct tag/branch.
- Removes fleet_image requirement for cutting osquery_perf images
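The batching behind `enroll.sh` and `sleep_time` can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the actual script: `plan_batches`, `deploy_in_batches`, and their arguments are hypothetical names, and the `echo` stands in for whatever deployment step the real script runs. Calling `deploy_in_batches 0 5 2 300` would walk through three batches with a 300s pause after each one.

```shell
# Hypothetical sketch; function names and arguments are not taken from enroll.sh.

# Print one "first last" pair per batch, covering container indexes
# start_index .. start_index + total - 1 in steps of batch_size.
plan_batches() {
  local start_index=$1
  local total=$2
  local batch_size=$3
  local last=$(( start_index + total - 1 ))
  local first=$start_index
  local end
  while [ "$first" -le "$last" ]; do
    end=$(( first + batch_size - 1 ))
    # Clamp the final batch so it does not run past the last index.
    if [ "$end" -gt "$last" ]; then end=$last; fi
    echo "$first $end"
    first=$(( end + 1 ))
  done
}

# Walk the batches in order, pausing sleep_time seconds after each one
# (including the last) so earlier hosts can enroll before more are added.
deploy_in_batches() {
  local sleep_time=$4
  plan_batches "$1" "$2" "$3" | while read -r first end; do
    # Placeholder for the real deployment step.
    echo "deploying containers $first-$end"
    sleep "$sleep_time"
  done
}
```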
---------
Co-authored-by: Robert Fairburn <8029478+rfairburn@users.noreply.github.com>
**Related issue:** Resolves#34500
Terraform changes after my latest loadtest.
VPC consolidation: updated (and deployed) shared VPC so that Signoz
backend can now use it
- Removed eks-vpc/ directory
- Moved VPC management to shared/vpc.tf
- Updated shared/init.tf to reflect VPC changes
Infra improvements
- infra/internal_alb.tf - changed suffix from -internal to -int since I
hit the 32-character maximum
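A length problem like this can be caught before running Terraform. The sketch below assumes the 32-character maximum applies to a resource name built from a prefix plus the suffix; `fits_name_limit` and the sample prefix are made up for illustration and do not come from the Terraform code.

```shell
# Hypothetical pre-flight check: succeeds when the name is at most 32 characters.
fits_name_limit() {
  local name=$1
  [ "${#name}" -le 32 ]
}

# Compare both suffixes against an example prefix (the prefix is an assumption).
prefix="fleet-loadtest-example-workspace"
for suffix in -internal -int; do
  if fits_name_limit "${prefix}${suffix}"; then
    echo "${prefix}${suffix}: ok"
  else
    echo "${prefix}${suffix}: too long (${#prefix} + ${#suffix} characters)"
  fi
done
```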
OTEL
- OTEL Collector configuration overrides for production stability
untested end-to-end but works as a replacement for plans and doesn't
require a local arm64 build to work.
Co-authored-by: Jorge Falcon <22119513+BCTBB@users.noreply.github.com>
SigNoz converted from child module to standalone root module with
independent state.
**Critical Impact**
Deployment order is now required:
1. Deploy infrastructure/loadtesting/terraform/signoz/ FIRST
2. Then deploy infrastructure/loadtesting/terraform/infra/
The modules communicate via Terraform remote state.
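The required order can be scripted so the two root modules are never applied out of sequence. This is a sketch only: `deploy_root_module` and `deploy_loadtest` are hypothetical helper names, workspace selection and any `-var` flags are left out, and the `TERRAFORM` variable exists purely so the commands can be dry-run (for example with `TERRAFORM=echo`). The `-chdir` option is Terraform's global flag for running a command against another directory.

```shell
# Command used to run Terraform; override it (e.g. TERRAFORM=echo) for a dry run.
TERRAFORM=${TERRAFORM:-terraform}

# Init and apply a single root module, stopping at the first failure.
deploy_root_module() {
  local dir=$1
  "$TERRAFORM" -chdir="$dir" init &&
    "$TERRAFORM" -chdir="$dir" apply
}

# SigNoz must exist first, because infra reads its outputs via remote state.
deploy_loadtest() {
  deploy_root_module infrastructure/loadtesting/terraform/signoz &&
    deploy_root_module infrastructure/loadtesting/terraform/infra
}
```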
**Key Configuration Changes**
- SigNoz creates its own EKS cluster: signoz-${workspace}
- Instance type: t3.xlarge (upgraded from t3.large for resource
headroom)
- ClickHouse disk: 200Gi (was 20Gi) with 2-day retention
- Resource limits configured to prevent OOMKills during loadtest
- wait_for_jobs = false to avoid Helm deployment deadlock
**Related issue:** Resolves#32331
**Related issue:** Resolves#32331
This PR allows us to run loadtest with SigNoz OTEL backend by adding
`-var=enable_otel=true`
SigNoz is deployed via Helm chart.
Enhancements needed (in future PR):
- put SigNoz UI behind VPN
- combine the new eks-vpc with shared fleet-vpc
- make SigNoz shared, so multiple loadtests use the same instance? (But
what about updating it to the latest version?)
Next steps:
- Enable SigNoz in Dogfood environment
- SigNoz by default [keeps 15 days of logs and
traces](https://signoz.io/docs/userguide/retention-period), which is
quite a bit. How much would that cost us and should we reduce it?
## Summary by CodeRabbit
- New Features
- Optional OpenTelemetry tracing with SigNoz via a new enable_otel flag.
- Conditional deployment of a SigNoz stack (managed EKS, storage,
Helm-based apps) with internal OTLP collector endpoint.
- New outputs to retrieve OTLP endpoint, cluster name, and a kubectl
configuration command.
- Documentation
- Added guidance for deploying and using SigNoz with load testing.
- Updated examples to include -var=enable_otel=true.
- Chores
- Introduced required providers to support Helm and Kubernetes
resources.
# GitHub Actions (New)
- New workflow to deploy/destroy loadtest infrastructure with one-click
(Needs to be tested)
- Common inputs drive configuration and deployment of loadtest
infrastructure
- tag
- fleet_task_count
- fleet_task_memory
- fleet_task_cpu
- fleet_database_instance_size
- fleet_database_instance_count
- fleet_redis_instance_size
- fleet_redis_instance_count
- terraform_workspace
- terraform_action
- New workflow to deploy/destroy osquery-perf to loadtest infrastructure
with one-click (Needs to be tested)
- Common inputs drive configuration and deployment of osquery-perf
resources
- tag
- git_branch
- loadtest_containers
- extra_flags
- terraform_workspace
- terraform_action
- New workflow to deploy shared loadtest resources with one-click (Needs
to be tested)
# Loadtest Infrastructure (New)
- New directory (`infrastructure/loadtesting/terraform/infra`) for
one-click deployment
- Loadtest environment updated to use [fleet-terraform
modules](https://github.com/fleetdm/fleet-terraform)
- [Deployment documentation
updated](0c254bca40/infrastructure/loadtesting/terraform/infra/README.md)
to reflect new steps
# Osquery-perf deployment (New)
- New directory (`infrastructure/loadtesting/terraform/osquery-perf`)
for the deployment of osquery-perf
- osquery-perf updated to use [fleet-terraform
modules](https://github.com/fleetdm/fleet-terraform)
- [Deployment documentation
updated](0c254bca40/infrastructure/loadtesting/terraform/osquery_perf)
to reflect new steps
Bumping memory and CPU on AWS load test containers.
Creating multiple ECS services, each with a single task. This allows us to specify different
settings per osquery perf container/task.
**Related issue:** No issue.
Ran
```
make update-go version=1.24.6
```
And then updated the `sha256`s manually in the Dockerfiles.
Fixes https://nvd.nist.gov/vuln/detail/CVE-2025-47907
```
Cancelling a query (e.g. by cancelling the context passed to one of the query methods) during a call
to the Scan method of the returned Rows can result in unexpected results if other queries are being
made in parallel. This can result in a race condition that may overwrite the expected results with those
of another query, causing the call to Scan to return either unexpected results from the other
query or an error.
```
# Added
- Added kms.tf to support encrypting keys, specifically cloudfront keys.
- Added template/cloudfront.tf.disabled for use in enabling cloudfront.
- Modified ecs-iam.tf to support log-alb.tf, cloudfront.tf policies that
are injected into `local.extra_execution_iam_policies` and `local.iam`.
- Added log-alb.tf to enable logging alb, required by cloudfront.tf.
# Changed
- Modified ecs.tf to support adding of additional secrets from
`local.secrets`.
- Modified firehose.tf to support provider required updates for
deprecated resource configurations.
- Modified init.tf to support `> v5.0` of `hashicorp/aws` provider.
- Modified locals.tf to add `extra_execution_iam_policies`, `iam`,
`software_installers_kms_policy`, `extra_secrets`, `secrets`, and
`cloudfront_key_basename`, to support cloudfront.
- Modified readme.md with instructions on how to enable cloudfront.tf
- Modified redis.tf to support provider required updates for deprecated
resource configurations
- Modified s3.tf to support kms keys and add kms iam.
- Modified terraform version in .github/workflows/tfvalidate.yml - 1.9.0
-> 1.10.4
## #30730
- Update Go version
- Update the docs for this process
- Confirmed `fleet`, `fleetctl`, and related docker images build
successfully
- Note that failing tests are unrelated: see [Slack
thread](https://fleetdm.slack.com/archives/C019WG4GH0A/p1752175318523689)
---------
Co-authored-by: Jacob Shandling <jacob@fleetdm.com>
For #28837.
Fixing all of this because we got multiple reports from the
community and customers, and these were also detected by Amazon
Inspector.
- Fixes CVE-2025-22871 by upgrading Go from 1.24.1 to 1.24.2.
- `docker scout` now fails the daily scheduled action if there are
CRITICAL,HIGH CVEs (we missed setting `exit-code: true`).
- Reports Fleet as not affected by CVE-2025-46569 because of our use of
OPA's Go package.
- Reports Fleet as not affected by CVE-2024-8260 because Fleet doesn't run
on Windows.
- The `security/status.md` shows a lot of changes because we are now
sorting CVEs so that newest come first.
---
- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [ ] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [ ] Make sure fleetd is compatible with the latest released version of
Fleet (see [Must
rule](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/fleetd-development-and-release-strategy.md)).
- [ ] Orbit runs on macOS, Linux and Windows. Check if the orbit
feature/bugfix should only apply to one platform (`runtime.GOOS`).
- [ ] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [ ] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
- [ ] For unreleased bug fixes in a release candidate, confirmed that
the fix is not expected to adversely impact load test results or alerted
the release DRI if additional load testing is needed.
For #26713
# Details
This PR updates Fleet and its related tools and binaries to use Go
version 1.24.1.
Scanning through the changelog, I didn't see anything relevant to Fleet
that requires action. The only possible breaking change I spotted was:
> As [announced](https://tip.golang.org/doc/go1.23#linux) in the Go 1.23
release notes, Go 1.24 requires Linux kernel version 3.2 or later.
Linux kernel 3.2 was released in January of 2012, so I think we can
commit to dropping support for earlier kernel versions.
The new [tools directive](https://tip.golang.org/doc/go1.24#tools) is
interesting as it means we can move away from using `tools.go` files,
but it's not a required update.
# Checklist for submitter
- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [X] Make sure fleetd is compatible with the latest released version of
Fleet
- [x] Orbit runs on macOS ✅ , Linux ✅ and Windows.
- [x] Manual QA must be performed in the three main OSs, macOS ✅,
Windows and Linux ✅.