Fixes#32313
OpenTelemetry Tracing
- Added tracing to async task collectors: FlushHostsLastSeen,
collectHostsLastSeen, collectLabelQueryExecutions,
collectPolicyQueryExecutions, collectScheduledQueryStats
- Updated HTTP middleware to use OTEL semantic convention for span names
({method} {route})
- Added OTELEnabled() helper to FleetConfig
Optimizations
- Reduced OTEL batch size from 512 to 256 spans to prevent gRPC message
size errors
- Enabled gzip compression for trace exports
NOTE: I tried to improve OTEL instrumentation for cron jobs, but it got
too complicated due to goroutines in `schedule.go` so that effort should
be separate. We do have SQL instrumentation for cron jobs, but we are
missing root spans for cron jobs as a whole.
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
## Testing
- [x] QA'd all new/changed functionality manually
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Expanded OpenTelemetry tracing for async tasks (host last seen, label
membership, policy membership, scheduled query stats) to provide richer
observability.
* More descriptive HTTP span names using “METHOD /route” for clearer
trace analysis.
* **Bug Fixes**
* Improved OTLP gRPC exporter reliability by enabling gzip compression
and reducing export batch size, mitigating intermittent gRPC errors.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
For #26713
# Details
This PR updates Fleet and its related tools and binaries to use Go
version 1.24.1.
Scanning through the changelog, I didn't see anything relevant to Fleet
that requires action. The only possible breaking change I spotted was:
> As [announced](https://tip.golang.org/doc/go1.23#linux) in the Go 1.23
release notes, Go 1.24 requires Linux kernel version 3.2 or later.
Linux kernel 3.2 was released in January of 2012, so I think we can
commit to dropping support for earlier kernel versions.
The new [tools directive](https://tip.golang.org/doc/go1.24#tools) is
interesting as it means we can move away from using `tools.go` files,
but it's not a required update.
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [X] Make sure fleetd is compatible with the latest released version of
Fleet
- [x] Orbit runs on macOS ✅ , Linux ✅ and Windows.
- [x] Manual QA must be performed in the three main OSs, macOS ✅,
Windows and Linux ✅.
`go-kit/kit/log` was deprecated and generating warnings
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Manual QA for all new/changed functionality
#16480
# Checklist for submitter
- [x] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Added/updated tests
- [x] Manual QA for all new/changed functionality
#13643
Updating the `policies` table to use a checksum column for uniqueness.
The checksum is computed with team_id (which may be null) and name. This
change is modeled on the checksum in the software table.
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Added/updated tests
- [x] If database migrations are included, checked table schema to
confirm autoupdate
- For database migrations:
- [x] Checked schema for all modified table for columns that will
auto-update timestamps during migration.
- [x] Confirmed that updating the timestamps is acceptable, and will not
cause unwanted side effects.
- [x] Manual QA for all new/changed functionality
#15472
# Checklist for submitter
- [x] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
- [x] Added/updated tests
- [x] Manual QA for all new/changed functionality
For #13715, this:
- Upgrades the Go version to `1.21.1`, infrastructure changes are
addressed separately at https://github.com/fleetdm/fleet/pull/13878
- Upgrades the linter version, as the current version doesn't work well
after the Go upgrade
- Fixes new linting errors (we now get errors for memory aliasing in
loops! 🎉 )
After this is merged people will need to:
1. Update their Go version. I use `gvm` and I did it like:
```
$ gvm install go1.21.1
$ gvm use go1.21.1 --default
```
2. Update the local version of `golangci-lint`:
```
$ go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.54.2
```
3. (optional) depending on your setup, you might need to re-install some
packages, for example:
```
# goimports to automatically import libraries
$ go install golang.org/x/tools/cmd/goimports@latest
# gopls for the language server
$ go install golang.org/x/tools/gopls@latest
# etc...
```
#13527
(Adding @mna to double check the changes in the async implementation of
policy result storage)
This PR also adds the osquery-perf changes needed to define the count of
macOS and Windows hosts.
- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- ~[ ] Documented any API changes (docs/Using-Fleet/REST-API.md or
docs/Contributing/API-for-contributors.md)~
- ~[ ] Documented any permissions changes (docs/Using
Fleet/manage-access.md)~
- [X] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)
- [X] Added support on fleet's osquery simulator `cmd/osquery-perf` for
new osquery data ingestion features.
- [X] Added/updated tests
- [X] Manual QA for all new/changed functionality
- ~For Orbit and Fleet Desktop changes:~
- ~[ ] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.~
- ~[ ] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).~
Test with 80k hosts: 70k simulated macOS, 10k simulated Windows.
Apply Windows policies first, then apply macOS policies:
```
fleetctl apply -f ee/cis/win-10/cis-policy-queries.yml
# Leave running for some time
fleetctl apply -f ee/cis/macos-13/cis-policy-queries.yml
```
After applying CIS policies previous to these changes:

After applying these changes and applying the same policies:

* Add sentry
* Fix gosum
* More gosum fixes
* Add missing def for config
* Enrich sentry scope a bit
* Add changes file
* Add goroutine safe scope to errors
* Encapsulate sentry logic
* Add documentation for new flag
* Add sentry capturing to crons and other background tasks
* Only send to sentry when enabled
* Serialize hosts writes per instance
* Write hosts asynchronously
* Dont make the save in a goroutine
* Revert "Dont make the save in a goroutine"
This reverts commit 4a890c5271.
* Make all savehosts async
* Address review comments and make this approach configurable
* Address review comments
* Disable bulk seen time marking for a test
* Move host seen times to a new table
* Remove unused
* Add seen_time to list hosts
* Add some jitter to seen time flushing
* Remove unused
* Add timeout to deferred save host
* Add tests for serialSaveHost
* Update hosts in labels and policy executions in a serial way
* Address review comments and remove fk constraints in host software
* Make errCh buffered
* Add changes file
* Readd key