fleet/cmd
Dan Tsekhanskiy aff440236e
Add cache option for software packages to skip re-downloading unchanged content (#42216)
**Related issue:** 
Ref #34797
Ref #42675 

## Problem

When a software installer spec has no `hash_sha256`, Fleet re-downloads
the package, re-extracts metadata, and re-upserts the DB on every GitOps
run, even if the upstream file hasn't changed. For deployments with 50+
URL-only packages across multiple teams, this wastes bandwidth and
processing time on every run.

## Solution

By default, use etags to avoid unnecessary downloads:

1. First run: Fleet downloads the package normally and stores the
server's ETag header
2. Subsequent runs: Fleet sends a conditional GET with `If-None-Match`.
If the server returns 304 Not Modified, Fleet skips the download,
metadata extraction, S3 upload, and DB upsert entirely

Opt-out with `always_download:true`, meaning packages continue to be
downloaded and re-processed on every run, same as today. No UI changes
needed.

```yaml
url: https://nvidia.gpcloudservice.com/global-protect/getmsi.esp?version=64&platform=windows
always_download: true
install_script:
  path: install.ps1
```

### Why conditional GET instead of HEAD

Fleet team [analysis of 276 maintained
apps](https://github.com/fleetdm/fleet/pull/42216#issuecomment-4105430061)
showed 7 apps where HEAD requests fail (405, 403, timeout) but GET works
for all. Conditional GET eliminates that failure class: if the server
doesn't support conditional requests, it returns 200 with the full body,
same as today.

### Why opt-in

5 of 276 apps (1.8%) have stale ETags (content changes but ETag stays
the same), caused by CDN caching artifacts (CloudFront, Cloudflare,
nginx inode-based ETags). The `cache` key lets users opt in per package
for URLs where they've verified ETag behavior is correct.

Validation rejects `always_download: true` when hash_sha256` is set

## Changes

- New YAML field: `cache` (bool, package-level)
- New migration: `http_etag` VARCHAR(512) column (explicit
`utf8mb4_unicode_ci` collation) + composite index `(global_or_team_id,
url(255))` on `software_installers`
- New datastore method: `GetInstallerByTeamAndURL`
- `downloadURLFn` accepts optional `If-None-Match` header, returns 304
as `(resp, nil, nil)` with `http.NoBody`
- ETag validated per RFC 7232 (ASCII printable only, no control chars,
max 512 bytes) at both write and read time
- Cache skipped for `.ipa` packages (multi-platform extraInstallers)
- TempFileReader and HTTP response leak prevention on download retry
- Docs updated in `yaml-files.md`

## What doesn't change

- Packages with `hash_sha256`: existing hash-based skip, untouched
- FMA packages: FMA version cache, untouched
- Packages with `always_download: true`: identical to current behavior
- Fleet UI: no changes

## Test plan

Automated testing:
- [x] 16 unit tests for `validETag`
- [x] 8 unit tests for conditional GET behavior (304, 200, 403, 500,
weak ETag, S3 multipart, no ETag)
- [x] MySQL integration test for `GetInstallerByTeamAndURL`
- [x] All 23 existing `TestSoftwareInstallers` datastore tests pass
- [x] All existing service tests pass

Manual testing:
- [x] E2E: 86 packages across 6 CDN patterns, second apply shows 51
conditional hits (304)
- [x] @sgress454 used a local fileserver tool to test w/ a new instance
and dummy packages


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* ETag-based conditional downloads to skip unchanged remote installer
files.
  * New always_download flag to force full re-downloads.

* **Tests**
* Added integration and unit tests covering conditional GETs, ETag
validation, retries, edge cases, and payload behavior.

* **Chores**
* Persist HTTP ETag and related metadata; DB migration and index to
speed installer lookups.
* Added installer lookup by team+URL to support conditional download
flow.

* **Bug Fix**
* Rejects using always_download together with an explicit SHA256 in
uploads.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Scott Gress <scott@fleetdm.com>
Co-authored-by: Scott Gress <scott@pigandcow.com>
Co-authored-by: Ian Littman <iansltx@gmail.com>
2026-04-14 13:01:33 -05:00
..
cpe Add sw_edition to cpe db generation and cpe translations (#32879) 2025-09-17 11:30:49 -04:00
cve Reapply "Update Citrix Workspace CPE generation to distinguish betwee… (#41614) 2026-03-12 16:17:40 -07:00
fleet Clean up setup experience cancellation behavior (#43437) 2026-04-14 09:39:26 -05:00
fleetctl Add cache option for software packages to skip re-downloading unchanged content (#42216) 2026-04-14 13:01:33 -05:00
gitops-migrate Add back gitops-migrate file (#33981) 2025-10-08 09:44:59 -05:00
macoffice Add new archive URL as data source for Mac Office release notes (#26978) 2025-03-10 08:46:18 -05:00
maintained-apps Cleanup temp installer files after download (#42463) 2026-03-30 10:14:36 -05:00
msrc Fix CI: extend grace periods for MSRC feeds and expand test coverage for file validation. (#37991) 2026-01-07 10:28:20 -06:00
osquery-perf Add Windows Program Files scan for software without registry entries (#42992) 2026-04-11 13:42:50 -06:00
osv-processor Use OSV for ubuntu vulnerability scanning (#42063) 2026-04-03 15:59:32 -05:00
winoffice Add Windows Office bulletin generator (1/3) (#42663) 2026-04-01 12:08:50 -06:00