fleet/cmd/fleetctl/integrationtest
Dan Tsekhanskiy aff440236e
Add cache option for software packages to skip re-downloading unchanged content (#42216)
**Related issue:** 
Ref #34797
Ref #42675 

## Problem

When a software installer spec has no `hash_sha256`, Fleet re-downloads
the package, re-extracts metadata, and re-upserts the DB on every GitOps
run, even if the upstream file hasn't changed. For deployments with 50+
URL-only packages across multiple teams, this wastes bandwidth and
processing time on every run.

## Solution

By default, use etags to avoid unnecessary downloads:

1. First run: Fleet downloads the package normally and stores the
server's ETag header
2. Subsequent runs: Fleet sends a conditional GET with `If-None-Match`.
If the server returns 304 Not Modified, Fleet skips the download,
metadata extraction, S3 upload, and DB upsert entirely

Opt-out with `always_download:true`, meaning packages continue to be
downloaded and re-processed on every run, same as today. No UI changes
needed.

```yaml
url: https://nvidia.gpcloudservice.com/global-protect/getmsi.esp?version=64&platform=windows
always_download: true
install_script:
  path: install.ps1
```

### Why conditional GET instead of HEAD

Fleet team [analysis of 276 maintained
apps](https://github.com/fleetdm/fleet/pull/42216#issuecomment-4105430061)
showed 7 apps where HEAD requests fail (405, 403, timeout) but GET works
for all. Conditional GET eliminates that failure class: if the server
doesn't support conditional requests, it returns 200 with the full body,
same as today.

### Why opt-in

5 of 276 apps (1.8%) have stale ETags (content changes but ETag stays
the same), caused by CDN caching artifacts (CloudFront, Cloudflare,
nginx inode-based ETags). The `cache` key lets users opt in per package
for URLs where they've verified ETag behavior is correct.

Validation rejects `always_download: true` when hash_sha256` is set

## Changes

- New YAML field: `cache` (bool, package-level)
- New migration: `http_etag` VARCHAR(512) column (explicit
`utf8mb4_unicode_ci` collation) + composite index `(global_or_team_id,
url(255))` on `software_installers`
- New datastore method: `GetInstallerByTeamAndURL`
- `downloadURLFn` accepts optional `If-None-Match` header, returns 304
as `(resp, nil, nil)` with `http.NoBody`
- ETag validated per RFC 7232 (ASCII printable only, no control chars,
max 512 bytes) at both write and read time
- Cache skipped for `.ipa` packages (multi-platform extraInstallers)
- TempFileReader and HTTP response leak prevention on download retry
- Docs updated in `yaml-files.md`

## What doesn't change

- Packages with `hash_sha256`: existing hash-based skip, untouched
- FMA packages: FMA version cache, untouched
- Packages with `always_download: true`: identical to current behavior
- Fleet UI: no changes

## Test plan

Automated testing:
- [x] 16 unit tests for `validETag`
- [x] 8 unit tests for conditional GET behavior (304, 200, 403, 500,
weak ETag, S3 multipart, no ETag)
- [x] MySQL integration test for `GetInstallerByTeamAndURL`
- [x] All 23 existing `TestSoftwareInstallers` datastore tests pass
- [x] All existing service tests pass

Manual testing:
- [x] E2E: 86 packages across 6 CDN patterns, second apply shows 51
conditional hits (304)
- [x] @sgress454 used a local fileserver tool to test w/ a new instance
and dummy packages


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* ETag-based conditional downloads to skip unchanged remote installer
files.
  * New always_download flag to force full re-downloads.

* **Tests**
* Added integration and unit tests covering conditional GETs, ETag
validation, retries, edge cases, and payload behavior.

* **Chores**
* Persist HTTP ETag and related metadata; DB migration and index to
speed installer lookups.
* Added installer lookup by team+URL to support conditional download
flow.

* **Bug Fix**
* Rejects using always_download together with an explicit SHA256 in
uploads.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Scott Gress <scott@fleetdm.com>
Co-authored-by: Scott Gress <scott@pigandcow.com>
Co-authored-by: Ian Littman <iansltx@gmail.com>
2026-04-14 13:01:33 -05:00
..
gitops Add cache option for software packages to skip re-downloading unchanged content (#42216) 2026-04-14 13:01:33 -05:00
package Moved some integration tests into their own package. (#28978) 2025-05-09 09:26:57 -05:00
preview Remove obsolete assertions to fix failing tests (#39249) 2026-02-03 13:01:49 -05:00
vuln Moved some integration tests into their own package. (#28978) 2025-05-09 09:26:57 -05:00
suite.go Moved some integration tests into their own package. (#28978) 2025-05-09 09:26:57 -05:00