fleet/ee/cis/macos-14
Dante Catalfamo 9ff63eb52e
macOS 14 CIS benchmark v3.0.0 update (#43797)
**Related issue:** Resolves #35172

Updates the macOS 14 (Sonoma) CIS policy set to benchmark v3.0.0, adds a
`cis_id` field to every policy, fixes several broken test scripts,
introduces an automated test runner, and ships `CIS-BENCHMARKS.md` as a
central guide for authoring and maintaining CIS benchmarks.

## Summary of changes

- `ee/cis/macos-14/cis-policy-queries.yml`: v3.0.0 updates + `cis_id`
added to every entry
- `ee/cis/CIS-BENCHMARKS.md`: new authoring/testing/automation guide for
all macOS CIS benchmarks (and the pattern other OS dirs follow)
- `tools/cis/cis-test-runner.py`: new 2150-line Python runner that
drives end-to-end validation against a real tart VM + Fleet server
- `ee/cis/macos-14/test/scripts/`: 10 new pass/fail scripts (5 pairs), 9
existing scripts fixed (several were silently broken)

## How the automated testing works

The runner (`tools/cis/cis-test-runner.py`) exercises the full policy
lifecycle against a real macOS VM:

**Phases**

1. **Parse** `cis-policy-queries.yml` and filter by `--all`,
`--cis-ids`, `--match`, and type flags (`--only-scripts`, `--only-mdm`,
`--only-manual`).
2. **Classify** each policy into a test type based on available
artifacts:

   | Priority | Type | Artifacts | Behavior |
   |---|---|---|---|
   | 1 | `PASS_FAIL` | `CIS_{id}_pass.sh` + `_fail.sh` | Run fail → verify query fails → run pass → verify passes |
   | 2 | `PASS_ONLY` | `CIS_{id}.sh` | Run script → verify passes |
   | 3 | `PROFILE` | `.mobileconfig` only | Verify query fails before profile → push profile → verify passes |
   | 4 | `ORG_DECISION` | paired `-enable`/`-disable` profiles | Toggle between variants |
   | 5 | `MANUAL` | none | Prompt operator, or skip with `--skip-manual` |

3. **Provision**: create a fresh Fleet team with a unique enroll secret,
build a fleetd pkg bound to it, create+boot a tart VM, install the
agent, and enroll.
4. **MDM**: prompt the operator for MDM enrollment if any tests need it.
Clear team profiles, baseline the VM, push all required profiles in one
batch, wait for delivery.
5. **Execute**: for each plan, SCP the script, run it over SSH, then run
the policy via `fleetctl query --hosts <hostname>`. A query that returns
rows = pass.
6. **Report** summary with PASS/FAIL/SKIP/ERROR counts.
7. **Cleanup** (with `--cleanup`) deletes the team, host record, and VM.
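
The classification step and the pass criterion above can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: the `TestType` enum, the `classify` and `query_passed` helpers, and the `{id}-enable.mobileconfig` / `{id}-disable.mobileconfig` file names are assumptions made for the example. Only the script names and the priority order come from the table.

```python
from __future__ import annotations

from enum import Enum


class TestType(Enum):
    # Values mirror the priority column of the classification table.
    PASS_FAIL = 1
    PASS_ONLY = 2
    PROFILE = 3
    ORG_DECISION = 4
    MANUAL = 5


def classify(cis_id: str, scripts: set[str], profiles: set[str]) -> TestType:
    """Return the highest-priority test type whose artifacts are all present."""
    if {f"CIS_{cis_id}_pass.sh", f"CIS_{cis_id}_fail.sh"} <= scripts:
        return TestType.PASS_FAIL
    if f"CIS_{cis_id}.sh" in scripts:
        return TestType.PASS_ONLY
    if f"{cis_id}.mobileconfig" in profiles:
        return TestType.PROFILE
    # Assumed naming for the paired org-decision profiles.
    paired = {f"{cis_id}-enable.mobileconfig", f"{cis_id}-disable.mobileconfig"}
    if paired <= profiles:
        return TestType.ORG_DECISION
    return TestType.MANUAL


def query_passed(rows: list[dict]) -> bool:
    """A policy query passes when it returns at least one row."""
    return len(rows) > 0
```

Checking script pairs first means a policy that has both scripts and a profile is exercised through the scripts, which matches the priority order in the table.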

**Special-case handling** (keyed by OS version because CIS IDs aren't
stable across releases):

- `SSH_BREAKING_CIS_IDS`: tests that disable sshd (2.3.3.4, 2.3.3.5) are
forced to `MANUAL` so the runner doesn't lock itself out.
- `PASSWORD_POLICY_CIS_IDS`: 5.2.x profiles invalidate the VM's
`admin`/`admin` login — forced to `MANUAL`.
- `NON_AUTOMATABLE_CIS_IDS`: tests that can't run reliably in a VM
(Location Services, Touch ID, shared Siri profile state) forced to
`MANUAL` with a per-entry reason.
- `--keep-vm`: reuses the VM across runs, skipping agent
install/enrollment if the host is already in Fleet. Falls back to fresh
creation if SSH is unreachable.
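
One way to picture the version-keyed overrides is a set of lookup tables consulted before a plan runs. The sketch below is an assumption about shape only: the constant names come from the list above, but the `"14"` key format, the `forced_manual_reason` helper, the specific 5.2.x IDs, and the `2.5.1` entry are illustrative placeholders rather than the runner's real data.

```python
from __future__ import annotations

from typing import Optional

# Keyed by OS version because CIS IDs are not stable across releases.
SSH_BREAKING_CIS_IDS = {"14": {"2.3.3.4", "2.3.3.5"}}

# Illustrative subset of the 5.2.x password policies (placeholder IDs).
PASSWORD_POLICY_CIS_IDS = {"14": {"5.2.3", "5.2.4", "5.2.5", "5.2.6"}}

# Per-entry reasons; the ID shown here is a placeholder.
NON_AUTOMATABLE_CIS_IDS = {"14": {"2.5.1": "shared Siri profile state"}}


def forced_manual_reason(os_version: str, cis_id: str) -> Optional[str]:
    """Return why a test must run as MANUAL, or None if it can be automated."""
    if cis_id in SSH_BREAKING_CIS_IDS.get(os_version, set()):
        return "disables sshd and would lock the runner out"
    if cis_id in PASSWORD_POLICY_CIS_IDS.get(os_version, set()):
        return "password policy invalidates the VM login"
    return NON_AUTOMATABLE_CIS_IDS.get(os_version, {}).get(cis_id)
```

Because every table is keyed by OS version, an ID that means something else in another release is not forced to `MANUAL` by accident.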

**Credential resolution order**: CLI flag →
`FLEET_URL`/`FLEET_API_TOKEN` env → `~/.fleet/config` (from `fleetctl
login`).
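
The resolution order can be sketched as a simple first-match lookup. This is a hedged illustration: `resolve_credentials` and `first_set` are hypothetical helpers, the config is passed in as an already-parsed mapping, its `address` and `token` keys are assumptions, and resolving the URL and token independently of each other is also an assumption about how the runner behaves.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional


def first_set(*candidates: Optional[str]) -> Optional[str]:
    """Return the first non-empty candidate, or None when all are empty."""
    for value in candidates:
        if value:
            return value
    return None


def resolve_credentials(
    cli_url: Optional[str],
    cli_token: Optional[str],
    env: Optional[Mapping[str, str]] = None,
    config: Optional[Mapping[str, str]] = None,
) -> tuple[str, str]:
    """Resolve the Fleet URL and token: CLI flag, then environment, then config."""
    env = os.environ if env is None else env
    config = config or {}
    url = first_set(cli_url, env.get("FLEET_URL"), config.get("address"))
    token = first_set(cli_token, env.get("FLEET_API_TOKEN"), config.get("token"))
    if not url or not token:
        raise ValueError("Fleet URL and API token could not be resolved")
    return url, token
```

For example, `resolve_credentials(None, None, env={"FLEET_URL": "https://fleet.example.com", "FLEET_API_TOKEN": "abc"})` takes both values from the environment, while a CLI flag for either one would win over it.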

## How to use `CIS-BENCHMARKS.md` going forward

The doc is the single reference for authoring and maintaining CIS
benchmark policies across all macOS (and Windows) versions. For each new
benchmark release, the workflow is:

1. **Read "Updating benchmarks when a new CIS version is released"** —
directs you to the PDF's *Appendix: Change History* to enumerate
Added/Modified/Removed recommendations.
2. **Use the field reference and query patterns** to write or update
policies: direct table check, `managed_policies` EXISTS/NOT EXISTS, or
plist negation check. Name qualifiers `(MDM Required)` / `(Fleetd
Required)` / `(FDA Required)` are documented.
3. **Create matching test artifacts** — pass/fail scripts for togglable
settings, `.mobileconfig` profiles for MDM-only settings. Script
conventions (full paths, sudo pattern, `not_always_working_` prefix) are
standardized.
4. **Update the per-OS README** with limitations, org-decision policies,
and optional policies.
5. **Run the test runner** to validate.

The doc also contains an **end-to-end AI agent prompt** (section at the
bottom) designed to be handed a new CIS PDF plus the previous version's
PDF, to automatically generate the diff, write policies, produce test
artifacts, update docs, and run validation. This lets future benchmark
updates start from a consistent, repeatable baseline rather than being
hand-authored from scratch.
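
As a rough first pass at the Added/Removed part of that diff, the `cis_id` values from two policy sets can be compared directly. The `diff_cis_ids` helper below is hypothetical and deliberately simple. Because CIS IDs are not stable across releases, an ID present in both versions may still refer to a different recommendation, so the PDF's change history remains the authority.

```python
from __future__ import annotations

from typing import Iterable


def sort_key(cis_id: str) -> tuple[int, ...]:
    """Order dotted IDs numerically, so 2.10 sorts after 2.9."""
    return tuple(int(part) for part in cis_id.split("."))


def diff_cis_ids(previous: Iterable[str], current: Iterable[str]) -> dict[str, list[str]]:
    """Group IDs into added, removed, and present-in-both buckets."""
    prev, cur = set(previous), set(current)
    return {
        "added": sorted(cur - prev, key=sort_key),
        "removed": sorted(prev - cur, key=sort_key),
        "in_both": sorted(prev & cur, key=sort_key),
    }
```

The `in_both` bucket is the list to review by hand for renames and modified queries.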

## Query changes

All entries in `cis-policy-queries.yml` received a `cis_id` field so the
runner (and humans) can map policies → scripts → profiles → the
benchmark document without parsing the display name.
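
A small sketch of how a tool might use the new field: group policies by `cis_id` and flag any entry that lacks one. The `index_by_cis_id` helper is hypothetical, and it assumes each policy has already been loaded into a flat dictionary with `name` and `cis_id` keys, which may not match the exact YAML layout.

```python
from __future__ import annotations

from collections import defaultdict


def index_by_cis_id(policies: list[dict]) -> tuple[dict[str, list[dict]], list[str]]:
    """Group policies by cis_id and collect the names of entries without one."""
    index: dict[str, list[dict]] = defaultdict(list)
    missing: list[str] = []
    for policy in policies:
        cis_id = policy.get("cis_id")
        if cis_id is None or str(cis_id).strip() == "":
            missing.append(policy.get("name", "<unnamed>"))
            continue
        # Several policies may share one ID, e.g. enabled/disabled variants.
        index[str(cis_id)].append(policy)
    return dict(index), missing
```

Grouping into lists rather than a single value keeps paired org-decision policies together under one ID.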

| CIS ID | Change |
|---|---|
| 1.1 | Renamed "Ensure All Apple-provided Software Is Current" → "Ensure Apple-provided Software Updates Are Installed"; added terminal remediation to `resolution` |
| 1.6 | Expanded description with v3.0.0 language about rapid security response updates |
| 1.x (deferment) | **Removed**: "Ensure Software Update Deferment Is Less Than or Equal to 30 Days" dropped from v3.0.0 |
| 2.3.1.1 | Renamed "Ensure AirDrop Is Disabled" → "Ensure AirDrop Is Disabled When Not Actively Transferring Files"; expanded description |
| 2.3.3.1 (DVD/CD Sharing) | **Removed**: dropped from v3.0.0 |
| 2.3.3.4 (Remote Login) | Query now checks BOTH `disabled.plist` AND that `com.openssh.sshd` is not in the `launchd` table; resolution updated to terminal method |
| 5.1.7 | Query rewritten: sticky-bit dirs now properly excluded (first-char-of-mode check instead of bit-AND on full mode string); SIP-protected dirs excluded via `com.apple.rootless` xattr check |
| 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.7 | Dropped `username = ''` filter: Safari profiles deliver at user scope, so the system-scope filter guaranteed zero rows |
| 6.3.3 | Fixed `NOT EXISTS` domain typo: `com.apple.loginwindow` → `com.apple.Safari` (the check was previously meaningless) |
| Wi-Fi/Bluetooth menu bar | **Removed**: "Show Wi-Fi status in Menu Bar" and "Show Bluetooth status in Menu Bar" dropped from v3.0.0 |
| Show All Filename Extensions | **Removed**: dropped from v3.0.0 |
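
To illustrate the 5.1.7 change, the sketch below applies a first-character check to a four-digit octal mode string and shows one general way a bit-AND on the whole string can mislead. It is not a reconstruction of the old or new query. The helper names are hypothetical, and the four-digit mode format is an assumption.

```python
def has_sticky_bit(mode: str) -> bool:
    """Check the leading digit of a 4-digit octal mode string such as '1777'."""
    return len(mode) == 4 and int(mode[0], 8) & 0o1 != 0


def is_world_writable(mode: str) -> bool:
    """Check the write bit in the trailing 'other' digit."""
    return int(mode[-1], 8) & 0o2 != 0


def needs_attention(mode: str) -> bool:
    """World-writable directories are a finding unless the sticky bit is set."""
    return is_world_writable(mode) and not has_sticky_bit(mode)


# Pitfall: reading the mode string as a decimal number before masking.
# Decimal 777 happens to have bit 512 set, so '0777' looks sticky.
naive_result = int("0777") & 0o1000
# Parsing the same string as octal gives the expected answer.
octal_result = int("0777", 8) & 0o1000
```

Here `needs_attention("0777")` is true and `needs_attention("1777")` is false, while the decimal mask in `naive_result` is non-zero even though `0777` has no sticky bit.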

## Script changes

### New scripts

| Script | Purpose |
|---|---|
| `CIS_1.1_pass.sh` / `_fail.sh` | Install updates (pass); clear `LastFullSuccessfulDate` (fail; caveat: only works when real updates are pending) |
| `CIS_1.6_pass.sh` / `_fail.sh` | Open/remove the `1.6.mobileconfig` profile |
| `CIS_2.3.1.1_pass.sh` / `_fail.sh` | Open/remove the AirDrop profile |
| `CIS_2.3.2.2_pass.sh` / `_fail.sh` | `launchctl load/unload -w` of `com.apple.timed.plist` |
| `CIS_2.3.3.4_pass.sh` / `_fail.sh` | `systemsetup -setremotelogin off/on` (runs as MANUAL via the runner's SSH-breaking safeguard) |

### Existing scripts fixed

| Script | Bug | Fix |
|---|---|---|
| `CIS_2.3.3.1.sh` | Disabled `com.apple.ODSAgent` (DVD/CD sharing), not Screen Sharing | Now disables `com.apple.screensharing` |
| `CIS_2.9.2.sh` | `pmset -a womp 0` sets Wake-on-Network, not Power Nap | Now `pmset -a/-b/-c powernap 0` |
| `CIS_3.2.sh` | `sed` pipeline into root-owned file via user redirection silently failed; did nothing if the flags line was missing | `awk` with `tee`/`mv`; appends a flags line when absent; enforces 0400 root:wheel |
| `CIS_3.3.sh` | Only stripped `all_max=`; never added `ttl=365` when missing, so the query could never pass from a fresh system | `awk` now both strips `all_max=` and inserts/updates `ttl=365` on the `* file` line |
| `CIS_3.4.sh` | `sudo sed … > /etc/security/audit_control`: redirect runs as caller, not root; write silently failed | Rewrites via `tee`/`mv` with proper perms; appends when line is absent |
| `CIS_3.5.sh` | `chmod -R o-rw` doesn't produce the exact `0400`/`0440` modes the query requires | Explicit `chmod 0400` on `audit_control`, `find … -exec chmod 0440 {}` under `/var/audit/` |
| `CIS_5.1.7.sh` | `sudo IFS=$'\n'` runs IFS in a subshell that exits immediately; searched `/System/Volumes/Data/Library` but the query looks at `/Library/%` | IFS set in parent shell; searches `/Library`; skips SIP-protected dirs via xattr |
| `CIS_5.7.sh` | Wrote `use-login-window-ui` which the query doesn't accept | Writes `authenticate-session-owner` |
| `CIS_6.3.6.sh` | Contained literal `<username>` placeholders that were never substituted | Iterates non-system users (`UniqueID >= 500`) and runs `defaults write` as each |
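
The `CIS_3.3.sh` fix and the write-then-move pattern used by several of these scripts can be illustrated in Python. This is an analogy to the `awk` and `tee`/`mv` approach, not the scripts themselves: `fix_file_line` and `write_atomically` are hypothetical helpers, and ownership changes (root:wheel) are left out because they need root.

```python
import os
import tempfile


def fix_file_line(text: str, ttl_days: int = 365) -> str:
    """Strip all_max= and set ttl= on every line that starts with '* file'."""
    fixed = []
    for line in text.splitlines():
        if line.startswith("* file"):
            tokens = [
                token for token in line.split()
                if not token.startswith(("all_max=", "ttl="))
            ]
            tokens.append(f"ttl={ttl_days}")
            line = " ".join(tokens)
        fixed.append(line)
    return "\n".join(fixed) + "\n"


def write_atomically(path: str, text: str, mode: int = 0o400) -> None:
    """Write to a temp file in the same directory, set perms, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        tmp.write(text)
        tmp_path = tmp.name
    os.chmod(tmp_path, mode)
    # The rename replaces the target in one step, so readers never see a
    # half-written file.
    os.replace(tmp_path, path)
```

Setting `ttl=` unconditionally is what lets the check pass on a fresh system, where the original line has no `ttl=` token at all.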


2026-04-23 12:56:38 -04:00
test macOS 14 CIS benchmark v3.0.0 update (#43797) 2026-04-23 12:56:38 -04:00
cis-policy-queries.yml macOS 14 CIS benchmark v3.0.0 update (#43797) 2026-04-23 12:56:38 -04:00
README.md "Teams" => "fleets", "queries" => "reports" doc changes (#39585) 2026-03-11 23:41:14 -05:00

macOS 14 Sonoma benchmark

Fleet's policies have been written against v2.1.0 of the benchmark. You can refer to the CIS website for full details about this version.

For requirements and usage details, see the CIS Benchmarks documentation.

Limitations

The following CIS benchmarks cannot be checked with a policy in Fleet:

  1. 2.1.2 Audit App Store Password Settings
  2. 2.3.3.12 Ensure Computer Name Does Not Contain PII or Protected Organizational Information
  3. 2.6.6 Audit Lockdown Mode
  4. 2.11.2 Audit Touch ID and Wallet & Apple Pay Settings
  5. 2.13.1 Audit Passwords System Preference Setting
  6. 2.14.1 Audit Notification & Focus Settings
  7. 3.7 Audit Software Inventory
  8. 6.2.1 Ensure Protect Mail Activity in Mail Is Enabled

Checks that require decision

CIS has left the parameters of the following checks up to the benchmark implementer. CIS recommends that an organization make a conscious decision for these benchmarks, but does not make a specific recommendation.

Fleet provides both an "enabled" and a "disabled" version of each of these policies. When both are added, at least one will fail. Once your organization has made a decision, delete the policy you do not need. Each policy name carries an -enabled or -disabled suffix, such as 2.1.1.1-enabled.

  • 2.1.1.1 Audit iCloud Keychain
  • 2.1.1.2 Audit iCloud Drive
  • 2.5.1 Audit Siri
  • 2.8.1 Audit Universal Control

Furthermore, CIS has decided not to require the following password complexity settings:

  • 5.2.3 Ensure Complex Password Must Contain Alphabetic Characters Is Configured
  • 5.2.4 Ensure Complex Password Must Contain Numeric Character Is Configured
  • 5.2.5 Ensure Complex Password Must Contain Special Character Is Configured
  • 5.2.6 Ensure Complex Password Must Contain Uppercase and Lowercase Characters Is Configured

However, Fleet has provided these as policies. If your organization declines to implement these, simply delete the corresponding policies.