For #16795, this:
- Updates Go to go1.22.3
- Per
https://github.com/fleetdm/fleet/issues/16795#issuecomment-2100450618, I
also ran the following to update the versions requested by @getvictor
```
go get github.com/kataras/golog@v0.1.12
go get github.com/kataras/iris/v12@v12.2.11
go get github.com/sethvargo/go-password@v0.3.0
```
**Notes**
After this is merged people will need to update their Go version. I use
gvm and I did it like:
```
$ gvm install go1.22.3
$ gvm use go1.22.3 --default
```
**Relevant changes**
The release notes mention:
> Previously, the variables declared by a “for” loop were created once
> and updated by each iteration. In Go 1.22, each iteration of the loop
> creates new variables, to avoid accidental sharing bugs.
However, we already have a lint rule (see
https://github.com/fleetdm/fleet/pull/13877) for this scenario, so it
shouldn't affect us.
#19218
This fixes the "exit status 78" error seen at orbit startup.
Manually tested by forcing orbit to take the "error" path.
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [ ] Added/updated tests
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [ ] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [ ] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
#18925 (Should also fix #17660.)
Tests:
- Ubuntu 22.04.2
- Wayland
- Works with chrome ✅
- Doesn't work with Firefox. ❌
- Xorg
- Works with Chrome. ✅
- Works with Firefox. ✅
- Ubuntu 24.04
- Wayland
- Doesn't work with Chrome. ❌
- Doesn't work with Firefox. ❌
- Xorg (when using Xorg it defaults to `DISPLAY=:1`, and with the
changes in this PR it works):
- Works with Chrome. ✅
- Works with Firefox. ✅
---
How to change between Wayland and Xorg:
- Set `WaylandEnable=false` in `/etc/gdm3/custom.conf` and reboot.
---
How to determine what's running:
```sh
$ loginctl
SESSION UID USER SEAT TTY
2 1000 luk seat0 tty2
c2 1000 luk
$ loginctl show-session 2 -p Type
# will output
Type=wayland
or
Type=x11
```
---
- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [X] Added/updated tests
- [X] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [X] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
for https://github.com/fleetdm/fleet/issues/19020
- Fixes the rollback logic to get the right script for the software
being installed
- Fixes the messages displayed in the install results
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Added/updated tests
- [x] Manual QA for all new/changed functionality
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Added/updated tests
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
---------
Co-authored-by: Roberto Dip <rroperzh@gmail.com>
New interface for adding periodic jobs that rely on notifications/config
changes in Orbit.
Previously if we wanted to have recurring checks in Orbit, we would add
them into a chain of `GetConfig` calls. This call chain would be run
periodically by one of the runners registered with the cli application
framework.
The new method to register `OrbitConfigReceivers` with the
`OrbitClient`, and then register the orbit client itself with the
application framework.
Instead of having giving each fetcher an internal reference to the
previous fetcher that it must call, the receiver is registered with the
client and the new config is passed to the receiver.
This is the old `GetConfig()` interface:
```go
type OrbitConfigFetcher interface {
GetConfig() (*fleet.OrbitConfig, error)
}
```
This is the new `OrbitConfigReceiver` interface:
```go
type OrbitConfigReceiver interface {
Run(*OrbitConfig) error
}
```
To register a new receiver, you call the `RegisterConfigReceiver` method
on the client.
```go
orbitClient.RegisterConfigReceiver(extRunner)
```
Downsides of the old method:
- Spaghetti call chain setup
- Cascading failure, of one fails, all after it fail
- Run in series, one long function call holds up the rest
- Anything that wants to restart orbit is added as a Runner to the
application, meaning there could be several timers calling `GetConfig`
and running the chain
Benefits of the new method:
- Clean `RegisterConfigReceiver` api, no call chaining required
- Config receivers can be added at runtime
- Isolated receivers, one failing call don't effect others
- All calls are run in parallel in goroutines, no calls can hold up the
rest
- No more need for multiple runners, using a context cancel, any
receiver can queue a call to restart orbit
- Single point to handle errors and logging for all receivers
- Panic recovery to stop orbit from crashing
- Easier to test, configs are passed in and do not require a call chain
This branch contains a little bit of code from the installer method I
was working on because I branched it off of that. (oops)
Not all code comments surrounding old `GetConfig()` methods have been
fully updated yet
Possible changes:
- Update the interface to take a context, so we can let receivers know
to exit early. I can imagine two cases for this:
- The application is about to restart
- We can set a timeout for how long receivers are allowed to take
Closes#12662
---------
Co-authored-by: Martin Angers <martin.n.angers@gmail.com>
Co-authored-by: Roberto Dip <dip.jesusr@gmail.com>
#18783
- [X] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
#18808
Added the new `sofa_security_release_info` and `sofa_unpatched_cves`
tables from `macadmins/osquery-extension` 1.0.1
These tables do not have detailed documentation in macadmins repo, so
not adding documentation at this point.
# Checklist for submitter
<!-- Note that API documentation changes are now addressed by the
product design team. -->
- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
#17695
The windows exit code is a 32-bit unsigned integer, but the command
interpreter treats it like a signed integer. When a process is killed,
it returns 0xFFFFFFFF (interpreted as -1). We convert the integer to an
signed 32-bit integer to flip it to a -1 to match our expectations, and
fit in our db column.
https://en.wikipedia.org/wiki/Exit_status#Windows
FIxed on both the client and server side.
#17361#17148
In GET fleet/hosts/:id response, added the following fields:
- orbit_version
- `orbit_version == null` means this agent is not an orbit agent
- fleet_desktop_version
- `fleet_desktop_version == null` means this agent is not an orbit agent
or it is an older version which is not collecting the desktop version
- `fleet_desktop_version == ""` means this agent is an orbit agent but
does not have fleet desktop
- scripts_enabled
- `scripts_enabled == null` means this agent is not an orbit agent or it
is an older version which is not collecting scripts_enabled
In orbit_info table, added the following fields:
- desktop_version
- scripts_enabled
Updated docs for orbit_info PR:
https://github.com/fleetdm/fleet/pull/18135
Updated API docs: https://github.com/fleetdm/fleet/pull/17814
MDM lock/unlock/wipe error messages are not part of this PR. They will
be in a separate PR.
# Checklist for submitter
- [x] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Added support on fleet's osquery simulator `cmd/osquery-perf` for
new osquery data ingestion features.
- [x] Added/updated tests
- [x] If database migrations are included, checked table schema to
confirm autoupdate
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
We need to support semantic versioning with pre-releases to comply with
semantic versioning required by goreleaser allowing us to build orbit
pre-releases.
For #17577
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
- [x] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux. (performed only on macOS)
#16594
- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [X] Added/updated tests
- [X] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [X] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [X] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
#17054
This was used as part of the release of fleetd 1.22.0 to the `edge`
channel.
I added more automation to ease releasing fleetd. (They were too many
manual clicks and error prone actions.)
two motivations:
- prevent mysterious crashes in arm64 machines without Rosetta (often
the case in fresh VMs)
- prevent unexpected errors in Windows arm64 VMs when using certain
system calls
# Checklist for submitter
If some of the following don't apply, delete the relevant line.
- [x] Manual QA for all new/changed functionality
#16423, #16326
On the [original PR](https://github.com/fleetdm/fleet/pull/16968) we
missed detecting 5XX errors. Fleet usually runs behind load balancers,
so when bringing Fleet down, orbit connects successfully but gets 5XX
errors, so we need to detect those too.
Orbit changes for #16423.
Should also fix#16326 (in case of network errors).
Orbit will log the following every 5 minutes:
```
2024-02-20T14:27:40-03:00 INF network error error="Post \"https://localhost:8080/api/fleet/orbit/config\": dial tcp [::1]:8080: connect: connection refused"
```
- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).
Should tackle #14026.
This will run a daily Github action and create a PR if there's a new
update in our TUF on `edge` or `stable`.
E.g. somebody releases 1.22.0 fleetd to `stable` on our TUF and the next
day this automation runs and will create a PR that updates the versions
in `orbit/TUF.md` (or they can run the workflow manually).
Am happy to amend the shape of `orbit/TUF.md` (or we can iterate later).
The scheduled test run
https://github.com/fleetdm/fleet/actions/runs/7764392848 failed with a
panic because `TestWindowsMDMEnrollmentPrevented` timed out:
```
2024-02-03T05:05:26.3041218Z === RUN TestWindowsMDMEnrollmentPrevented
2024-02-03T05:05:26.3044251Z === RUN TestWindowsMDMEnrollmentPrevented/{RenewEnrollmentProfile:false_RotateDiskEncryptionKey:false_NeedsMDMMigration:false_NeedsProgrammaticWindowsMDMEnrollment:true_WindowsMDMDiscoveryEndpoint:http://example.com/_NeedsProgrammaticWindowsMDMUnenrollment:false_PendingScriptExecutionIDs:[]_EnforceBitLockerEncryption:false}
2024-02-03T05:05:26.3047208Z coverage: 2.5% of statements in github.com/fleetdm/fleet/v4/...
2024-02-03T05:05:26.3047963Z panic: test timed out after 1h0m0s
2024-02-03T05:05:26.3048482Z running tests:
2024-02-03T05:05:26.3049005Z TestWindowsMDMEnrollmentPrevented (59m52s)
2024-02-03T05:05:26.3052172Z TestWindowsMDMEnrollmentPrevented/{RenewEnrollmentProfile:false_RotateDiskEncryptionKey:false_NeedsMDMMigration:false_NeedsProgrammaticWindowsMDMEnrollment:true_WindowsMDMDiscoveryEndpoint:http://example.com/_NeedsProgrammaticWindowsMDMUnenrollment:false_PendingScriptExecutionIDs:[]_EnforceBitLockerEncryption:false} (59m52s)
[...]
2024-02-03T05:05:26.3068624Z goroutine 69 [chan receive]:
2024-02-03T05:05:26.3069997Z github.com/fleetdm/fleet/v4/orbit/pkg/update.TestWindowsMDMEnrollmentPrevented.func2.1({{0xe3ada3, 0x12}, {0x0, 0x0}, {0xe37311, 0xc}})
2024-02-03T05:05:26.3072376Z /home/runner/work/fleet/fleet/orbit/pkg/update/notifications_test.go:295 +0x65
2024-02-03T05:05:26.3074514Z github.com/fleetdm/fleet/v4/orbit/pkg/update.(*windowsMDMEnrollmentConfigFetcher).attemptEnrollment(0xc0000f8cf0, {0x0, 0x0, 0x0, 0x1, {0xe3ada3, 0x12}, 0x0, {0x0, 0x0, ...}, ...})
```
I was able to reproduce locally 1/4th of the times, after putting the
following print statements:
```diff
if cfg.NeedsProgrammaticWindowsMDMEnrollment {
fetcher.execEnrollFn = func(args WindowsMDMEnrollmentArgs) error {
- <-chProceed // will be unblocked only when allowed
+ fmt.Println("fetcher.execEnrollFn A: ", apiCallCount)
+ <-chProceed // will be unblocked only when allowed
+ fmt.Println("fetcher.execEnrollFn B: ", apiCallCount)
apiCallCount++ // no need for sync, single-threaded call of this func is guaranteed by the fetcher's mutex
return apiErr
}
@@ -301,7 +303,9 @@ func TestWindowsMDMEnrollmentPrevented(t *testing.T) {
}
} else {
fetcher.execUnenrollFn = func(args WindowsMDMEnrollmentArgs) error {
- <-chProceed // will be unblocked only when allowed
+ fmt.Println("fetcher.execUnenrollFn A: ", apiCallCount)
+ <-chProceed // will be unblocked only when allowed
+ fmt.Println("fetcher.execUnenrollFn B: ", apiCallCount)
apiCallCount++ // no need for sync, single-threaded call of this func is guaranteed by the fetcher's mutex
return apiErr
}
@@ -317,23 +321,33 @@ func TestWindowsMDMEnrollmentPrevented(t *testing.T) {
started := make(chan struct{})
go func() {
+ fmt.Println("before close started")
close(started)
+ fmt.Println("aftre close started")
// the first call will block in enroll/unenroll func
+ fmt.Println("before inner fetchergetconfig")
cfg, err := fetcher.GetConfig()
+ fmt.Println("after inner fetchergetconfig")
assertResult(cfg, err)
}()
+ fmt.Println("before started")
<-started
+ fmt.Println("after started")
// this call will happen while the first call is blocked in
// enroll/unenrollfn, so it won't call the API (won't be able to lock the
// mutex). However it will still complete successfully without being
// blocked by the other call in progress.
+ fmt.Println("before first fetchergetconfig")
cfg, err := fetcher.GetConfig()
+ fmt.Println("before first fetchergetconfig")
assertResult(cfg, err)
// unblock the first call and wait for it to complete
+ fmt.Println("before close chProceed 1")
close(chProceed)
+ fmt.Println("after close chProceed 2")
time.Sleep(100 * time.Millisecond)
```
This is the output I've got every time the test hung:
```
before started
before close started
aftre close started
after started
before first fetchergetconfig
before inner fetchergetconfig
after inner fetchergetconfig
fetcher.execEnrollFn A: 0
```
And this is the output when the tests passed
```
before started
before close started
aftre close started
before inner fetchergetconfig
fetcher.execUnenrollFn A: 0
after started
before first fetchergetconfig
before first fetchergetconfig
before close chProceed 1
after close chProceed 2
fetcher.execUnenrollFn B: 0
after inner fetchergetconfig
fetcher.execUnenrollFn A: 1
fetcher.execUnenrollFn B: 1
```
Note how the deadlock occurs when `GetConfig` is called first outside of
the goroutine. I added some logic to prevent this, but I'm confident
there must be a better way to accomplish the same. cc: @mna you're the
king of concurrency, do you have any ideas?
#16014
- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
- [x] Manual QA for all new/changed functionality
- For Orbit and Fleet Desktop changes:
- [x] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.
- [x] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).