mirror of https://github.com/fleetdm/fleet synced 2026-04-21 21:47:20 +00:00

Docs quick reference optimization (#21331)

This PR closes https://github.com/fleetdm/fleet/issues/21108

@noahtalerman, I double-checked all redirects, and they are working.
Clicking through the URLs in [this
spreadsheet](https://docs.google.com/spreadsheets/d/1djVynIMuJK4pT5ziJW12CluVqcaoxxnCLaBO3VXfAt4/edit?usp=sharing)
is a pretty quick way to go through them all. Note that "Audit logs" and
"Understanding host vitals" redirect to the contributor docs on GitHub,
so they will throw a 404 until this is merged.

Some new guides benefitted from a name change, so they make more sense
as stand-alone guides, and also so that we don't have to mess around
with more redirects later. Those name changes followed [this
convention](https://fleetdm.com/handbook/company/communications#headings-and-titles),
which was recently documented in the handbook.

Have fun!

---------

Co-authored-by: Eric <eashaw@sailsjs.com>
Co-authored-by: Noah Talerman <noahtal@umich.edu>

2024-08-16 15:30:31 -05:00


Osquery watchdog

Osquery runs a watcher process to keep track of its child worker process and any managed extensions. What follows is a description of what happens on each pass of the watcher loop and under what circumstances the child process and/or managed extensions are terminated.

As a first step, the watcher checks the state of the child worker process, which could be either Alive or Non-existent. If the process is Alive, we make sure the process is within its assigned resource quota, by checking:

- The maximum CPU utilization limit is not exceeded (controlled by osquery's --watchdog_utilization_limit flag; --watchdog_latency_limit sets how long utilization may stay at that limit).
- The maximum memory limit is not exceeded (controlled by osquery's --watchdog_memory_limit flag).
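
The two checks above can be sketched as a single predicate. This is a simplified illustration, not osquery's implementation: the `Limits` and `Sample` types, their field names, and the units are hypothetical, and the real watchdog evaluates sustained CPU use over time rather than a single reading.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Hypothetical limits; in osquery the real values come from watchdog flags."""
    max_cpu_percent: float  # stands in for the CPU utilization limit
    max_memory_mb: float    # stands in for the memory limit


@dataclass
class Sample:
    """One hypothetical measurement of a watched process."""
    cpu_percent: float
    memory_mb: float


def within_quota(sample: Sample, limits: Limits) -> bool:
    """Return True only if both the CPU check and the memory check pass."""
    cpu_ok = sample.cpu_percent <= limits.max_cpu_percent
    memory_ok = sample.memory_mb <= limits.max_memory_mb
    return cpu_ok and memory_ok
```

Exceeding either limit is enough to fail the check, which is why the two results are combined with `and`.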

If the child process is within the resource limits, then it is deemed alive and well. Otherwise, we terminate the process by following these steps:

1. We send a SIGUSR1 to the child process.
2. We send a SIGTERM to the child process.
3. After a delay (configured by osquery's --watchdog_forced_shutdown_delay flag), we send a SIGKILL to the child process.
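
The escalation above can be sketched in a few lines of POSIX-only Python. The function name `terminate_worker` and its parameters are hypothetical, and `send` and `sleep` are injectable only so the order of signals can be checked without killing a real process. It follows the three steps literally and is not osquery's actual code.

```python
import os
import signal
import time
from typing import Callable


def terminate_worker(
    pid: int,
    forced_shutdown_delay: float,
    send: Callable[[int, int], None] = os.kill,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Send SIGUSR1, then SIGTERM, wait, then SIGKILL (hypothetical helper)."""
    send(pid, signal.SIGUSR1)     # step 1: first signal to the worker
    send(pid, signal.SIGTERM)     # step 2: ask the worker to exit
    sleep(forced_shutdown_delay)  # delay, as set by the forced shutdown flag
    try:
        send(pid, signal.SIGKILL)  # step 3: force termination
    except ProcessLookupError:
        pass  # the worker already exited after SIGTERM
```

Calling it with a recording `send` function instead of `os.kill` shows the order of signals without touching any process.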

If the child process is Non-existent, either because it didn't exist in the first place or because it was terminated, the watcher will try to spawn a new child process. But first, it will check whether the maximum number of allowed process respawns was reached. If it was, then the osquery process shuts down.
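
A minimal sketch of that respawn decision follows. It assumes the watcher keeps a simple counter; `respawn_worker`, `RespawnLimitReached`, and the `spawn` callback are hypothetical names used for illustration, and raising an exception stands in for shutting osquery down.

```python
from typing import Callable


class RespawnLimitReached(Exception):
    """Signals that osquery should shut down instead of respawning."""


def respawn_worker(
    respawn_count: int,
    respawn_limit: int,
    spawn: Callable[[], None],
) -> int:
    """Spawn a new worker unless the limit was reached; return the new count."""
    if respawn_count >= respawn_limit:
        # The limit is checked before any spawn attempt is made.
        raise RespawnLimitReached(f"worker already respawned {respawn_count} times")
    spawn()
    return respawn_count + 1
```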

After checking the state of the child worker, we check the state of every managed extension, which could be Alive or Non-existent.

If the managed extension is Alive, the watcher will check both the CPU utilization and memory consumption (the same checks we perform for the child process). If the managed extension is deemed unstable, we terminate the extension by following these steps:

1. We send a SIGTERM to the managed extension.
2. After a delay (configured by osquery's --watchdog_forced_shutdown_delay flag), we send a SIGKILL to the managed extension.

If the managed extension is Non-existent (either because it was Non-existent in the first place or because it was terminated due to resource contention), the watcher will try to launch the managed extension. But first, it will check the respawn limit. If the respawn limit was reached, or if for some reason the extension could not be spawned, then the osquery process is shut down.

Lastly, we check the state of the watcher process itself. If it is deemed unhealthy because of resource contention, then the osquery process is shut down.
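
To summarize the order of checks, here is a rough sketch of one watcher pass as a pure decision function. All names (`Watched`, `plan_for`, `watcher_pass`) are hypothetical. It also makes two simplifying assumptions that the description does not spell out: a terminated process is respawned on a later pass, and the pass stops at the first shutdown decision.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Watched:
    """Hypothetical snapshot of the worker or of one managed extension."""
    name: str
    alive: bool
    within_limits: bool = True
    respawns_left: int = 1


def plan_for(target: Watched) -> str:
    """Decide what the watcher does with one watched process."""
    if target.alive:
        return "keep" if target.within_limits else "terminate"
    # Non-existent: respawn only while the respawn limit allows it.
    return "respawn" if target.respawns_left > 0 else "shutdown"


def watcher_pass(
    worker: Watched,
    extensions: List[Watched],
    watcher_healthy: bool,
) -> List[str]:
    """Return the ordered decisions for one pass: worker, extensions, watcher."""
    plan = []
    for target in [worker, *extensions]:
        action = plan_for(target)
        plan.append(f"{target.name}: {action}")
        if action == "shutdown":
            return plan  # assumption: nothing else is checked once osquery shuts down
    if not watcher_healthy:
        plan.append("watcher: shutdown")
    return plan
```

For example, a worker over its limits plus a missing extension with respawns left yields `["worker: terminate", "ext: respawn"]` when the extension is named `ext`.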
