For Fleet Premium users, each CVE includes its Common Vulnerability Scoring System (CVSS) base score (reported by the [National Vulnerability Database](https://nvd.nist.gov/)), probability of exploit (reported by [FIRST](https://www.first.org/epss/)), and whether or not there is a known exploit in the wild (reported by the [Cybersecurity & Infrastructure Security Agency](https://www.cisa.gov/known-exploited-vulnerabilities-catalog)).
Fleet's strategy for detecting vulnerabilities (CVEs) varies according to the host's platform and
- Fleet uses the [National Vulnerability Database CPE Dictionary](https://nvd.nist.gov/products/cpe) to get CPE information (this information maps software names/titles to CPEs)
- Fleet combines two sources to get accurate and up-to-date CVE information:
- To reduce the load and complexity of processing these datasets, Fleet uses two GitHub repositories (https://github.com/fleetdm/nvd and https://github.com/fleetdm/vulnerabilities) that fetch and pre-process the data, then publish the resulting datasets as GitHub releases.
- Fleet servers then download these GitHub releases and run vulnerability processing using the downloaded datasets and the software information fetched from hosts.
```mermaid
sequenceDiagram
participant fleet_server as Fleet server
participant vulnerabilities_repo as github.com/fleetdm/vulnerabilities
participant nvd_repo as github.com/fleetdm/nvd
participant nvd_site as National Vulnerability Database
participant vulncheck as VulnCheck
alt GitHub Action every 24h
nvd_repo->>nvd_site: Download CPE dictionary
nvd_site-->>nvd_repo: ;
note over nvd_repo: Generate and<br>release cpe.sqlite
end
alt GitHub Action every 30m
vulnerabilities_repo->>nvd_site: Download CVE feed<br>using API 2.0
end
```
Vulnerability processing is performed in one Fleet instance. If your Fleet deployment uses multiple
instances, only one will be doing the work.
To perform vulnerability processing, Fleet downloads the following files:
1. A preprocessed CPE database generated by FleetDM to speed up the translation process: <https://github.com/fleetdm/nvd/releases>
2. The historical data for all CVEs and how to match each one to a CPE: <https://nvd.nist.gov/vuln/data-feeds>
The database generated in step 1 is processed from the original official CPE dictionary
<https://nvd.nist.gov/products/cpe>. This CPE dictionary is typically updated once a day.
The matching occurs server-side to make the processing as fast as possible, but the whole process is both CPU and memory intensive.
For example, when running a development instance of Fleet on an Apple MacBook Pro with 16 cores, matching 200,000 CPEs against the CVE database takes around 10 seconds and consumes about 3 GB of RAM.
CPU and memory usage comes in bursts once every hour (or at the configured periodicity) on the
instance that does the processing. RAM spikes are not expected to exceed 2 GB.
The vulnerability detection process involves several steps. This section covers what they are and how they work.
Some parts of the process are more error-prone than others. Each OS, application developer, and maintainer can (and does) define each part of an app in its own way. Some Linux distributions are very strict, but each distribution handles things differently.
The whole pipeline exists to compensate for these differences, and it can be divided into two sections:
1. Collection:
```mermaid
graph TD;
host1[Host1 sends software list]-->normalize[Normalization of names, versions, etc]
host2[Host2 sends software list]-->normalize
host3[Host3 sends software list]-->normalize
normalize-->store[Storage for later processing]
```
2. Processing
Processing happens in a loop and varies by platform: Windows and macOS hosts are processed
first, then Linux hosts. The default interval is 1 hour.
### General process
```mermaid
graph TD;
interval{Once an hour}-->normalize[Normalized software list]
normalize-->process1[Process Windows/macOS hosts]
```
This is the first step toward normalizing data across platforms: we try to collect the same data for every type of software we detect vulnerabilities on.
Ingestion can be resource-hungry, both on the hosts and on the Fleet server. A lot of work has gone into reducing the resources needed, and that work is ongoing.
### Translating to CPE
With a somewhat normalized list of software, in order to search CVEs for it, we need to derive a [CPE](https://en.wikipedia.org/wiki/Common_Platform_Enumeration) from the vendor, name, version, and OS.
As described briefly above, we do this by translating the NVD database of CPEs into a [sqlite database that helps Fleet do the lookup of CPEs very quickly](https://github.com/fleetdm/nvd).
NOTE: Software that was ingested with an empty `version` field will be ignored by the NVD vulnerability processing.
#### How accurate is this translation process?
This is the most error-prone part of the process.
A CPE can be vague: any of its parts can be `*`, a wildcard that matches every value of that part when the CPE is compared against a CVE.
If the CPE is too vague (the extreme case being all parts set to `*`), every CVE will match, producing false positives. You want a CPE that is very specific, but not so specific that a small error prevents it from matching a CVE (a false negative).
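The effect of `*` can be illustrated with a minimal Python sketch. It is not Fleet's implementation: the helper names `parse_cpe` and `cpe_matches`, and the `example_vendor`/`example_app` values, are hypothetical, and the sketch ignores escaping, the `-` (not applicable) value, and version ranges that real CPE matching has to handle.

```python
# Simplified, illustrative CPE 2.3 comparison (not Fleet's actual matcher).
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its 11 attributes."""
    prefix = "cpe:2.3:"
    if not cpe.startswith(prefix):
        raise ValueError(f"not a CPE 2.3 string: {cpe!r}")
    values = cpe[len(prefix):].split(":")
    if len(values) != len(CPE_FIELDS):
        raise ValueError(f"expected {len(CPE_FIELDS)} attributes in {cpe!r}")
    return dict(zip(CPE_FIELDS, values))


def cpe_matches(software_cpe: str, cve_cpe: str) -> bool:
    """Return True when every attribute is equal or is `*` on either side."""
    a, b = parse_cpe(software_cpe), parse_cpe(cve_cpe)
    return all(
        a[f] == "*" or b[f] == "*" or a[f] == b[f]
        for f in CPE_FIELDS
    )


# Hypothetical CPEs used only for illustration.
specific = "cpe:2.3:a:example_vendor:example_app:3.2a:*:*:*:*:*:*:*"
too_specific = "cpe:2.3:a:example_vendor:example_app:3.2:*:*:*:*:*:*:*"
too_vague = "cpe:2.3:" + ":".join(["*"] * len(CPE_FIELDS))
cve_entry = "cpe:2.3:a:example_vendor:example_app:3.2a:*:*:*:*:*:*:*"
unrelated = "cpe:2.3:a:other_vendor:other_app:1.0:*:*:*:*:*:*:*"
```

In this sketch, `specific` matches `cve_entry` only, `too_vague` matches both `cve_entry` and `unrelated` (a false positive), and `too_specific` misses `cve_entry` because of a one-character version difference (a false negative).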
Let's look into some examples of this stage.
##### Example: tmux
tmux is a Unix terminal utility to multiplex ttys. It is listed like this on macOS:
```text
osquery> SELECT * FROM homebrew_packages WHERE name='tmux';
+------+----------------------------+---------+
| name | path                       | version |
+------+----------------------------+---------+
| tmux | /opt/homebrew/Cellar/tmux/ | 3.2a    |
+------+----------------------------+---------+
```
If we look at the [official releases](https://github.com/tmux/tmux/releases/tag/3.2a), the version we get is the same as the one listed. This means it'll be easy to map it to a CPE that accurately represents the software.
Now let's look at Chrome on macOS:
```text
osquery> select name, bundle_version from apps where name like '%Chrome%';
+-------------------+----------------+
| name              | bundle_version |
+-------------------+----------------+
| Google Chrome.app | 4758.102       |
+-------------------+----------------+
```
Now things get a little trickier. We have to remove the `.app` suffix from the name, then take the first word as the vendor and the second as the app name. We could use `bundle_name` for the app name, but nothing stops the app developer from adding the vendor to `bundle_name`, so similar parsing would still have to happen.
These are two illustrative examples. The reality is that there is no map or list of all available software and how it's presented on each platform, so the "software to CPE" translation process will keep evolving.
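The name parsing described for the Chrome example can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Fleet's actual logic: the function name `split_app_name` is hypothetical, and lowercasing the result and returning an empty vendor for single-word names are choices made only for this sketch.

```python
def split_app_name(app_name: str) -> tuple:
    """Guess (vendor, product) from a macOS app name such as 'Google Chrome.app'."""
    name = app_name.strip()
    # Remove the `.app` suffix reported by the osquery `apps` table.
    if name.endswith(".app"):
        name = name[: -len(".app")]
    # First word is taken as the vendor, the remainder as the product.
    vendor, _, product = name.partition(" ")
    if not product:
        # Assumption: a single-word name carries no vendor hint.
        return "", vendor.lower()
    return vendor.lower(), product.lower().replace(" ", "_")
```

For example, `split_app_name("Google Chrome.app")` returns `("google", "chrome")`. A heuristic this simple breaks easily, for instance for apps whose name does not begin with the vendor, which is why the translation needs ongoing correction.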
#### Improving accuracy
To improve the accuracy of matching software to CPEs, CPE translation rules are added for known cases where matching fails.
`server/vulnerabilities/cpe_translations.json` contains these rules and is included in the [NVD release](https://github.com/fleetdm/nvd/releases/latest).
##### Example: `ruby@2.7` installed via `homebrew`
The following CPE translation rule is used to reduce false positives when ruby is installed via homebrew.
This is needed because ruby is commonly included in the title in the CPE database.
This rule matches the software name `ruby` matching a regular expression pattern and installed on
When searching for [CPEs](https://en.wikipedia.org/wiki/Common_Platform_Enumeration), the specified `product` and `vendor` will be added to the filter criteria.

| Name | Type | Description |
| --- | --- | --- |
| `software` | array[CPE Translation Software] | The CPE translation software match criteria. |
| `translation` | array[CPE Translation] | The CPE translation. |
##### CPE Translation Software (object)
The CPE translation software match criteria. Used to match software collected from hosts. Fields are AND'd together. Values inside each field are OR'd together.

| Name | Type | Description |
| --- | --- | --- |
| `name` | array[string] | The software name to match. Enclose within `/` to specify a regular expression pattern. |
| `bundle_identifier` | array[string] | The software bundle identifier (macOS apps only) to match. Enclose within `/` to specify a regular expression pattern. |
| `source` | array[string] | The software source to match. Enclose within `/` to specify a regular expression pattern. |
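
The AND/OR semantics of these match criteria can be sketched in Python. This is an illustration, not Fleet's code: `value_matches` and `software_matches` are hypothetical helper names, the sample rule is made up for this example, and treating an empty field as "no constraint" and using unanchored regular expression search are assumptions.

```python
import re


def value_matches(pattern: str, value: str) -> bool:
    """Match one criterion value: `/.../` is a regular expression, anything else is literal."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        # Assumption: the pattern may match anywhere unless it is anchored.
        return re.search(pattern[1:-1], value) is not None
    return pattern == value


def software_matches(criteria: dict, software: dict) -> bool:
    """Fields are AND'd together; values inside each field are OR'd together."""
    for field, patterns in criteria.items():
        if not patterns:
            continue  # Assumption: an empty field imposes no constraint.
        value = software.get(field, "")
        if not any(value_matches(p, value) for p in patterns):
            return False
    return True


# Hypothetical criteria, written only to exercise the matching rules.
criteria = {"name": ["/^ruby(@.*)?$/"], "source": ["homebrew_packages"]}
```

With these criteria, `{"name": "ruby@2.7", "source": "homebrew_packages"}` matches, while the same name with a different `source`, or a name such as `rubygems`, does not.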
False positive entries are removed during vulnerability processing if more than twice the [configured periodicity](https://fleetdm.com/docs/configuration/fleet-server-configuration#periodicity) has elapsed since the entry was last updated.
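
As a rough illustration of that removal rule, the staleness check can be sketched in Python. The function name `is_stale_false_positive` and its parameters are hypothetical; only the "greater than 2x the configured periodicity" condition comes from the text above.

```python
from datetime import datetime, timedelta


def is_stale_false_positive(updated_at: datetime, now: datetime,
                            periodicity: timedelta) -> bool:
    """Return True when the entry was last updated more than 2x the periodicity ago."""
    return now - updated_at > 2 * periodicity


# With a 1 hour periodicity, entries untouched for more than 2 hours are removed.
periodicity = timedelta(hours=1)
now = datetime(2024, 1, 1, 12, 0)
```

Under these assumptions, an entry last updated at 09:30 would be removed at 12:00, while one updated at 10:30 would be kept.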