Software library for osquery-perf
This directory contains the software database and tools used by osquery-perf for load testing.
Quick start
Initial setup
- Create the database:
sqlite3 software.db < software.sql
- Verify the database (optional):
sqlite3 software.db "SELECT COUNT(*) FROM software;"
sqlite3 software.db "SELECT source, COUNT(*) FROM software GROUP BY source;"
# Shows distribution across sources
Running osquery-perf
Once the database exists, osquery-perf will automatically use it:
cd ../..
./osquery-perf --host-count 1000
Each simulated host will get random platform-specific software from the database.
Directory structure
software-library/
├── README.md          # This file
├── software.db        # SQLite database (created from software.sql)
├── software.sql       # SQL dump with schema + data (source of truth)
├── tools/             # Import and maintenance tools
│   ├── import-data/   # Import server data from CSV
│   └── generate-sql/  # Generate software.sql from database
└── source-data/       # Source CSV files (all gitignored)
    └── .gitignore
Tools
import-data
Imports software data from CSV files, validates entries, and optionally filters out internal/proprietary software.
Usage:
cd tools/import-data
# Import CSV file (no filtering)
go run . --input ../../source-data/server_export.csv
# Import with pattern filtering
go run . --input ../../source-data/server_export.csv --filter "numa-internal,numa-,corp-"
# Import with vendor filtering
go run . --input ../../source-data/server_export.csv --filter-vendor "numa"
# Dry run (validate without importing)
go run . --input ../../source-data/server_export.csv --dry-run
# Verbose output
go run . --input ../../source-data/server_export.csv --verbose
What it does:
- Reads software entries from CSV files
- Optional filtering (disabled by default):
  - --filter: Filters out names containing any of the specified patterns (comma-separated)
  - --filter-vendor: Filters out software from the specified vendor (except well-known public software)
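The two filters can be sketched as follows. This is one plausible reading of the flag descriptions, not the tool's implementation: case-insensitive substring matching is an assumption, and the `wellKnown` allowlist (with its single made-up entry) is a hypothetical stand-in for whatever the tool treats as well-known public software.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesPattern reports whether name contains any of the comma-separated
// patterns. Case-insensitive matching is an assumption of this sketch.
func matchesPattern(name, patterns string) bool {
	lower := strings.ToLower(name)
	for _, p := range strings.Split(patterns, ",") {
		p = strings.ToLower(strings.TrimSpace(p))
		if p != "" && strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

// wellKnown is a hypothetical allowlist; the entry is invented for the example.
var wellKnown = map[string]bool{"numa public viewer": true}

// matchesVendor reports whether an entry should be dropped because it comes
// from the filtered vendor and is not on the allowlist.
func matchesVendor(name, vendor, filterVendor string) bool {
	if filterVendor == "" {
		return false // vendor filtering is disabled by default
	}
	if !strings.Contains(strings.ToLower(vendor), strings.ToLower(filterVendor)) {
		return false
	}
	return !wellKnown[strings.ToLower(name)]
}

func main() {
	fmt.Println(matchesPattern("numa-internal-agent", "numa-internal,numa-,corp-")) // true
	fmt.Println(matchesPattern("Google Chrome", "numa-internal,numa-,corp-"))       // false
	fmt.Println(matchesVendor("Numa Sync", "Numa Inc.", "numa"))                    // true
	fmt.Println(matchesVendor("Numa Public Viewer", "Numa Inc.", "numa"))           // false
}
```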
generate-sql
Generates the software.sql file from the populated database.
Usage:
cd tools/generate-sql
# Generate software.sql
go run .
# Specify custom paths
go run . --db ../../software.db --output ../../software.sql
# Verbose output (shows progress)
go run . --verbose
What it does:
- Reads all data from software.db
- Generates SQL INSERT statements
- Includes schema definition
- Creates reproducible SQL dump
Database setup workflow
Here's the typical workflow:
Step 1: Initialize database from software.sql
sqlite3 software.db < software.sql
This creates the database with schema and initial data.
Step 2: Export server data
Export software from Fleet's MySQL database to CSV:
mysql -h <host> -u <user> -p <database> --batch --raw -e "
SELECT
'name', 'version', 'source', 'bundle_identifier', 'vendor', 'arch', 'release', 'extension_id', 'extension_for', 'application_id', 'upgrade_code'
UNION ALL
SELECT
IFNULL(name, ''),
IFNULL(version, ''),
IFNULL(source, ''),
IFNULL(bundle_identifier, ''),
IFNULL(vendor, ''),
IFNULL(arch, ''),
IFNULL(\`release\`, ''),
IFNULL(extension_id, ''),
IFNULL(extension_for, ''),
IFNULL(application_id, ''),
IFNULL(upgrade_code, '')
FROM software
" 2>&1 | sed 's/\t/","/g' | sed 's/^/"/' | sed 's/$/"/' | tail -n +2 > source-data/server_export.csv
Note: This command properly quotes CSV fields to handle commas in values (e.g., "Red Hat, Inc."). The tail -n +2 removes the MySQL password warning message while preserving the header row.
This creates a CSV with the following columns:
- name, version, source - Required fields
- bundle_identifier - macOS bundle ID
- vendor - Software vendor
- arch - Architecture (x86_64, arm64, etc.)
- release - Release info
- extension_id - Browser/IDE extension ID
- extension_for - Host software for extensions (Chrome, Firefox, VS Code, etc.)
- application_id - Android application ID
- upgrade_code - Windows upgrade GUID
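Before importing, an export can be checked against this column list. The sketch below is an assumption about what a reasonable check looks like, not the import-data tool's validation logic: it reads CSV with a header row and reports data rows where a required field is empty.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// required lists the columns documented as required.
var required = []string{"name", "version", "source"}

// validate parses CSV text with a header row and returns the 1-based numbers
// of data rows that have an empty required field.
func validate(data string) ([]int, error) {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("empty input: no header row")
	}
	// Map each header name to its column index.
	col := make(map[string]int)
	for i, h := range rows[0] {
		col[h] = i
	}
	for _, name := range required {
		if _, ok := col[name]; !ok {
			return nil, fmt.Errorf("missing required column %q", name)
		}
	}
	var bad []int
	for n, row := range rows[1:] {
		for _, name := range required {
			if strings.TrimSpace(row[col[name]]) == "" {
				bad = append(bad, n+1)
				break
			}
		}
	}
	return bad, nil
}

func main() {
	// Made-up sample: the second data row has no version.
	data := "name,version,source,vendor\n" +
		"\"Example App\",1.2.3,apps,\"Red Hat, Inc.\"\n" +
		"\"No Version\",,programs,\n"
	bad, err := validate(data)
	fmt.Println(bad, err) // [2] <nil>
}
```

Looking columns up by header name rather than by position keeps the check valid if the export's column order changes.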
Optional filtering:
- Add a WHERE clause to filter by date, fleet, or other criteria
- Example: WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
Step 3: Import server data
cd tools/import-data
# Import with filtering for internal software
go run . --input ../../source-data/server_export.csv \
--filter "numa-internal,numa-,corp-,internal-" \
--filter-vendor "numa" \
--verbose
This imports and validates server data, optionally filtering out internal software.
Step 4: Generate software.sql
cd ../generate-sql
# Generate SQL dump
go run . --verbose
This creates software.sql that can recreate the entire database.
Step 5: Verify
# Check counts by source
sqlite3 software.db "
SELECT
source,
COUNT(*) as count
FROM software
GROUP BY source
ORDER BY count DESC
"