Commit graph

14 commits

Author SHA1 Message Date
Robert Fairburn
dac2ef18f0
Ensure terraform docker compatibility with github actions (#39988)
Co-authored-by: Jorge Falcon <22119513+BCTBB@users.noreply.github.com>
2026-02-17 15:09:50 -05:00
Robert Fairburn
9f60dadae0
Allow gzip responses (#39700)
2026-02-12 10:24:49 -06:00
Jorge Falcon
502351dcde
Add FLEET_MYSQL_READ_REPLICA_TLS_CONFIG environment variable to dogfood and loadtesting (#39692)
- Adds `FLEET_MYSQL_READ_REPLICA_TLS_CONFIG = "custom"` to dogfood and
loadtesting environments.
2026-02-11 13:05:11 -05:00
Jorge Falcon
7ac24d8752
Loadtest (new) - MDM Updates (#37420)
- Adds `FLEET_DEV_MDM_APPLE_DISABLE_PUSH = 1`
- Adds `FLEET_DEV_MDM_APPLE_DISABLE_DEVICE_INFO_CERT_VERIFY = 1`
- Updates osquery_perf/README.md with an example of fetching and
using the MDM SCEP challenge secret.
2025-12-17 17:55:13 -05:00
Jorge Falcon
fb5c90ad9c
Dogfood and loadtest module updates (#35990)
Updates `module.main` to `1.18.3` (dogfood)
- Adds `memory_tracking_target_value = 70` (dogfood and loadtesting)
- Adds `cpu_tracking_target_value = 70`  (dogfood and loadtesting)

Updates `module.migrations` to `2.2.1` (dogfood)
- Adds `max_capacity`

Updates `module.logging_alb` to `1.6.2` (dogfood)

Updates `module.monitoring` to `1.8.0` (dogfood)
- Adds `log_monitoring` configuration
2025-11-19 22:04:17 -05:00
George Karr
ca5d02d471
Adding changes for Fleet v4.76.1 (#35760)
2025-11-18 14:35:31 -06:00
Jorge Falcon
776cd67647
Loadtest - Firehose logging removal, adds filesystem logging, and module updates (#35735)
- Removes `firehose` logging from loadtesting environment
- Sets `filesystem` logging in loadtesting environment
- Updates fleet image to 4.76.0 as the default value
- Updates `migrations` and `logging_alb` modules with latest versions
2025-11-13 19:16:00 -05:00
Jorge Falcon
e2085bfd86
Loadtesting documentation - Removes (Coming Soon) from README (#35649)
- Removes `(Coming Soon)` from
`infrastructure/loadtesting/infra/README.md` with regard to deployment
via GitHub Actions
- Moves SigNoz steps to `.header.md` to preserve the steps in the generated
`README.md`
2025-11-12 16:54:14 -05:00
Victor Lyuboslavsky
73501e5755
Infra changes after latest loadtest (#35083)
**Related issue:** Resolves #34500

Terraform changes after my latest loadtest.

VPC consolidation: updated (and deployed) the shared VPC so that the SigNoz
backend can now use it

- Removed the eks-vpc/ directory
- Moved VPC management to shared/vpc.tf
- Updated shared/init.tf to reflect the VPC changes

Infra improvements

- infra/internal_alb.tf: changed the suffix from -internal to -int because
the name hit the 32-character maximum

OTEL

- OTEL Collector configuration overrides for production stability
2025-11-03 11:02:15 -06:00
Jorge Falcon
6ea9185c1c
Loadtesting - osquery_perf docker image build fixes (#34901)
- Bumps docker provider from 2.16.0 to 3.6.x
- Moves builds from `docker_registry_image` to new `docker_image`
resource
2025-10-29 08:33:46 -04:00
Robert Fairburn
30c4798ec6
Switch git providers for loadtesting tf (#34180)
Untested end-to-end, but works as a replacement for plans and doesn't
require a local arm64 build.

Co-authored-by: Jorge Falcon <22119513+BCTBB@users.noreply.github.com>
2025-10-23 14:53:13 -04:00
Victor Lyuboslavsky
e4e3c3f9ff
Fix issues with OTEL SigNoz deployments for loadtests (#34694)
SigNoz is converted from a child module to a standalone root module with
independent state.

**Critical Impact**

Deployment order is now required:
1. Deploy infrastructure/loadtesting/terraform/signoz/ FIRST
2. Then deploy infrastructure/loadtesting/terraform/infra/

The modules communicate via Terraform remote state.
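
The required ordering can be sketched as a small Python helper. This is only an illustration, not part of the commit: the two directory paths are quoted from the message above, while `plan_commands`, `deploy`, the `workspace` argument, and `extra_args` are hypothetical names. It assumes `terraform init` has already been run in both directories and relies on Terraform's global `-chdir` option.

```python
import subprocess

# Paths are quoted from the commit message; SigNoz must be applied first
# because the infra module reads its outputs through remote state.
DEPLOY_ORDER = [
    "infrastructure/loadtesting/terraform/signoz",
    "infrastructure/loadtesting/terraform/infra",
]


def plan_commands(workspace, extra_args=()):
    """Return the terraform commands in the order they should be run."""
    commands = []
    for directory in DEPLOY_ORDER:
        chdir = f"-chdir={directory}"
        # Select the workspace, then apply, one root module at a time.
        commands.append(["terraform", chdir, "workspace", "select", workspace])
        commands.append(["terraform", chdir, "apply", *extra_args])
    return commands


def deploy(workspace, extra_args=(), dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for command in plan_commands(workspace, extra_args):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Calling `deploy("my-workspace")` with the default `dry_run=True` only prints the four commands, which makes it easy to confirm that the SigNoz directory comes first before anything is applied.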

**Key Configuration Changes**

- SigNoz creates its own EKS cluster: signoz-${workspace}
- Instance type: t3.xlarge (upgraded from t3.large for resource headroom)
- ClickHouse disk: 200Gi (was 20Gi) with 2-day retention
- Resource limits configured to prevent OOMKills during loadtest
- wait_for_jobs = false to avoid Helm deployment deadlock


**Related issue:** Resolves #32331
2025-10-23 12:49:36 -05:00
Victor Lyuboslavsky
aef9b8400c
Added terraform files for Signoz OTEL backend. (#34058)
**Related issue:** Resolves #32331 

This PR allows us to run a loadtest with the SigNoz OTEL backend by adding
`-var=enable_otel=true`.
SigNoz is deployed via a Helm chart.

Enhancements needed (in future PR):
- put SigNoz UI behind VPN
- combine the new eks-vpc with shared fleet-vpc
- make SigNoz shared, so multiple loadtests use the same instance? (But
what about updating it to the latest version?)

Next steps:
- Enable SigNoz in Dogfood environment
- SigNoz by default [keeps 15 days of logs and
traces](https://signoz.io/docs/userguide/retention-period), which is
quite a bit. How much would that cost us and should we reduce it?


## Summary by CodeRabbit

- New Features
  - Optional OpenTelemetry tracing with SigNoz via a new enable_otel flag.
  - Conditional deployment of a SigNoz stack (managed EKS, storage,
    Helm-based apps) with internal OTLP collector endpoint.
  - New outputs to retrieve OTLP endpoint, cluster name, and a kubectl
    configuration command.

- Documentation
  - Added guidance for deploying and using SigNoz with load testing.
  - Updated examples to include -var=enable_otel=true.

- Chores
  - Introduced required providers to support Helm and Kubernetes
    resources.

2025-10-10 21:53:04 -05:00
Jorge Falcon
e952ef06c0
Loadtesting IAC updates (#32629)
# GitHub Actions (New)
- New workflow to deploy/destroy loadtest infrastructure with one-click
(Needs to be tested)
- Common inputs drive configuration and deployment of loadtest
infrastructure
    - tag
    - fleet_task_count
    - fleet_task_memory
    - fleet_task_cpu
    - fleet_database_instance_size
    - fleet_database_instance_count
    - fleet_redis_instance_size
    - fleet_redis_instance_count
    - terraform_workspace
    - terraform_action
- New workflow to deploy/destroy osquery-perf to loadtest infrastructure
with one-click (Needs to be tested)
- Common inputs drive configuration and deployment of osquery-perf
resources
    - tag
    - git_branch
    - loadtest_containers
    - extra_flags
    - terraform_workspace
    - terraform_action
- New workflow to deploy shared loadtest resources with one-click (Needs
to be tested)
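
As a rough illustration of how the common inputs listed above could drive a deployment, the sketch below turns a set of workflow inputs into Terraform commands. The input names come from the commit message; everything else is an assumption made for the example: the `build_commands` helper is hypothetical, the inputs are assumed to map one-to-one onto Terraform variables of the same name, and `terraform_action` is assumed to be either `apply` or `destroy`. The actual workflow may wire these differently.

```python
def build_commands(inputs):
    """Translate workflow inputs into terraform commands (illustrative only)."""
    remaining = dict(inputs)  # copy so the caller's mapping is not modified
    workspace = remaining.pop("terraform_workspace")
    action = remaining.pop("terraform_action")
    if action not in ("apply", "destroy"):
        raise ValueError(f"unsupported terraform_action: {action!r}")

    # Assumption: every remaining input is passed through as -var=name=value.
    var_args = [f"-var={name}={value}" for name, value in remaining.items()]
    return [
        ["terraform", "workspace", "select", workspace],
        ["terraform", action, *var_args],
    ]


# Example values are made up for the sketch.
example = build_commands({
    "tag": "v4.76.0",
    "fleet_task_count": "5",
    "terraform_workspace": "demo",
    "terraform_action": "apply",
})
for command in example:
    print(" ".join(command))
```

Keeping the workspace and action separate from the pass-through variables mirrors the split in the input list, where `terraform_workspace` and `terraform_action` control the run rather than the infrastructure itself.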

# Loadtest Infrastructure (New)
- New directory (`infrastructure/loadtesting/terraform/infra`) for
one-click deployment
- Loadtest environment updated to use [fleet-terraform
modules](https://github.com/fleetdm/fleet-terraform)
- [Deployment documentation
updated](0c254bca40/infrastructure/loadtesting/terraform/infra/README.md)
to reflect new steps

# Osquery-perf deployment (New)
- New directory (`infrastructure/loadtesting/terraform/osquery-perf`)
for the deployment of osquery-perf
- osquery-perf updated to use [fleet-terraform
modules](https://github.com/fleetdm/fleet-terraform)
- [Deployment documentation
updated](0c254bca40/infrastructure/loadtesting/terraform/osquery_perf)
to reflect new steps
2025-10-08 15:31:37 -04:00