Deploy Loadtesting Infrastructure
Before we begin
Deployments through the GitHub Action should be preferred. For manual deployments, you will need:
- Terraform v1.10.2
- Docker
- Go
Additionally, refer to the Reference Architecture sizing recommendations for loadtest infrastructure sizing.
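The prerequisites listed above can be checked before starting. This is a minimal sketch; the helper name `missing_tools` is illustrative and not part of the repository, and it only checks that the tools are on `PATH`, not their versions.

```shell
# Print the name of each given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools needed for a manual deployment.
missing="$(missing_tools terraform docker go)"
if [ -n "$missing" ]; then
  printf 'Missing prerequisites:\n%s\n' "$missing"
else
  echo "All prerequisites found"
fi
```

If everything is found, `terraform version` shows the installed version for comparison against v1.10.2.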
Deploy with GitHub Actions (Coming Soon)
Deploy/Destroy environment with GitHub Action
- In the top right corner, select the `Run Workflow` dropdown.
- Fill out the details for the deployment.
- After all details have been filled out, click the green `Run Workflow` button, directly under the inputs. For `terraform_action`, select `Plan`, `Apply`, or `Destroy`.
  - Plan will show you the results of a dry run
  - Apply will deploy changes to the environment
  - Destroy will destroy your environment
Deploy environment manually
- Clone the repository
- Initialize terraform

      terraform init

- Create a new terraform workspace or select an existing workspace for your environment. The terraform workspace is used in different areas of Terraform to drive uniqueness and access to the environment.

      terraform workspace new <workspace_name>

  or, if your workspace already exists

      terraform workspace list
      terraform workspace select <workspace_name>

- Ensure that your new or existing workspace is in use.

      terraform workspace show

- Deploy the environment (this also triggers migrations automatically)

  Note: Terraform will prompt you for confirmation before deploying. If everything looks OK, entering `yes` will trigger the deployment.

      terraform apply -var=tag=v4.72.0

  Alternatively, you can pass additional supported terraform variables to override the default values. You can choose which ones to include. If a variable is not defined, the default value configured in ./variables.tf is used.

  Below is an example with all available variables.

      terraform apply -var=tag=v4.72.0 -var=fleet_task_count=20 -var=fleet_task_memory=4096 -var=fleet_task_cpu=512 -var=database_instance_size=db.t4g.large -var=database_instance_count=3 -var=redis_instance_size=cache.t4g.small -var=redis_instance_count=3 -var=enable_otel=true

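Long `-var` lists are easy to mistype. As an alternative sketch, the same values can be written to a variable definitions file and passed with Terraform's `-var-file` option. The file name `loadtest.tfvars` is an assumption made for this example, not something the repository provides; the values mirror the full example above.

```shell
# Write the example values to a variable definitions file.
# The file name loadtest.tfvars is illustrative only.
cat > loadtest.tfvars <<'EOF'
tag                     = "v4.72.0"
fleet_task_count        = 20
fleet_task_memory       = 4096
fleet_task_cpu          = 512
database_instance_size  = "db.t4g.large"
database_instance_count = 3
redis_instance_size     = "cache.t4g.small"
redis_instance_count    = 3
enable_otel             = true
EOF

# Then deploy with:
#   terraform apply -var-file=loadtest.tfvars
```

Any variable left out of the file still falls back to its default in ./variables.tf.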
OpenTelemetry tracing with SigNoz
By default, the loadtest environment uses Elastic APM. You can optionally use OpenTelemetry with SigNoz instead by setting enable_otel=true:

    terraform apply -var=tag=v4.72.0 -var=enable_otel=true

This deploys both Fleet and SigNoz in a single command. See ../signoz/README.md for architecture details.
Accessing the SigNoz UI
After deploying with enable_otel=true, get the SigNoz UI URL:

    $(terraform output -raw signoz_configure_kubectl) && kubectl get svc signoz -n signoz -o jsonpath='http://{.status.loadBalancer.ingress[0].hostname}:8080'

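A newly created load balancer may not have a hostname assigned yet, in which case the command above prints an incomplete URL. The following is a minimal sketch that polls until the hostname appears. It assumes kubectl has already been configured through the `signoz_configure_kubectl` output; the function name `wait_for_signoz_url` and its retry settings are illustrative.

```shell
# Poll the signoz service until its load balancer hostname is assigned,
# then print the UI URL.
# Usage: wait_for_signoz_url [attempts] [delay_seconds]
wait_for_signoz_url() {
  attempts="${1:-30}"
  delay="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # An empty result means the hostname has not been assigned yet.
    host="$(kubectl get svc signoz -n signoz \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null || true)"
    if [ -n "$host" ]; then
      printf 'http://%s:8080\n' "$host"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "SigNoz load balancer hostname is not available yet" >&2
  return 1
}
```

Calling `wait_for_signoz_url` with no arguments tries 30 times with 10 seconds between attempts.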
Destroy environment manually
- Clone the repository (if not already cloned)
- Initialize terraform

      terraform init

- Select your workspace

      terraform workspace list
      terraform workspace select <workspace_name>

- Destroy the environment

      terraform destroy

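Because `terraform destroy` acts on whichever workspace is currently selected, it can help to verify the active workspace first. A minimal sketch of such a guard follows; the function name `require_workspace` and the workspace name `my-loadtest` are hypothetical.

```shell
# Refuse to continue unless the active workspace matches the expected name.
# Usage: require_workspace <workspace_name>
require_workspace() {
  expected="$1"
  current="$(terraform workspace show)"
  if [ "$current" != "$expected" ]; then
    echo "Active workspace is '$current', expected '$expected'" >&2
    return 1
  fi
}

# Example (commented out so that nothing is destroyed by accident):
#   require_workspace my-loadtest && terraform destroy
```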
Delete the workspace
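Once all resources have been removed from the terraform workspace, delete the workspace. Terraform will not delete the workspace that is currently selected, so switch to another one (for example, default) first.

    terraform workspace select default
    terraform workspace delete <workspace_name>
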
Requirements
| Name | Version |
|---|---|
| aws | >= 5.68.0 |
| docker | ~> 2.16.0 |
| git | ~> 0.1.0 |
Providers
| Name | Version |
|---|---|
| aws | 6.14.1 |
| docker | 2.16.0 |
| git | 0.1.0 |
| random | 3.7.2 |
| terraform | n/a |
| tls | 4.1.0 |
Modules
| Name | Source | Version |
|---|---|---|
| acm | terraform-aws-modules/acm/aws | 4.3.1 |
| loadtest | github.com/fleetdm/fleet-terraform//byo-vpc | tf-mod-root-v1.18.3 |
| logging_alb | github.com/fleetdm/fleet-terraform//addons/logging-alb | tf-mod-addon-logging-alb-v1.6.1 |
| logging_firehose | github.com/fleetdm/fleet-terraform//addons/logging-destination-firehose | tf-mod-addon-logging-destination-firehose-v1.2.4 |
| mdm | github.com/fleetdm/fleet-terraform/addons/mdm?depth=1&ref=tf-mod-addon-mdm-v2.0.0 | n/a |
| migrations | github.com/fleetdm/fleet-terraform//addons/migrations | tf-mod-addon-migrations-v2.1.0 |
| osquery-carve | github.com/fleetdm/fleet-terraform//addons/osquery-carve | tf-mod-addon-osquery-carve-v1.1.1 |
| ses | github.com/fleetdm/fleet-terraform//addons/ses | tf-mod-addon-ses-v1.4.0 |
| vuln-processing | github.com/fleetdm/fleet-terraform//addons/external-vuln-scans | tf-mod-addon-external-vuln-scans-v2.3.0 |
Resources
Inputs
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| database_instance_count | The number of Aurora database instances | number | 2 | no |
| database_instance_size | The instance size for Aurora database instances | string | "db.t4g.medium" | no |
| fleet_task_count | The total number (max) that ECS can scale Fleet containers up to | number | 5 | no |
| fleet_task_cpu | The CPU configuration for Fleet containers | number | 512 | no |
| fleet_task_memory | The memory configuration for Fleet containers | number | 4096 | no |
| redis_instance_count | The number of Elasticache nodes | number | 3 | no |
| redis_instance_size | The instance size for Elasticache nodes | string | "cache.t4g.micro" | no |
| tag | The tag to deploy. This would be the same as the branch name | string | "v4.72.0" | no |
Outputs
| Name | Description |
|---|---|
| ecs_arn | n/a |
| ecs_cluster | n/a |
| ecs_execution_arn | n/a |
| enroll_secret_arn | n/a |
| internal_alb_dns_name | n/a |
| kms_key_id | n/a |
| logging_config | n/a |
| security_groups | n/a |
| server_url | n/a |
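The outputs above can be read after a deployment with `terraform output`. The sketch below prints selected outputs as `name=value` lines. The helper name `print_outputs` is illustrative, and `-raw` only works for string, number, and bool outputs, so whether a given output qualifies is an assumption; `terraform output -json` covers structured values.

```shell
# Print each named output as name=value.
# Assumes the named outputs are plain strings, numbers, or bools.
print_outputs() {
  for name in "$@"; do
    printf '%s=%s\n' "$name" "$(terraform output -raw "$name")"
  done
}

# Example:
#   print_outputs server_url
```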