---
id: observability-otel
title: Observability Overview
sidebar_label: Overview
---
ToolJet supports OpenTelemetry (OTEL) for comprehensive observability, enabling you to monitor application performance, track query executions, and analyze the health of ToolJet services and infrastructure through metrics. OpenTelemetry provides a standardized, vendor-agnostic way to collect observability data, allowing ToolJet to integrate with any OTEL-compatible third-party monitoring tool such as [Datadog](/docs/tj-setup/observability/datadog), [New Relic](/docs/tj-setup/observability/new-relic), and [Grafana](/docs/tj-setup/observability/grafana).
**Categories of Metrics**
1. **App-Based Metrics** - Monitor the performance and reliability of individual ToolJet applications. These metrics include detailed labels such as `app_name`, `query_name`, `environment`, `query_text`, and `query_mode` (SQL/GUI) for fine-grained analysis.
   - **Query Executions:** Track total query executions per application
   - **Query Duration:** Measure query execution times with histogram buckets
   - **Query Failures:** Monitor failed queries with error categorization
   - **Success Rates:** Application-level success rate percentages
   - **App Usage:** Track application access and interaction events
2. **Platform-Based Metrics** - Monitor the overall health and performance of your ToolJet instance:
   - **HTTP Server Metrics:** Request rates, response times, status codes
   - **API Performance:** Endpoint-specific latency and throughput
   - **Database Operations:** Query execution times and connection health
   - **Node.js Runtime:** Event loop delays, garbage collection, memory usage
   - **V8 Memory:** Heap usage and external memory tracking
## Configuration
Enable OpenTelemetry by setting the following environment variables in your ToolJet deployment:
#### Required Variables
```bash
# Enable OpenTelemetry metrics collection
ENABLE_OTEL=true
# OTLP Endpoint Configuration
OTEL_EXPORTER_OTLP_TRACES=http://localhost:4318/v1/traces
OTEL_EXPORTER_OTLP_METRICS=http://localhost:4318/v1/metrics
# Service Identification
OTEL_SERVICE_NAME=tooljet
```
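Before starting ToolJet, it can help to confirm that the required variables are present in the environment the server will run in. The following is a minimal sketch of such a preflight check. The function name `check_otel_env` is hypothetical and not part of ToolJet; only the variable names come from the list above.

```bash
# Hypothetical preflight helper (not part of ToolJet): reports which of the
# required variables listed above are unset or empty.
check_otel_env() {
  status=0
  if [ "${ENABLE_OTEL:-}" != "true" ]; then
    echo "ENABLE_OTEL must be set to true" >&2
    status=1
  fi
  for name in OTEL_EXPORTER_OTLP_TRACES OTEL_EXPORTER_OTLP_METRICS OTEL_SERVICE_NAME; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage: check_otel_env && echo "OTEL variables look complete"
```

The function returns a non-zero status when anything is missing, so it can gate a startup script.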
#### Optional Variables
```bash
# Authentication (if required by your OTEL collector)
OTEL_EXPORTER_OTLP_HEADERS=
# Advanced Configuration
# Enable debug logging for OTEL
OTEL_LOG_LEVEL=debug
# Activity window for concurrent user tracking (default: 5)
OTEL_ACTIVE_USER_WINDOW_MINUTES=5
# Maximum tracked users/sessions (default: 10000)
OTEL_MAX_TRACKED_USERS=10000
# WARNING: high cardinality. Enable only for debugging.
# Includes the actual query text in metrics (default: false).
# If enabled in production, use an OTEL Collector to drop the label.
OTEL_INCLUDE_QUERY_TEXT=false
```
For a complete list of OpenTelemetry environment variables, refer to the [OpenTelemetry documentation](https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/).
## Setup Local OTEL Collector
Some observability platforms require an OpenTelemetry Collector as an intermediary. In such cases, you can deploy the Collector alongside ToolJet. ToolJet sends telemetry data to the Collector, which then forwards it to your observability platform based on its configuration.
This Docker Compose configuration can be used to deploy an OpenTelemetry Collector alongside ToolJet.
```yaml
# docker-compose.yml excerpt
otel-collector:
  image: otel/opentelemetry-collector-contrib:latest
  command: ["--config=/etc/otel-collector-config.yaml"]
  volumes:
    - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
  ports:
    - "4318:4318" # OTLP HTTP receiver
    - "8889:8889" # Prometheus exporter
```
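The Compose excerpt mounts an `otel-collector-config.yaml` file from the current directory. As a starting point, the sketch below writes a minimal version of that file with an OTLP HTTP receiver on port 4318 and a Prometheus exporter on port 8889, matching the published ports. The pipeline layout, the `batch` processor, the `debug` exporter for traces, and the helper name `write_collector_config` are assumptions for illustration, not ToolJet requirements.

```bash
# Sketch only: writes a minimal Collector config matching the ports published
# in the Compose excerpt. The pipeline layout is an assumption; swap the
# exporters for the ones your observability platform needs.
write_collector_config() {
  cat > "$1" <<'EOF'
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318   # OTLP HTTP receiver

processors:
  batch: {}

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # scraped by Prometheus
  debug: {}                      # prints received traces to the Collector log

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
EOF
}

# Usage: write_collector_config ./otel-collector-config.yaml
```

With a file like this in place, ToolJet's OTLP endpoints would point at the Collector's port 4318, and Prometheus would scrape port 8889.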
## Production Considerations
### High Cardinality Warning
The app-based metrics can optionally include a `query_text` label that contains the actual SQL or query content. **By default, this is disabled** to prevent high cardinality issues.
#### Enabling Query Text (For Debugging Only)
To enable query text in metrics for debugging purposes:
```bash
OTEL_INCLUDE_QUERY_TEXT=true
```
:::warning
Enabling `query_text` creates **high cardinality time series** that can significantly impact Prometheus storage and query performance. Only enable this temporarily for debugging specific query issues.
:::
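To get a feel for the scale, a rough upper bound on the number of time series for a single metric is the product of the number of distinct values of each label. The sketch below works through that arithmetic. Every figure in it is invented for illustration, and `estimate_series` is a hypothetical helper name.

```bash
# Rough upper bound on time series for one counter: the product of the number
# of distinct values of each label. All figures below are made-up examples.
estimate_series() {
  # $1 apps, $2 queries per app, $3 environments, $4 distinct query texts per query
  echo $(( $1 * $2 * $3 * $4 ))
}

estimate_series 20 15 3 1     # query_text disabled: prints 900
estimate_series 20 15 3 200   # query_text differs between executions: prints 180000
```

Histogram metrics multiply this further, because each bucket is stored as its own series.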
#### Production Best Practices
If you must enable `query_text` in production:
1. **Use an OTEL Collector** to drop the label before metrics reach Prometheus:
```yaml
# otel-collector-config.yaml
processors:
  attributes:
    actions:
      - key: query_text
        action: delete
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [attributes]
      exporters: [prometheus]
```
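2. **Alternative: Hash the query text** to keep raw query content out of label values and cap their length. Each distinct query text still maps to a distinct hash, so the number of time series is unchanged:
```yaml
processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["query_text"], SHA256(attributes["query_text"]))
```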
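The effect of hashing on a label value can be checked locally. The sketch below uses coreutils' `sha256sum` as a stand-in for the Collector's `SHA256()` converter. The helper name `hash_label` and the sample queries are hypothetical.

```bash
# Stand-in for what SHA256() does to a label value, using coreutils' sha256sum
# (on macOS, `shasum -a 256` is the usual equivalent).
hash_label() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

a=$(hash_label "SELECT * FROM orders WHERE id = 1")
b=$(hash_label "SELECT * FROM orders WHERE id = 2")

echo "${#a}"   # prints 64: every hash is 64 hex characters, whatever the input length
if [ "$a" != "$b" ]; then
  echo "distinct queries still produce distinct label values"
fi
```

Hashing therefore bounds label size and hides the query content, while deleting the label is the option that lowers the series count.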
### Performance Impact
OpenTelemetry metrics collection has minimal performance impact:
- Metric collection is asynchronous and non-blocking
- Histogram buckets are pre-configured for optimal performance
- Observable gauges (like success rates) are updated on a 15-minute interval
### Sampling and Filtering
For high-volume deployments, consider:
- **Filtering environments:** Only collect metrics from production environments
- **Sampling queries:** Use OTEL Collector sampling for high-frequency queries
- **Aggregation:** Pre-aggregate metrics at the collector level before storage
## Troubleshooting
### Metrics Not Appearing
1. Verify `ENABLE_OTEL=true` is set
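2. Check that the OTEL collector endpoint is reachable. OTLP/HTTP endpoints accept `POST`, so a plain `GET` that returns an HTTP error such as `405 Method Not Allowed` still confirms that the collector is listening, whereas a connection error means it is not reachable: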
```bash
curl http://localhost:4318/v1/metrics
curl http://localhost:4318/v1/traces
```
3. Review ToolJet server logs for OTEL connection errors
4. Verify OTEL collector configuration and Prometheus scrape targets
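For a check that is closer to what ToolJet sends, the endpoint can be probed with a `POST` and the returned status code interpreted. The following is a sketch under assumptions: the helper names are hypothetical, the empty JSON body is only a probe, and the status interpretations describe typical OpenTelemetry Collector behavior that other OTLP backends may not share.

```bash
# Hypothetical helper: interprets the HTTP status returned by an OTLP/HTTP endpoint.
describe_otlp_status() {
  case "$1" in
    000) echo "unreachable: no HTTP response (check host, port, and firewall)" ;;
    2??) echo "reachable: payload accepted" ;;
    404) echo "reachable, but path not found (is a pipeline configured for this signal?)" ;;
    405) echo "reachable: endpoint only accepts POST" ;;
    *)   echo "reachable, but returned HTTP $1" ;;
  esac
}

# Sends an empty OTLP/JSON body with POST and reports only the status code.
# curl prints 000 for %{http_code} when no response is received.
probe_otlp() {
  code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H 'Content-Type: application/json' -d '{}' "$1") || true
  describe_otlp_status "$code"
}

# Usage: probe_otlp http://localhost:4318/v1/metrics
```

Running the probe against both the metrics and traces paths shows whether each signal has a listening endpoint.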
### High Memory Usage
If you experience high memory usage:
1. Remove high-cardinality labels like `query_text` using OTEL Collector processors
2. Reduce histogram bucket counts if needed
3. Implement metric filtering at the collector level
4. Consider using Prometheus remote write to offload storage to an external backend
### Missing Labels or Metrics
Ensure you're using ToolJet version 3.16.0-LTS or higher, which includes the full OTEL implementation with both app-based and platform-based metrics.
## Additional Resources
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
- [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
- [Grafana OTLP Integration](https://grafana.com/docs/grafana-cloud/monitor-infrastructure/otlp/)
- [Prometheus OTLP Receiver](https://prometheus.io/docs/prometheus/latest/feature_flags/#otlp-receiver)