diff --git a/docs/en/macros/export-args.md b/docs/en/macros/export-args.md
index 73dcf6c984..6e6536f4f4 100644
--- a/docs/en/macros/export-args.md
+++ b/docs/en/macros/export-args.md
@@ -1,16 +1,16 @@
-| Argument    | Type              | Default         | Description                                                                                                                                                                                   |
-| ----------- | ----------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `format`    | `str`             | `'torchscript'` | Target format for the exported model, such as `'onnx'`, `'torchscript'`, `'tensorflow'`, or others, defining compatibility with various deployment environments.                              |
-| `imgsz`     | `int` or `tuple`  | `640`           | Desired image size for the model input. Can be an integer for square images or a tuple `(height, width)` for specific dimensions.                                                             |
-| `keras`     | `bool`            | `False`         | Enables export to Keras format for [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) SavedModel, providing compatibility with TensorFlow serving and APIs.                        |
-| `optimize`  | `bool`            | `False`         | Applies optimization for mobile devices when exporting to TorchScript, potentially reducing model size and improving performance.                                                             |
-| `half`      | `bool`            | `False`         | Enables FP16 (half-precision) quantization, reducing model size and potentially speeding up inference on supported hardware.                                                                  |
-| `int8`      | `bool`            | `False`         | Activates INT8 quantization, further compressing the model and speeding up inference with minimal [accuracy](https://www.ultralytics.com/glossary/accuracy) loss, primarily for edge devices. |
-| `dynamic`   | `bool`            | `False`         | Allows dynamic input sizes for ONNX, TensorRT and OpenVINO exports, enhancing flexibility in handling varying image dimensions.                                                               |
-| `simplify`  | `bool`            | `True`          | Simplifies the model graph for ONNX exports with `onnxslim`, potentially improving performance and compatibility.                                                                             |
-| `opset`     | `int`             | `None`          | Specifies the ONNX opset version for compatibility with different ONNX parsers and runtimes. If not set, uses the latest supported version.                                                   |
-| `workspace` | `float` or `None` | `None`          | Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance; use `None` for auto-allocation by TensorRT up to device maximum.                   |
-| `nms`       | `bool`            | `False`         | Adds Non-Maximum Suppression (NMS) to the exported model when supported (see [Export Formats](https://docs.ultralytics.com/modes/export/)), improving detection post-processing efficiency.   |
-| `batch`     | `int`             | `1`             | Specifies export model batch inference size or the max number of images the exported model will process concurrently in `predict` mode.                                                       |
-| `device`    | `str`             | `None`          | Specifies the device for exporting: GPU (`device=0`), CPU (`device=cpu`), MPS for Apple silicon (`device=mps`) or DLA for NVIDIA Jetson (`device=dla:0` or `device=dla:1`).                   |
-| `data`      | `str`             | `'coco8.yaml'`  | Path to the [dataset](https://docs.ultralytics.com/datasets/) configuration file (default: `coco8.yaml`), essential for quantization.                                                         |
+| Argument    | Type              | Default         | Description                                                                                                                                                                                                                                                                                                                                                |
+| ----------- | ----------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `format`    | `str`             | `'torchscript'` | Target format for the exported model, such as `'onnx'`, `'torchscript'`, `'engine'` (TensorRT), or others. Each format enables compatibility with different [deployment environments](https://docs.ultralytics.com/modes/export/).                                                                                                                         |
+| `imgsz`     | `int` or `tuple`  | `640`           | Desired image size for the model input. Can be an integer for square images (e.g., `640` for 640×640) or a tuple `(height, width)` for specific dimensions.                                                                                                                                                                                                |
+| `keras`     | `bool`            | `False`         | Enables export to Keras format for [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) SavedModel, providing compatibility with TensorFlow serving and APIs.                                                                                                                                                                                     |
+| `optimize`  | `bool`            | `False`         | Applies optimization for mobile devices when exporting to TorchScript, potentially reducing model size and improving [inference](https://docs.ultralytics.com/modes/predict/) performance. Not compatible with NCNN format or CUDA devices.                                                                                                                |
+| `half`      | `bool`            | `False`         | Enables FP16 (half-precision) quantization, reducing model size and potentially speeding up inference on supported hardware. Not compatible with INT8 quantization or CPU-only exports for ONNX.                                                                                                                                                           |
+| `int8`      | `bool`            | `False`         | Activates INT8 quantization, further compressing the model and speeding up inference with minimal [accuracy](https://www.ultralytics.com/glossary/accuracy) loss, primarily for [edge devices](https://www.ultralytics.com/blog/understanding-the-real-world-applications-of-edge-ai). When used with TensorRT, performs post-training quantization (PTQ). |
+| `dynamic`   | `bool`            | `False`         | Allows dynamic input sizes for ONNX, TensorRT and OpenVINO exports, enhancing flexibility in handling varying image dimensions. Automatically set to `True` when using TensorRT with INT8.                                                                                                                                                                 |
+| `simplify`  | `bool`            | `True`          | Simplifies the model graph for ONNX exports with `onnxslim`, potentially improving performance and compatibility with inference engines.                                                                                                                                                                                                                   |
+| `opset`     | `int`             | `None`          | Specifies the ONNX opset version for compatibility with different [ONNX](https://docs.ultralytics.com/integrations/onnx/) parsers and runtimes. If not set, uses the latest supported version.                                                                                                                                                             |
+| `workspace` | `float` or `None` | `None`          | Sets the maximum workspace size in GiB for [TensorRT](https://docs.ultralytics.com/integrations/tensorrt/) optimizations, balancing memory usage and performance. Use `None` for auto-allocation by TensorRT up to device maximum.                                                                                                                         |
+| `nms`       | `bool`            | `False`         | Adds Non-Maximum Suppression (NMS) to the exported model when supported (see [Export Formats](https://docs.ultralytics.com/modes/export/)), improving detection post-processing efficiency. Not available for end2end models.                                                                                                                              |
+| `batch`     | `int`             | `1`             | Specifies export model batch inference size or the maximum number of images the exported model will process concurrently in `predict` mode. For Edge TPU exports, this is automatically set to 1.                                                                                                                                                          |
+| `device`    | `str`             | `None`          | Specifies the device for exporting: GPU (`device=0`), CPU (`device=cpu`), MPS for Apple silicon (`device=mps`) or DLA for NVIDIA Jetson (`device=dla:0` or `device=dla:1`). TensorRT exports automatically use GPU.                                                                                                                                        |
+| `data`      | `str`             | `'coco8.yaml'`  | Path to the [dataset](https://docs.ultralytics.com/datasets/) configuration file (default: `coco8.yaml`), essential for INT8 quantization calibration. If not specified with INT8 enabled, a default dataset will be assigned.                                                                                                                             |
diff --git a/docs/en/macros/predict-args.md b/docs/en/macros/predict-args.md
index 81d1705930..45c45d6b15 100644
--- a/docs/en/macros/predict-args.md
+++ b/docs/en/macros/predict-args.md
@@ -18,3 +18,5 @@
 | `embed`         | `list[int]`      | `None`                 | Specifies the layers from which to extract feature vectors or [embeddings](https://www.ultralytics.com/glossary/embeddings). Useful for downstream tasks like clustering or similarity search.                                                                                                                  |
 | `project`       | `str`            | `None`                 | Name of the project directory where prediction outputs are saved if `save` is enabled.                                                                                                                                                                                                                          |
 | `name`          | `str`            | `None`                 | Name of the prediction run. Used for creating a subdirectory within the project folder, where prediction outputs are stored if `save` is enabled.                                                                                                                                                               |
+| `stream`        | `bool`           | `False`                | Enables memory-efficient processing for long videos or numerous images by returning a generator of Results objects instead of loading all frames into memory at once.                                                                                                                                           |
+| `verbose`       | `bool`           | `True`                 | Controls whether to display detailed inference logs in the terminal, providing real-time feedback on the prediction process.                                                                                                                                                                                    |
diff --git a/docs/en/macros/sam-auto-annotate.md b/docs/en/macros/sam-auto-annotate.md
index ce9e1956c8..1f44f3b175 100644
--- a/docs/en/macros/sam-auto-annotate.md
+++ b/docs/en/macros/sam-auto-annotate.md
@@ -2,7 +2,7 @@
 | ------------ | ----------- | -------------- | ------------------------------------------------------------------------------------ |
 | `data`       | `str`       | required       | Path to directory containing target images for annotation or segmentation.           |
 | `det_model`  | `str`       | `'yolo11x.pt'` | YOLO detection model path for initial object detection.                              |
-| `sam_model`  | `str`       | `'sam2_b.pt'`  | SAM model path for segmentation (supports SAM, SAM2 variants and mobile_sam models). |
+| `sam_model`  | `str`       | `'sam_b.pt'`   | SAM model path for segmentation (supports SAM, SAM2 variants and mobile_sam models). |
 | `device`     | `str`       | `''`           | Computation device (e.g., 'cuda:0', 'cpu', or '' for automatic device detection).    |
 | `conf`       | `float`     | `0.25`         | YOLO detection confidence threshold for filtering weak detections.                   |
 | `iou`        | `float`     | `0.45`         | IoU threshold for Non-Maximum Suppression to filter overlapping boxes.               |
diff --git a/docs/en/macros/validation-args.md b/docs/en/macros/validation-args.md
index 4ae4f305cb..16b7e3aaac 100644
--- a/docs/en/macros/validation-args.md
+++ b/docs/en/macros/validation-args.md
@@ -1,23 +1,26 @@
-| Argument      | Type    | Default | Description                                                                                                                                                                                                                           |
-| ------------- | ------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `data`        | `str`   | `None`  | Specifies the path to the dataset configuration file (e.g., `coco8.yaml`). This file includes paths to [validation data](https://www.ultralytics.com/glossary/validation-data), class names, and number of classes.                   |
-| `imgsz`       | `int`   | `640`   | Defines the size of input images. All images are resized to this dimension before processing.                                                                                                                                         |
-| `batch`       | `int`   | `16`    | Sets the number of images per batch. The value must be a positive integer.                                                                                                                                                            |
-| `save_json`   | `bool`  | `False` | If `True`, saves the results to a JSON file for further analysis or integration with other tools.                                                                                                                                     |
-| `save_hybrid` | `bool`  | `False` | If `True`, saves a hybrid version of labels that combines original annotations with additional model predictions. Only works with detection models.                                                                                   |
-| `conf`        | `float` | `0.001` | Sets the minimum confidence threshold for detections. Detections with confidence below this threshold are discarded.                                                                                                                  |
-| `iou`         | `float` | `0.6`   | Sets the [Intersection Over Union](https://www.ultralytics.com/glossary/intersection-over-union-iou) (IoU) threshold for Non-Maximum Suppression (NMS). Helps in reducing duplicate detections.                                       |
-| `max_det`     | `int`   | `300`   | Limits the maximum number of detections per image. Useful in dense scenes to prevent excessive detections.                                                                                                                            |
-| `half`        | `bool`  | `True`  | Enables half-[precision](https://www.ultralytics.com/glossary/precision) (FP16) computation, reducing memory usage and potentially increasing speed with minimal impact on [accuracy](https://www.ultralytics.com/glossary/accuracy). |
-| `device`      | `str`   | `None`  | Specifies the device for validation (`cpu`, `cuda:0`, etc.). Allows flexibility in utilizing CPU or GPU resources.                                                                                                                    |
-| `dnn`         | `bool`  | `False` | If `True`, uses the [OpenCV](https://www.ultralytics.com/glossary/opencv) DNN module for ONNX model inference, offering an alternative to [PyTorch](https://www.ultralytics.com/glossary/pytorch) inference methods.                  |
-| `plots`       | `bool`  | `False` | When set to `True`, generates and saves plots of predictions versus ground truth for visual evaluation of the model's performance.                                                                                                    |
-| `rect`        | `bool`  | `True`  | If `True`, uses rectangular inference for batching, reducing padding and potentially increasing speed and efficiency.                                                                                                                 |
-| `split`       | `str`   | `'val'` | Determines the dataset split to use for validation (`val`, `test`, or `train`). Allows flexibility in choosing the data segment for performance evaluation.                                                                           |
-| `project`     | `str`   | `None`  | Name of the project directory where validation outputs are saved.                                                                                                                                                                     |
-| `name`        | `str`   | `None`  | Name of the validation run. Used for creating a subdirectory within the project folder, where validation logs and outputs are stored.                                                                                                 |
-| `verbose`     | `bool`  | `False` | If `True`, displays detailed information during the validation process, including per-class metrics and additional debugging information.                                                                                             |
-| `save_txt`    | `bool`  | `False` | If `True`, saves detection results in text files, with one file per image, useful for further analysis or custom post-processing.                                                                                                     |
-| `save_conf`   | `bool`  | `False` | If `True`, includes confidence values in the saved text files when `save_txt` is enabled, providing more detailed output for analysis.                                                                                                |
-| `save_crop`   | `bool`  | `False` | If `True`, saves cropped images of detected objects, which can be useful for creating focused datasets or visual verification.                                                                                                        |
-| `workers`     | `int`   | `8`     | Number of worker threads for data loading. Setting to 0 uses main thread, which can be more stable in some environments but slower.                                                                                                   |
+| Argument       | Type    | Default | Description                                                                                                                                                                                                                                               |
+| -------------- | ------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `data`         | `str`   | `None`  | Specifies the path to the dataset configuration file (e.g., `coco8.yaml`). This file includes paths to [validation data](https://www.ultralytics.com/glossary/validation-data), class names, and number of classes.                                       |
+| `imgsz`        | `int`   | `640`   | Defines the size of input images. All images are resized to this dimension before processing. Larger sizes may improve accuracy for small objects but increase computation time.                                                                          |
+| `batch`        | `int`   | `16`    | Sets the number of images per batch. Higher values utilize GPU memory more efficiently but require more VRAM. Adjust based on available hardware resources.                                                                                               |
+| `save_json`    | `bool`  | `False` | If `True`, saves the results to a JSON file for further analysis, integration with other tools, or submission to evaluation servers like COCO.                                                                                                            |
+| `save_hybrid`  | `bool`  | `False` | If `True`, saves a hybrid version of labels that combines original annotations with additional model predictions. Useful for semi-supervised learning and dataset enhancement.                                                                            |
+| `conf`         | `float` | `0.001` | Sets the minimum confidence threshold for detections. Lower values increase recall but may introduce more false positives. Used during [validation](https://docs.ultralytics.com/modes/val/) to compute precision-recall curves.                          |
+| `iou`          | `float` | `0.6`   | Sets the [Intersection Over Union](https://www.ultralytics.com/glossary/intersection-over-union-iou) threshold for [Non-Maximum Suppression](https://www.ultralytics.com/glossary/non-maximum-suppression-nms). Controls duplicate detection elimination. |
+| `max_det`      | `int`   | `300`   | Limits the maximum number of detections per image. Useful in dense scenes to prevent excessive detections and manage computational resources.                                                                                                             |
+| `half`         | `bool`  | `True`  | Enables half-[precision](https://www.ultralytics.com/glossary/precision) (FP16) computation, reducing memory usage and potentially increasing speed with minimal impact on [accuracy](https://www.ultralytics.com/glossary/accuracy).                     |
+| `device`       | `str`   | `None`  | Specifies the device for validation (`cpu`, `cuda:0`, etc.). When `None`, automatically selects the best available device. Multiple CUDA devices can be specified with comma separation.                                                                  |
+| `dnn`          | `bool`  | `False` | If `True`, uses the [OpenCV](https://www.ultralytics.com/glossary/opencv) DNN module for ONNX model inference, offering an alternative to [PyTorch](https://www.ultralytics.com/glossary/pytorch) inference methods.                                      |
+| `plots`        | `bool`  | `False` | When set to `True`, generates and saves plots of predictions versus ground truth, confusion matrices, and PR curves for visual evaluation of model performance.                                                                                           |
+| `rect`         | `bool`  | `True`  | If `True`, uses rectangular inference for batching, reducing padding and potentially increasing speed and efficiency by processing images in their original aspect ratio.                                                                                 |
+| `split`        | `str`   | `'val'` | Determines the dataset split to use for validation (`val`, `test`, or `train`). Allows flexibility in choosing the data segment for performance evaluation.                                                                                               |
+| `project`      | `str`   | `None`  | Name of the project directory where validation outputs are saved. Helps organize results from different experiments or models.                                                                                                                            |
+| `name`         | `str`   | `None`  | Name of the validation run. Used for creating a subdirectory within the project folder, where validation logs and outputs are stored.                                                                                                                     |
+| `verbose`      | `bool`  | `False` | If `True`, displays detailed information during the validation process, including per-class metrics, batch progress, and additional debugging information.                                                                                                |
+| `save_txt`     | `bool`  | `False` | If `True`, saves detection results in text files, with one file per image, useful for further analysis, custom post-processing, or integration with other systems.                                                                                        |
+| `save_conf`    | `bool`  | `False` | If `True`, includes confidence values in the saved text files when `save_txt` is enabled, providing more detailed output for analysis and filtering.                                                                                                      |
+| `save_crop`    | `bool`  | `False` | If `True`, saves cropped images of detected objects, which can be useful for creating focused datasets, visual verification, or further analysis of individual detections.                                                                                |
+| `workers`      | `int`   | `8`     | Number of worker threads for data loading. Higher values can speed up data preprocessing but may increase CPU usage. Setting to 0 uses the main thread, which can be more stable in some environments.                                                    |
+| `augment`      | `bool`  | `False` | Enables test-time augmentation (TTA) during validation, potentially improving detection accuracy at the cost of inference speed by running inference on transformed versions of the input.                                                                |
+| `agnostic_nms` | `bool`  | `False` | Enables class-agnostic [Non-Maximum Suppression](https://www.ultralytics.com/glossary/non-maximum-suppression-nms), which suppresses overlapping boxes regardless of their predicted class. Useful for instance-focused applications.                     |
+| `single_cls`   | `bool`  | `False` | Treats all classes as a single class during validation. Useful for evaluating model performance on binary detection tasks or when class distinctions aren't important.                                                                                    |
diff --git a/docs/en/macros/visualization-args.md b/docs/en/macros/visualization-args.md
index b5b6f09b2b..50130dd3b2 100644
--- a/docs/en/macros/visualization-args.md
+++ b/docs/en/macros/visualization-args.md
@@ -22,7 +22,8 @@
     "masks": ["bool", "True", "Display segmentation masks in the visualization output."],
     "probs": ["bool", "True", "Include classification probabilities in the visualization."],
     "filename": ["str", "None", "Path and filename to save the annotated image when `save=True`."],
-    "color_mode": ["str", "'class'", "Specify the coloring mode for visualizations, e.g., 'instance' or 'class'."]
+    "color_mode": ["str", "'class'", "Specify the coloring mode for visualizations, e.g., 'instance' or 'class'."],
+    "txt_color": ["tuple[int, int, int]", "(255, 255, 255)", "RGB text color for classification task annotations."]
 } %}
 
 {%- if not params %}