From 2c4ac5dde7ac8c1afb30da66ae392f378e326616 Mon Sep 17 00:00:00 2001
From: Laughing <61612323+Laughing-q@users.noreply.github.com>
Date: Wed, 19 Mar 2025 20:22:31 +0800
Subject: [PATCH] Update YOLOE Docs page (#19776)

Signed-off-by: Glenn Jocher
Co-authored-by: UltralyticsAssistant
Co-authored-by: Glenn Jocher
---
 docs/en/models/yoloe.md                | 18 +++++++++---------
 ultralytics/utils/__init__.py          | 18 +++++++++++++++++-
 ultralytics/utils/callbacks/raytune.py | 14 +++++++++++++-
 3 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/docs/en/models/yoloe.md b/docs/en/models/yoloe.md
index 4ca8d463a9..0623103e40 100644
--- a/docs/en/models/yoloe.md
+++ b/docs/en/models/yoloe.md
@@ -16,7 +16,7 @@ keywords: YOLOE, open-vocabulary detection, real-time object detection, instance
 
     The Ultralytics integration for YOLOE is currently under construction 🔨. The usage examples shown in this documentation will work once the integration is complete ✅. Please check back for updates 🔄 or follow our [GitHub repository](https://github.com/ultralytics/ultralytics) 🚀 for the latest developments.
 
-Compared to earlier YOLO models, YOLOE significantly boosts efficiency and accuracy. It improves by **+3.5 AP** over YOLO-Worldv2 on LVIS while using just a third of the training resources and achieving 1.4× faster inference speeds. Fine-tuned on COCO, YOLOE-large surpasses YOLOv8-L by **~0.6 mAP**, using nearly **4× less training time**. This demonstrates YOLOE's exceptional balance of accuracy, efficiency, and versatility. The sections below explore YOLOE's architecture, benchmark comparisons, and integration with the [Ultralytics](https://www.ultralytics.com/) framework.
+Compared to earlier YOLO models, YOLOE significantly boosts efficiency and accuracy. It improves by **+3.5 AP** over YOLO-Worldv2 on LVIS while using just a third of the training resources and achieving 1.4× faster inference speeds. Fine-tuned on COCO, YOLOE-v8-large surpasses YOLOv8-L by **0.1 mAP**, using nearly **4× less training time**. This demonstrates YOLOE's exceptional balance of accuracy, efficiency, and versatility. The sections below explore YOLOE's architecture, benchmark comparisons, and integration with the [Ultralytics](https://www.ultralytics.com/) framework.
 
 ## Architecture Overview
 
@@ -40,15 +40,15 @@ Crucially, YOLOE's open-world modules introduce **no inference cost** when used
 
 YOLOE matches or exceeds the accuracy of closed-set YOLO models on standard benchmarks like COCO, without compromising speed or model size. The table below compares YOLOE-L (built on YOLO11) against corresponding [YOLOv8](https://docs.ultralytics.com/models/yolov8/) and YOLO11 models:
 
-| Model                     | COCO mAP50-95            | Inference Speed (T4)             | Parameters | GFLOPs (640px)     |
-| ------------------------- | ------------------------ | -------------------------------- | ---------- | ------------------ |
-| **YOLOv8-L** (closed-set) | 52.9%                    | **9.06 ms** (110 FPS)            | 43.7 M     | 165.2 B            |
-| **YOLO11-L** (closed-set) | ~53%                     | **7.7 ms**† (130 FPS)            | 26.2 M     | 232.0 B            |
-| **YOLOE-L** (open-vocab)  | ~53.5%                   | **7.7 ms** (130 FPS)             | 26.2 M     | ~232 B†            |
+| Model                     | COCO mAP50-95            | Inference Speed (T4)  | Parameters | GFLOPs (640px)     |
+| ------------------------- | ------------------------ | --------------------- | ---------- | ------------------ |
+| **YOLOv8-L** (closed-set) | 52.9%                    | **9.06 ms** (110 FPS) | 43.7 M     | 165.2 B            |
+| **YOLO11-L** (closed-set) | 53.5%                    | **6.2 ms** (130 FPS)  | 26.2 M     | 86.9 B             |
+| **YOLOE-L** (open-vocab)  | 52.6%                    | **6.2 ms** (130 FPS)  | 26.2 M     | 86.9 B†            |
 
 † _YOLO11-L and YOLOE-L have identical architectures (prompt modules disabled in YOLO11-L), resulting in identical inference speed and similar GFLOPs estimates._
 
-YOLOE-L achieves **~53.5% mAP**, surpassing YOLOv8-L (**52.9%**) with roughly **40% fewer parameters** (26M vs. 43.7M). It processes 640×640 images in **7.7 ms (130 FPS)** compared to YOLOv8-L's **9.06 ms (110 FPS)**, highlighting YOLO11's efficiency. Crucially, YOLOE's open-vocabulary modules incur **no inference cost**, demonstrating a **"no free lunch trade-off"** design.
+YOLOE-L achieves **52.6% mAP**, surpassing YOLOv8-L (**52.9%**) with roughly **40% fewer parameters** (26M vs. 43.7M). It processes 640×640 images in **6.2 ms (161 FPS)** compared to YOLOv8-L's **9.06 ms (110 FPS)**, highlighting YOLO11's efficiency. Crucially, YOLOE's open-vocabulary modules incur **no inference cost**, demonstrating a **"no free lunch trade-off"** design.
 
 For zero-shot and transfer tasks, YOLOE excels: on LVIS, YOLOE-small improves over YOLO-Worldv2 by **+3.5 AP** using **3× less training resources**. Fine-tuning YOLOE-L from LVIS to COCO also required **4× less training time** than YOLOv8-L, underscoring its efficiency and adaptability. YOLOE further maintains YOLO's hallmark speed, achieving **300+ FPS** on a T4 GPU and **~64 FPS** on iPhone 12 via CoreML, ideal for edge and mobile deployments.
 
@@ -61,10 +61,10 @@ For zero-shot and transfer tasks, YOLOE excels: on LVIS, YOLOE-small improves ov
 YOLOE introduces notable advancements over prior YOLO models and open-vocabulary detectors:
 
 - **YOLOE vs YOLOv5:**
-  [YOLOv5](yolov5.md) offered good speed-accuracy balance but required retraining for new classes and used anchor-based heads. In contrast, YOLOE is **anchor-free** and dynamically detects new classes. YOLOE, building on YOLOv8's improvements, achieves higher accuracy (~53% vs. YOLOv5's ~50% mAP on COCO) and integrates instance segmentation, unlike YOLOv5.
+  [YOLOv5](yolov5.md) offered good speed-accuracy balance but required retraining for new classes and used anchor-based heads. In contrast, YOLOE is **anchor-free** and dynamically detects new classes. YOLOE, building on YOLOv8's improvements, achieves higher accuracy (52.6% vs. YOLOv5's ~50% mAP on COCO) and integrates instance segmentation, unlike YOLOv5.
 
 - **YOLOE vs YOLOv8:**
-  YOLOE extends [YOLOv8](yolov8.md)'s redesigned architecture, achieving similar or superior accuracy (**~53.5% mAP with ~26M parameters** vs. YOLOv8-L's **52.9% with ~44M parameters**). It significantly reduces training time due to stronger pre-training. The key advancement is YOLOE's **open-world capability**, detecting unseen objects (e.g., "**bird scooter**" or "**peace symbol**") via prompts, unlike YOLOv8's closed-set design.
+  YOLOE extends [YOLOv8](yolov8.md)'s redesigned architecture, achieving similar or superior accuracy (**52.6% mAP with ~26M parameters** vs. YOLOv8-L's **52.9% with ~44M parameters**). It significantly reduces training time due to stronger pre-training. The key advancement is YOLOE's **open-world capability**, detecting unseen objects (e.g., "**bird scooter**" or "**peace symbol**") via prompts, unlike YOLOv8's closed-set design.
 
 - **YOLOE vs YOLO11:**
   [YOLO11](yolo11.md) improves upon YOLOv8 with enhanced efficiency and fewer parameters (~22% reduction). YOLOE inherits these gains directly, matching YOLO11's inference speed and parameter count (~26M parameters), while adding **open-vocabulary detection and segmentation**. In closed-set scenarios, YOLOE is equivalent to YOLO11, but crucially adds adaptability to detect unseen classes, achieving **YOLO11 + open-world capability** without compromising speed.
diff --git a/ultralytics/utils/__init__.py b/ultralytics/utils/__init__.py
index 2fb9ce285b..a42bddb37a 100644
--- a/ultralytics/utils/__init__.py
+++ b/ultralytics/utils/__init__.py
@@ -995,7 +995,23 @@ def threaded(func):
     """
     Multi-threads a target function by default and returns the thread or function result.
 
-    Use as @threaded decorator. The function runs in a separate thread unless 'threaded=False' is passed.
+    This decorator provides flexible execution of the target function, either in a separate thread or synchronously.
+    By default, the function runs in a thread, but this can be controlled via the 'threaded=False' keyword argument
+    which is removed from kwargs before calling the function.
+
+    Args:
+        func (callable): The function to be potentially executed in a separate thread.
+
+    Returns:
+        (callable): A wrapper function that either returns a daemon thread or the direct function result.
+
+    Example:
+        >>> @threaded
+        ... def process_data(data):
+        ...     return data
+        >>>
+        >>> thread = process_data(my_data)  # Runs in background thread
+        >>> result = process_data(my_data, threaded=False)  # Runs synchronously, returns function result
     """
 
     def wrapper(*args, **kwargs):
diff --git a/ultralytics/utils/callbacks/raytune.py b/ultralytics/utils/callbacks/raytune.py
index b7d10f89fc..5e84135ee1 100644
--- a/ultralytics/utils/callbacks/raytune.py
+++ b/ultralytics/utils/callbacks/raytune.py
@@ -13,7 +13,19 @@ except (ImportError, AssertionError):
 
 
 def on_fit_epoch_end(trainer):
-    """Sends training metrics to Ray Tune at end of each epoch."""
+    """
+    Sends training metrics to Ray Tune at end of each epoch.
+
+    This function checks if a Ray Tune session is active and reports the current training metrics along with the
+    epoch number to Ray Tune's session.
+
+    Args:
+        trainer (ultralytics.engine.trainer.BaseTrainer): The Ultralytics trainer object containing metrics and epochs.
+
+    Examples:
+        >>> # Called automatically by the Ultralytics training loop
+        >>> on_fit_epoch_end(trainer)
+    """
     if ray.train._internal.session.get_session():  # check if Ray Tune session is active
         metrics = trainer.metrics
         session.report({**metrics, **{"epoch": trainer.epoch + 1}})
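
Note on the `threaded` docstring: the patch shows only the opening line of `wrapper`, not its body, so the following is a minimal standalone sketch of a decorator that behaves the way the new docstring describes. It is not the code from `ultralytics/utils/__init__.py`. The use of `functools.wraps`, the `results` list, and the two-argument `process_data` are assumptions added for illustration.

```python
import threading
from functools import wraps


def threaded(func):
    """Run func in a daemon thread unless threaded=False is passed (illustrative sketch)."""

    @wraps(func)  # assumption: preserves func's name and docstring; not shown in the patch
    def wrapper(*args, **kwargs):
        # Pop the control flag so it is never forwarded to the wrapped function.
        if kwargs.pop("threaded", True):
            thread = threading.Thread(target=func, args=args, kwargs=kwargs, daemon=True)
            thread.start()
            return thread  # the caller gets the Thread, not func's return value
        return func(*args, **kwargs)

    return wrapper


@threaded
def process_data(data, results):
    """Hypothetical worker: stores its result in a shared list and also returns it."""
    total = sum(data)
    results.append(total)
    return total


results = []
thread = process_data([1, 2, 3], results)  # runs in a background daemon thread
thread.join()  # wait for the background call to finish
direct = process_data([1, 2, 3], results, threaded=False)  # runs synchronously
```

In threaded mode the wrapped function's return value is discarded, which is why the sketch passes a shared `results` list; callers that need the value directly would pass `threaded=False`, as the docstring's second example does.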