guides/yolo-performance-metrics/ #8790
Replies: 24 comments 52 replies
-
We wish there were illustrative charts for each scale.
-
Hello, I've noticed that in the precision curve, beyond the maximum confidence predicted by the model, precision is set to 1. I'm curious why it is not set to 0 instead: when the threshold is higher than any prediction's confidence, the model makes no predictions at all, so setting precision to 0 seems more logical to me. Is there any literature discussing how the precision curve is defined? Thank you.
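A minimal sketch (not Ultralytics' actual implementation) of the convention being asked about: once the threshold exceeds every prediction's confidence, precision becomes 0/0, and many plotting implementations fall back to 1 rather than 0, on the reasoning that zero predictions means zero false positives.

```python
import numpy as np

def precision_confidence_curve(confidences, is_tp, thresholds=np.linspace(0, 1, 101)):
    """Compute precision at each confidence threshold.

    confidences: per-prediction confidence scores
    is_tp: per-prediction booleans, True if the prediction matched a ground-truth box
    """
    precisions = []
    for t in thresholds:
        keep = confidences >= t
        n_kept = keep.sum()
        if n_kept == 0:
            # No predictions above this threshold: precision is 0/0 (undefined).
            # Convention shown here: no predictions -> no false positives -> 1.0
            precisions.append(1.0)
        else:
            precisions.append(is_tp[keep].sum() / n_kept)
    return thresholds, np.array(precisions)

# Toy example: three detections, the low-confidence one is a false positive
conf = np.array([0.9, 0.7, 0.3])
tp = np.array([True, True, False])
t, p = precision_confidence_curve(conf, tp)
print(p[0], p[95])  # precision at threshold 0.0 vs 0.95 (beyond max confidence -> 1.0)
```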
-
The train/dfl_loss chart is also generated among the training results. What is the meaning of this chart?
-
Does it provide plotting for accuracy, sensitivity, and specificity?
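These are not plotted out of the box for detection, but for a classification or per-class one-vs-rest setup they can be derived from confusion-matrix counts. A minimal sketch, assuming you have already extracted the TP, FP, FN, TN counts for the class of interest:

```python
def basic_classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, sensitivity (recall of the positive class) and specificity
    (recall of the negative class) from raw confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # a.k.a. recall / TPR
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,  # a.k.a. TNR
    }

print(basic_classification_metrics(tp=80, fp=10, fn=20, tn=90))
```

Note that specificity needs a true-negative count, which is not well defined for object detection (there is no fixed set of "negative boxes"), so it mainly applies to classification tasks.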
-
Firstly, thank you so much for providing such comprehensive documentation; it is honestly so helpful.
-
I want to know: when I use the val command, does the mAP50 output for a single class in fact refer to AP50 for that class? Because mAP50 theoretically averages over all classes.
-
Hello! Why can mAP50 be used to evaluate a YOLOv8 model, and what does it represent in reality? Thank you.
-
Request for Additional Classification Metrics
Dear YOLOv8 Team, thank you for your excellent work on YOLOv8! Currently, the classification module provides top-1 and top-5 accuracy metrics, which are very useful. However, I believe the inclusion of additional standard classification metrics would further enhance the framework's utility. Including these metrics would provide a more comprehensive understanding of model performance, aiding in better model evaluation and tuning.
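A minimal sketch of computing such metrics outside the framework, assuming you have already run a YOLO classification model over labelled validation images and collected the true and predicted class indices (`y_true` and `y_pred` below are placeholders):

```python
from sklearn.metrics import (classification_report, confusion_matrix,
                             precision_recall_fscore_support)

# Placeholder arrays: collect these yourself from model.predict() on labelled images
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, average=None, zero_division=0
)
print("per-class precision:", precision)
print("per-class recall:   ", recall)
print("per-class F1:       ", f1)
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, zero_division=0))
```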
-
I am currently working on my dissertation, for which I have trained ten YOLOv8x models using camera trap data. I have a few questions regarding the results.csv file that is automatically generated after model training.

Could you please clarify whether the precision, recall, mAP50, and mAP50-95 values recorded in the results.csv file pertain to the training set or the validation set? As I am preparing to present the results of the models I have trained, I would appreciate your guidance on which values to report as the final results. Given that I trained the models for 100 epochs, there are multiple values for these parameters across the epochs. Could you advise on how best to interpret and present these results?
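For reference, a minimal sketch of inspecting results.csv with pandas and picking the epoch with the best mAP50-95. The metric columns are computed on the validation split after each epoch; the exact column names used below (e.g. `metrics/mAP50-95(B)`) are what recent Ultralytics versions write, so adjust them if your file differs:

```python
import pandas as pd

df = pd.read_csv("runs/detect/train/results.csv")
df.columns = df.columns.str.strip()  # older versions pad column names with spaces

metric_cols = ["metrics/precision(B)", "metrics/recall(B)",
               "metrics/mAP50(B)", "metrics/mAP50-95(B)"]

best = df.loc[df["metrics/mAP50-95(B)"].idxmax()]
print("Best epoch:", int(best["epoch"]))
print(best[metric_cols])

# These per-epoch metrics come from validation, which is why best.pt
# (the checkpoint with the highest fitness) is usually what gets reported.
```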
-
Hello Ultralytics! I was wondering if you have the training curves for the pretrained YOLOv8 and YOLOv10 weights (which, if I understand correctly, were trained on COCO)? I am particularly interested in comparing the o2m components with the o2o components during v10 training, as well as with their equivalents in v8. I can try to reproduce them myself, but thought they might be readily available? Cheers!
-
How are you calculating class-wise precision? In my case, I have 1 true positive (TP) and 2 false positives (FP). Using the formula for precision (TP / (TP + FP)), I expect the precision to be 0.33. However, I am getting a precision value of 0.734 instead. Could you explain why?
-
I am sharing my confusion matrix with you. My IoU confidence threshold is set to 0.5. Based on the confusion matrix for the class 'Car', there is 1 true positive and 2 false positives, so the precision should be 0.33. However, when I evaluate it using Ultralytics, the precision is reported as 0.734. I am passing only 1 image in val. Could you explain the discrepancy?
On Mon, Sep 9, 2024 at 2:14 AM Glenn Jocher wrote: @Shubham77saini the discrepancy in your precision calculation might be due to how the model handles confidence thresholds or other internal settings. Precision is calculated as TP / (TP + FP), but factors like confidence thresholds can affect which detections are considered true positives or false positives. You might want to check the confidence threshold settings or review the detailed metrics output for more insights. For further details, you can refer to the YOLOv8 documentation on performance metrics.
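To make the confidence-threshold point concrete, here is a small illustrative sketch (not Ultralytics' code). The confusion matrix is built at one fixed confidence threshold, while the reported Box P comes from the precision-recall analysis at a different operating point, so the two numbers can legitimately disagree:

```python
import numpy as np

# Hypothetical detections for one class, with a flag saying whether each one
# matched a ground-truth box at IoU >= 0.5
conf  = np.array([0.92, 0.40, 0.31])
is_tp = np.array([True, False, False])

def precision_at(threshold):
    keep = conf >= threshold
    return is_tp[keep].sum() / max(keep.sum(), 1)

print(precision_at(0.25))  # low threshold keeps all 3 detections -> 1/3 = 0.33
print(precision_at(0.50))  # higher threshold keeps only the true positive -> 1.0
```

As far as I understand, the confusion matrix and the reported P/R values are computed at different confidence operating points, which is a common source of exactly this kind of mismatch.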
-
Hi everyone! I'd like to get your thoughts on the approach we use for evaluating machine learning models. We start by calculating the F1 score for each class, and then we take the harmonic mean of these scores. This method aims to provide a more accurate measure for model evaluation; interestingly, the harmonic mean is more sensitive to lower scores, meaning that if any one score is low, it significantly affects the final result. Next, we again use the harmonic mean to calculate the final score, this time combining the F1 mean and the mAP. However, it's important to note that mAP50-95 doesn't provide very high accuracy for evaluation. In my dataset, I also need to ensure that recall and precision are thoroughly assessed. What do you think about this method? Do you find it suitable?
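A minimal sketch of the scoring scheme described above, assuming the per-class F1 scores and the mAP50-95 value have already been obtained from your own validation run (values below are placeholders):

```python
from statistics import harmonic_mean

# Placeholder values: per-class F1 scores and mAP50-95 from validation
per_class_f1 = [0.91, 0.84, 0.62]
map50_95 = 0.71

f1_hmean = harmonic_mean(per_class_f1)             # harmonic mean punishes the weakest class
final_score = harmonic_mean([f1_hmean, map50_95])  # combine F1 mean with mAP the same way

print(f"F1 harmonic mean: {f1_hmean:.3f}")
print(f"Final score:      {final_score:.3f}")
```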
-
Hello, my model's mAP0.5 is 95.5%, with precision and recall of 89.7% and 89.6% respectively (iou=0.7), which is a significant difference. I adjusted the IoU down to 0.6, but the gap remains the same. How should I adjust? Why is mAP0.5 so much larger than precision and recall?
-
Question: Clarification on AP vs mAP in YOLOv8 metrics output

Hello, I'm using YOLOv8 for object detection, and I have some questions about the metrics output, particularly regarding the use of mAP50 and mAP50-95 in the results. In the output, these metrics are reported for individual classes as well as for an "all" category that aggregates the performance. My understanding is that AP (Average Precision) is calculated for each individual class, while mAP (mean Average Precision) is the average of the AP values across all classes. However, in the output, mAP50 and mAP50-95 are listed for each individual class. Should these values be interpreted as AP for each class rather than mAP, which I would expect to be the mean across all classes? Additionally, the "all" row appears to give the mAP value. Could you confirm this interpretation? Thanks in advance for your clarification!
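For what it's worth, the per-class values can also be pulled out programmatically after validation; a minimal sketch, assuming the attribute names below (`box.ap50`, `box.ap_class_index`, `box.map50`) behave as in recent Ultralytics releases and using coco128.yaml purely as an example dataset:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")              # or your own trained weights
metrics = model.val(data="coco128.yaml")

# Per-class AP@0.5 (one value per class that appears in the val set)
for class_idx, ap50 in zip(metrics.box.ap_class_index, metrics.box.ap50):
    print(f"{metrics.names[class_idx]:>15}: AP50 = {ap50:.3f}")

# The "all" row corresponds to the mean over classes, i.e. mAP
print(f"mAP50:    {metrics.box.map50:.3f}")
print(f"mAP50-95: {metrics.box.map:.3f}")
```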
-
It would be useful to also have Average Recall in the output, in addition to the Average Precision. Please take this into consideration! In many tasks, it is more important to detect as many objects as possible rather than detect fewer but more accurately. And as you know, the Recall metric alone is not enough. Thank you :)
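This is not COCO-style Average Recall, but a mean and per-class recall is already computed during validation and can be read from the metrics object; a minimal sketch, assuming `metrics.box.mr` and `metrics.box.r` exist as in recent Ultralytics versions (the weights path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("path/to/best.pt")   # hypothetical path to your trained weights
metrics = model.val()

print(f"Mean precision: {metrics.box.mp:.3f}")
print(f"Mean recall:    {metrics.box.mr:.3f}")

# Per-class recall at the reported operating point
for class_idx, r in zip(metrics.box.ap_class_index, metrics.box.r):
    print(f"{metrics.names[class_idx]:>15}: recall = {r:.3f}")
```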
-
Hi folks! Thanks for the framework! I last used the YOLOv5 source code, and I can say the newly architected framework is way more convenient. I am wondering, what is the strategy if I want to step away from the standard algorithm for selecting the "best" model? Say, to change the fitness() function somehow, or to choose based on F1 / ROC AUC?
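One lightweight option, without touching the trainer, is to re-validate the saved epoch checkpoints after training and pick the one that maximizes your own criterion. A minimal sketch, assuming training was run with `save_period` so per-epoch checkpoints exist, that `data.yaml` is your dataset file, and that `metrics.box.f1` holds per-class F1 scores as in recent Ultralytics versions:

```python
from pathlib import Path
import numpy as np
from ultralytics import YOLO

ckpt_dir = Path("runs/detect/train/weights")   # adjust to your run directory
best_ckpt, best_score = None, -1.0

for ckpt in sorted(ckpt_dir.glob("epoch*.pt")):
    metrics = YOLO(ckpt).val(data="data.yaml", verbose=False)
    # Custom selection criterion: mean F1 over classes instead of the default
    # fitness (which, to my understanding, weights mAP50 and mAP50-95)
    score = float(np.mean(metrics.box.f1)) if len(metrics.box.f1) else 0.0
    print(f"{ckpt.name}: mean F1 = {score:.3f}")
    if score > best_score:
        best_ckpt, best_score = ckpt, score

print(f"Selected {best_ckpt} with mean F1 {best_score:.3f}")
```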
-
Hello, may I ask how to calculate FPS? My method is, for example: run model.val(split="test"), which outputs "0.5ms pre-processing, 3.2ms inference, 0.0ms loss, 1.2ms post-processing per image", and then calculate FPS = 1000 / (0.5 + 3.2 + 1.2). Is this correct? If so, why might I get different values after measuring twice in a row?
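That arithmetic (1000 divided by the total milliseconds per image) is the usual way to turn the per-image timings into FPS. A minimal sketch that reads the same numbers programmatically, assuming `metrics.speed` is a dict of per-image milliseconds as in recent Ultralytics versions; timings naturally fluctuate between runs due to warm-up, caching, and hardware load, which would explain the differing values:

```python
from ultralytics import YOLO

model = YOLO("path/to/best.pt")          # hypothetical path to your weights
metrics = model.val(split="test")

speed = metrics.speed  # per-image ms, e.g. {'preprocess': 0.5, 'inference': 3.2, ...}
total_ms = speed["preprocess"] + speed["inference"] + speed["postprocess"]

print(f"Per-image latency: {total_ms:.2f} ms")
print(f"End-to-end FPS:    {1000.0 / total_ms:.1f}")
print(f"Inference-only FPS: {1000.0 / speed['inference']:.1f}")
```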
-
Question: Suppose we have annotated 10 objects in an image, but there are actually 20 objects present. When we use a validation set to calculate metrics, if the model detects objects that have not been annotated, will this affect the model's measured performance? Which metrics will be impacted?
-
I've successfully trained a YOLOv8 model. Now I want to evaluate its performance on a separate labeled test dataset. I'm interested in metrics like precision-recall curves. I remember seeing graphs generated during training. Can I produce similar graphs for my test dataset? (Is there any callable method?)
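A minimal sketch of one way to do this, assuming your data YAML defines a `test:` split: `model.val()` accepts a `split` argument and, with `plots=True`, writes the PR/F1/confidence curves and the confusion matrix into its run directory.

```python
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")   # your trained weights

# Validate on the held-out test split and save the evaluation plots
metrics = model.val(data="data.yaml", split="test", plots=True)

print(metrics.box.map50, metrics.box.map)   # mAP50 and mAP50-95 on the test split
print(metrics.save_dir)                     # run directory where the plots are saved
```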
-
The question I want to ask here is whether the Precision and Recall values are the bounding-box Precision and Recall (i.e. the overall box precision across all classes), or whether they represent the class-specific Precision calculated as TP / (TP + FP) for each class separately.
-
Hello YOLO Ultralytics team, I have my model ready: I trained it on 30,000 objects across 3,165 images for 300 epochs, and the validation statistics are as follows. As you can see, the false positive rate is still high; what are some ways to fix it? Also, when should I stop training and lock in my final trained model, so that I can start deploying it on the whole dataset? In other words, at which stage can I trust that my model is well trained?
-
I want to ask: when running validation with the following command, why is the result different from (lower than) the output confusion_matrix.png? And what does Box P mean in the validation output? Look at the pistol class: why are the results different?
-
Hello Team,
-
guides/yolo-performance-metrics/
A comprehensive guide on various performance metrics related to YOLOv8, their significance, and how to interpret them.
https://docs.ultralytics.com/guides/yolo-performance-metrics/