Performance Analysis of Glass Surface Detection Based on the YOLO Series
Abstract
Glass surface detection poses a significant challenge in computer vision due to transparency, reflections, and complex optical effects. This study systematically evaluates seven YOLO models (YOLOv5 to YOLOv12) on glass surface detection, comparing their detection accuracy, computational efficiency, and real-time capability on a unified dataset. Experimental results show that YOLOv9 achieves the highest detection accuracy, with 81.1% precision and 79.6% mean average precision at an IoU threshold of 0.5 (mAP@0.5). However, its deep architecture (489 layers) results in a lower inference speed (500 FPS) and a longer training time (12.08 hours). YOLOv11 strikes the best balance between efficiency and accuracy, offering the lowest computational cost (6.3 GFLOPs) together with competitive recall (71.9%) and F1-score (0.76), which makes it well suited to edge-computing scenarios. YOLOv6 (666 FPS) and YOLOv8 (588 FPS) excel in real-time performance but incur trade-offs in missed detection rate (25.2%) and false positive rate (7.3%). This work further explores optimization directions for the YOLO series in lightweight architecture design, multimodal fusion, and adaptability to data distributions, providing theoretical support for deploying perception systems on mobile robots. We have open-sourced our dataset at https://github.com/chaoyanSEU/Glass_surface_detect_dataset to benefit the research community.
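For reference, the accuracy figures above are assumed to follow the standard object-detection metric definitions; the summary below is the editor's own wording rather than an excerpt from the paper, with TP, FP, and FN denoting true positives, false positives, and false negatives, and AP_i the average precision of class i among N classes:

\[
P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2PR}{P + R},
\]
\[
\text{mAP@0.5} = \frac{1}{N} \sum_{i=1}^{N} AP_i, \quad \text{with each } AP_i \text{ computed at an IoU threshold of } 0.5.
\]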