A Smooth Object Specular Highlight Removal Method based on pix2pixHD Model

Xin Du; Xiaogang Wang; Zhiqiang Feng; Keyu Chen; Guoquan Liu

Journal of Vibration Testing and System Dynamics

C. Steve Suh (editor), Pawel Olejnik (editor), Xianguo Tuo (editor)

Journal of Vibration Testing and System Dynamics 9(4) (2025) 347–360 | DOI:10.5890/JVTSD.2025.12.004

Xin Du$^{1,2}$, Xiaogang Wang$^{1,2}$, Zhiqiang Feng$^{1,2}$, Keyu Chen$^{1,2}$, Guoquan Liu$^{2,3}$

$^{1}$ School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin, 644000, China

$^{2}$ Artificial Intelligence Key Laboratory of Sichuan Province, Yibin, 644000, China

$^{3}$ School of Mechanical and Electronic Engineering, East China University of Technology, Nanchang, 330013, China

Abstract

High-quality images are fundamental to computer vision tasks. In practice, however, specular highlights obscure the texture and color information on object surfaces and significantly degrade image quality. Existing highlight removal methods either remove highlights poorly or, when genuine object features lie inside the highlight regions, remove those features along with the highlights. To address this issue, this paper proposes a highlight removal method for smooth objects based on the pix2pixHD model. First, a highlight dataset is created to match the characteristics of the pix2pixHD model, with each highlight image paired with a corresponding real highlight-free image. Second, a highlight removal network is designed with pix2pixHD as the base framework. During training, discriminators at two scales evaluate the highlight-free images produced by the generator, giving the generator more comprehensive feedback. This encourages the generator to preserve the original features of objects beneath highlighted areas while converting highlight images into highlight-free images. In addition, a perceptual loss function is incorporated to improve the visual quality of the generated highlight-free images. Experimental results show that the proposed method removes specular highlights effectively in visual terms while preserving the original features of objects beneath highlighted areas, and that it performs favorably in terms of PSNR and SSIM.
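
The abstract reports PSNR as one of its evaluation metrics but does not spell out the evaluation protocol. As a minimal sketch of the standard PSNR definition only, assuming 8-bit grayscale images stored as lists of rows (the function name `psnr` and the `max_value` default are illustrative choices, not taken from the paper), the metric could be computed as follows:

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are lists of rows of pixel intensities in [0, max_value].
    This is an illustrative helper, not the paper's evaluation code.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same height")

    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("images must have the same width")
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Here `reference` would be the real highlight-free image and `estimate` the generator output; a higher value means the output is closer to the reference. For color images, one common convention is to average the squared error over all channels before taking the logarithm.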

References

[1] Gevers, T. and Stokman, H. (2003), Classifying color edges in video into shadow-geometry, highlight, or material transitions, IEEE Transactions on Multimedia, 5(2), 237-243.

[2]	Bhagya, C. and Shyna, A. (2019), An overview of deep learning based object detection techniques, 2019 1st International Conference on Innovations in Information and Communication Technology, 1-6.

[3] Zhang, C., Li, D., Qi, J., Liu, J., and Wang, Y. (2021), Infrared small target detection method with trajectory correction fuze based on infrared image sensor, Sensors, 21(13), 4522.

[4]	Arbelaez, P., Maire, M., Fowlkes, C., and Malik, J. (2010), Contour detection and hierarchical image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898-916.

[5] Münzer, B., Schoeffmann, K., and Böszörmenyi, L. (2018), Content-based processing and analysis of endoscopic images and videos: A survey, Multimedia Tools and Applications, 77, 1323-1362.

[6]	Tao, M.W., Su, J.C., Wang, T.C., Malik, J., and Ramamoorthi, R. (2015), Depth estimation and specular removal for glossy surfaces using point and line consistency with light-field cameras, IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(6), 1155-1169.

[7] Dai, P., Zhang, H., and Cao, X. (2019), Deep multi-scale context aware feature aggregation for curved scene text detection, IEEE Transactions on Multimedia, 22(8), 1969-1984.

[8]	Ren, X., Zhou, Y., He, J., Chen, K., Yang, X., and Sun, J. (2016), A convolutional neural network-based Chinese text detection algorithm via text structure modeling, IEEE Transactions on Multimedia, 19(3), 506-518.

[9]	Xue, M., Shivakumara, P., Zhang, C., Xiao, Y., Lu, T., Pal, U., Lopresti, D., and Yang, Z. (2020), Arbitrarily-oriented text detection in low light natural scene images, IEEE Transactions on Multimedia, 23, 2706-2720.

[10] Fu, G., Zhang, Q., Song, C., Lin, Q., and Xiao, C. (2019), Specular highlight removal for real-world images, Computer Graphics Forum, 253-263.

[11] Shafer, S.A. (1985), Using color to separate reflection components, Color Research & Application, 10(4), 210-218.

[12] Tan, R.T. and Ikeuchi, K. (2005), Separating reflection components of textured surfaces using a single image, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(2), 178-193.

[13] Guo, J., Zhou, Z., and Wang, L. (2018), Single image highlight removal with a sparse and low-rank reflection model, Proceedings of the European Conference on Computer Vision, 268-283.

[14] Yang, Q., Tang, J., and Ahuja, N. (2014), Efficient and robust specular highlight removal, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(6), 1304-1311.

[15]	Lu, Q., Fauvet, E., Zakharova, A., and Laligant, O. (2015), Entire reflective object surface structure understanding based on reflection motion estimation, Pattern Recognition Letters, 68, 176-182.

[16]	Li, R., Pan, J., Si, Y., Yan, B., Hu, Y., and Qin, H. (2019), Specular reflections removal for endoscopic image sequences with adaptive-RPCA decomposition, IEEE Transactions on Medical Imaging, 39(2), 328-340.

[17]	Feng, W., Cheng, X., Li, X., Liu, Q., and Zhai, Z. (2024), Specular highlight removal based on dichromatic reflection model and priority-based adaptive direction with light field camera, Optics and Lasers in Engineering, 172, 107856.

[18]	Muhammad, S., Dailey, M.N., Farooq, M., Majeed, M.F., and Ekpanyapong, M. (2020), Spec-Net and Spec-CGAN: Deep learning models for specularity removal from faces, Image and Vision Computing, 93, 103823.

[19]	Shi, J., Dong, Y., Su, H., and Yu, S.X. (2017), Learning non-lambertian object intrinsics across shapenet categories, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1685-1694.

[20] Hu, G., Zheng, Y., Yan, H., Hua, G., and Yan, Y. (2022), Mask-guided cycle-GAN for specular highlight removal, Pattern Recognition Letters, 161, 108-114.

[21]	Shen, Z., Dang, H., Sun, M., and Zhou, X. (2019), Application of generating adversarial networks in high-light removal of wheel hub surface, 2019 International Conference on Advanced Mechatronic Systems, 12-15.

[22]	Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017), Image-to-image translation with conditional adversarial networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1125-1134.

[23]	Fu, G., Zhang, Q., Lin, Q., Zhu, L., and Xiao, C. (2020), Learning to detect specular highlights from real-world images, Proceedings of the 28th ACM International Conference on Multimedia, 1873-1881.

[24]	Fu, G., Zhang, Q., Zhu, L., Li, P., and Xiao, C. (2021), A multi-task network for joint specular highlight detection and removal, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7752-7761.

[25] Guo, S., Wang, X., Zhou, J., and Lian, Z. (2022), A fast specular highlight removal method for smooth liquor bottle surface combined with U2-Net and LaMa model, Sensors, 22(24), 9834.

[26]	Qin, X., Zhang, Z., Huang, C., Dehghan, M., Zaiane, O.R., and Jagersand, M. (2020), U2-Net: Going deeper with nested U-structure for salient object detection, Pattern Recognition, 106, 107404.

[27] Suvorov, R., Logacheva, E., Mashikhin, A., Remizova, A., Ashukha, A., Silvestrov, A., Kong, N., Goka, H., Park, K., and Lempitsky, V. (2022), Resolution-robust large mask inpainting with Fourier convolutions, Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2149-2159.

[28]	Fu, G., Zhang, Q., Zhu, L., Xiao, C., and Li, P. (2023), Towards high-quality specular highlight removal by leveraging large-scale synthetic data, Proceedings of the IEEE/CVF International Conference on Computer Vision, 12857-12865.

[29]	Wu, Z., Guo, J., Zhuang, C., Xiao, J., Yan, D.M., and Zhang, X. (2023), Joint specular highlight detection and removal in single images via Unet-Transformer, Computational Visual Media, 9(1), 141-154.

[30]	Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., and Catanzaro, B. (2018), High-resolution image synthesis and semantic manipulation with conditional GANs, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 8798-8807.

[31] Johnson, J., Alahi, A., and Fei-Fei, L. (2016), Perceptual losses for real-time style transfer and super-resolution, Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part II 14, 694-711.

[32] Simonyan, K. and Zisserman, A. (2014), Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556.

[33]	Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., and Berg, A.C. (2015), ImageNet large scale visual recognition challenge, International Journal of Computer Vision, 115, 211-252.