Nighttime Pedestrian Detection Using Faster R-CNN and Infrared Images


Abstract

This article presents a nighttime pedestrian detection system for vehicle safety applications. For this development, the performance of the Faster R-CNN algorithm was analyzed on far-infrared images, which revealed shortcomings in detecting pedestrians at long distances. Consequently, a new Faster R-CNN architecture dedicated to multi-scale detection is presented, based on two region-of-interest (ROI) generators dedicated to short- and long-distance pedestrians, called RPNCD and RPNLD, respectively. This architecture was compared against the Faster R-CNN models that have shown the best results, namely VGG-16 and ResNet-101. Experiments were carried out on the CVC-09 and LSIFIR databases and demonstrated improvements, especially in long-distance pedestrian detection, with a miss rate versus FPPI of 16% and, on the Precision vs. Recall curve, an AP of 89.85% for the pedestrian class and an mAP of 90% over the test sets of the LSIFIR and CVC-09 databases.
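The reported AP and mAP figures summarize the precision-recall curve of the detector. As a rough sketch (not the authors' evaluation code), average precision can be computed from a precision-recall curve by all-point interpolation, with mAP being the mean of the per-class APs:

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve (all-point interpolation).

    recalls/precisions: paired values for detections ranked by score.
    Illustrative sketch only; the paper's exact evaluation protocol
    may differ in interpolation details.
    """
    # Pad the curve at recall 0 and 1.
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the area of each recall step.
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of per-class average precisions."""
    return sum(ap_per_class) / len(ap_per_class)
```

Under this convention, the 90% mAP reported above would be the mean of the per-class APs over the combined LSIFIR and CVC-09 test sets.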

Article Details

Section
Scientific Article
