Comparative Study of Markerless Vision-Based Gait Analyses for Person Re-Identification
Abstract
1. Introduction
2. Related Work
2.1. Gait Analysis
2.2. Person Re-Identification
2.3. 2D and 3D Key Joints Estimation
3. Method and Problem Formulation
3.1. 2D Pose Estimation from 2D Video Images
3.2. 3D Pose Estimation from 2D Pose Keypoints
- The joint index numbers of the COCO dataset are as follows: 0-nose, 1-neck, 2-right shoulder, 3-right elbow, 4-right wrist, 5-left shoulder, 6-left elbow, 7-left wrist, 8-right hip, 9-right knee, 10-right ankle, 11-left hip, 12-left knee, 13-left ankle, 14-right eye, 15-left eye, 16-right ear, 17-left ear.
- The joint index numbers of the H3.6M dataset are as follows: 0-midpoint of the hips, 1-right hip, 2-right knee, 3-right ankle (foot), 6-left hip, 7-left knee, 8-left ankle, 12-spine, 13-thorax, 14-neck/nose, 15-head, 17-left shoulder, 18-left elbow, 19-left wrist, 25-right shoulder, 26-right elbow, 27-right wrist.
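The two index conventions listed above can be written down as lookup tables, which makes it easier to see which joints they share. The following is a minimal sketch, not the authors' code: the dictionary names, the shortened labels "hip midpoint" and "right ankle", and the helper `shared_joint_indices` are assumptions introduced for illustration.

```python
# Index -> joint name, transcribed from the two lists above.
COCO_JOINTS = {
    0: "nose", 1: "neck",
    2: "right shoulder", 3: "right elbow", 4: "right wrist",
    5: "left shoulder", 6: "left elbow", 7: "left wrist",
    8: "right hip", 9: "right knee", 10: "right ankle",
    11: "left hip", 12: "left knee", 13: "left ankle",
    14: "right eye", 15: "left eye", 16: "right ear", 17: "left ear",
}

H36M_JOINTS = {
    0: "hip midpoint", 1: "right hip", 2: "right knee", 3: "right ankle",
    6: "left hip", 7: "left knee", 8: "left ankle",
    12: "spine", 13: "thorax", 14: "neck/nose", 15: "head",
    17: "left shoulder", 18: "left elbow", 19: "left wrist",
    25: "right shoulder", 26: "right elbow", 27: "right wrist",
}


def shared_joint_indices():
    """Map each joint name found in both conventions to (coco_idx, h36m_idx)."""
    h36m_by_name = {name: idx for idx, name in H36M_JOINTS.items()}
    return {
        name: (coco_idx, h36m_by_name[name])
        for coco_idx, name in COCO_JOINTS.items()
        if name in h36m_by_name
    }


pairs = shared_joint_indices()
print(pairs["right knee"])  # (9, 2)
```

Only the limb joints match by name in this sketch. Joints such as the H3.6M hip midpoint, spine, and thorax have no direct COCO counterpart in the lists above, so they would have to be derived or predicted separately.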
3.3. Gait Features
3.4. Gait Cycle Extraction
3.4.1. Interpolation and Smoothing
3.4.2. Getting Local Maxima and Minima
- Create a moving window. Let N be the window size.
- Calculate the slope between the center point and each point to its left within the window.
- Calculate the slope between the center point and each point to its right within the window.
- If all left slopes are positive and all right slopes are negative, then the center point is a local maximum.
- If all left slopes are negative and all right slopes are positive, then the center point is a local minimum.
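The window-based procedure above can be sketched as follows. This is an illustrative reading of the steps, assuming a uniformly sampled one-dimensional signal, an odd window size N, and slopes measured in the direction of increasing sample index; the function name `find_local_extrema` is hypothetical.

```python
import numpy as np


def find_local_extrema(signal, window_size):
    """Return (maxima, minima) index lists using the slope test in a moving window."""
    if window_size < 3:
        raise ValueError("window_size must be at least 3")
    y = np.asarray(signal, dtype=float)
    half = window_size // 2
    offsets = np.arange(1, half + 1)  # distances from the center point
    maxima, minima = [], []
    for c in range(half, len(y) - half):
        # Slopes of the lines from each left point up to the center point.
        left_slopes = (y[c] - y[c - offsets]) / offsets
        # Slopes of the lines from the center point to each right point.
        right_slopes = (y[c + offsets] - y[c]) / offsets
        if np.all(left_slopes > 0) and np.all(right_slopes < 0):
            maxima.append(c)
        elif np.all(left_slopes < 0) and np.all(right_slopes > 0):
            minima.append(c)
    return maxima, minima


triangle_wave = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0]
print(find_local_extrema(triangle_wave, window_size=5))  # ([3, 9], [6])
```

A larger window rejects small ripples that survive smoothing, at the cost of ignoring extrema that lie within half a window of either end of the signal.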
3.5. Feature-Based Approach
3.5.1. Dynamic Gait Features
- The angles of the upper leg (thigh) relative to the vertical.
- The angles of the lower leg (calf) relative to the upper leg (thigh).
- The angles of the ankle relative to the horizontal.
- The means and standard deviations of the horizontal and vertical distances between the feet and knees and between the knees and shoulders.
- The mean area of the triangle formed by the root joint and the two feet.
- The step length: the maximum distance between two feet.
- The gait cycle time: from (a) to (h) in Figure 5.
- The gait velocity: twice the step length divided by the gait cycle time.
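Several of the features listed above reduce to short geometric formulas. The sketch below shows one possible way to compute them from 2D joint coordinates; it assumes image coordinates in which the vertical direction is (0, 1), and all function names are hypothetical rather than taken from the paper.

```python
import numpy as np


def angle_between(u, v):
    """Unsigned angle in degrees between two 2D vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cross = u[0] * v[1] - u[1] * v[0]
    dot = float(np.dot(u, v))
    return float(np.degrees(np.arctan2(abs(cross), dot)))


def thigh_angle(hip, knee, vertical=(0.0, 1.0)):
    """Angle of the upper leg (hip -> knee) relative to the vertical."""
    return angle_between(np.subtract(knee, hip), vertical)


def knee_angle(hip, knee, ankle):
    """Angle of the lower leg (knee -> ankle) relative to the upper leg."""
    return angle_between(np.subtract(knee, hip), np.subtract(ankle, knee))


def triangle_area(root, left_foot, right_foot):
    """Area of the triangle spanned by the root joint and the two feet."""
    a = np.subtract(left_foot, root).astype(float)
    b = np.subtract(right_foot, root).astype(float)
    return 0.5 * abs(a[0] * b[1] - a[1] * b[0])


def gait_velocity(step_length, cycle_time):
    """Twice the step length divided by the gait cycle time."""
    return 2.0 * step_length / cycle_time


# A straight leg hanging directly below the hip gives zero for both angles.
print(thigh_angle((0, 0), (0, 1)), knee_angle((0, 0), (0, 1), (0, 2)))
```

Per-cycle means and standard deviations would then be taken over the frames of one extracted gait cycle.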
3.5.2. Anthropometric Static Features
3.5.3. Gait Feature Sets
- max_Rdegree: the maximum right knee flexion
- max_Ldegree: the maximum left knee flexion
- min_Rdegree: the minimum right knee flexion
- min_Ldegree: the minimum left knee flexion
- initial_contact_hip_extension
- initial_contact_left_knee_extension
- initial_contact_left_leg_inclination
- initial_swing_knee_flextion
- mid_stance_knee_flextion
- terminal_stance_hip_extension
- terminal_stance_right_knee_flextion
- terminal_stance_right_leg_inclination
- terminal_stance_left_leg_inclination
- terminal_swing_hip_extension
- terminal_swing_right_leg_inclination
- upper_body:
- right_lower_leg: the right calf w.r.t
- right_upper_leg: the right thigh w.r.t
- left_lower_leg: the left calf w.r.t
- left_upper_leg: the left thigh w.r.t
- left_stride =
- right_stride =
- RFoot_period: frames where the right foot is ahead of the left.
- LFoot_period: frames where the left foot is ahead of the right.
- period: the total frames in a gait cycle.
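The three temporal features at the end of this list are frame counts, so they can be illustrated with a few lines of code. This is a sketch under the assumption that each foot's position is given per frame along the walking direction, with larger values meaning further ahead; the function name `foot_periods` is hypothetical.

```python
import numpy as np


def foot_periods(right_foot_pos, left_foot_pos):
    """Count frames in one gait cycle by which foot is ahead.

    Both inputs hold one position per frame along the walking direction.
    """
    right = np.asarray(right_foot_pos, dtype=float)
    left = np.asarray(left_foot_pos, dtype=float)
    if right.shape != left.shape:
        raise ValueError("both feet need the same number of frames")
    return {
        "RFoot_period": int(np.sum(right > left)),  # right foot ahead
        "LFoot_period": int(np.sum(left > right)),  # left foot ahead
        "period": int(right.size),                  # total frames in the cycle
    }


# Six frames: right foot leads in the first three, left foot in the last two.
print(foot_periods([3, 4, 5, 5, 5, 5], [1, 1, 1, 5, 6, 7]))
```

Frames in which both feet are level are counted in `period` only, so the two per-foot counts need not sum to the cycle length.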
3.6. Spatiotemporal-Based Approach
- Left ankle, knee, and hip: , ,
- Right ankle, knee, and hip: , ,
3.7. Classification
3.7.1. Feature-Based Approach
3.7.2. Spatiotemporal-Based Approach
3.8. Datasets
4. Experimental Results
4.1. Training Ensemble Methods
4.2. Training Siamese-LSTM Network
4.3. Classification Performance Comparison
5. Discussion
5.1. Feature Study
5.2. Further Thoughts on Feature-Based Approach
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Dataset | Random Forest Acc. | Random Forest F1 | XGBoost Acc. | XGBoost F1 | CatBoost Acc. | CatBoost F1 | Siamese Acc.
---|---|---|---|---|---|---|---
MARS | 81% | 80% | 82% | 79% | 83% | 80% | 99%
CASIA-A | 32% | 46% | 41% | 40% | 34% | 30% | 95%
Rank | Random Forest | XGBoost | CatBoost
---|---|---|---
1 | period | left_stride | period
2 | right_stride | right_stride | right_stride
3 | left_stride | period | left_upper_leg
4 | LFoot_period | left_upper_leg | left_stride
5 | RFoot_period | terminal_swing_hip_extension | terminal_stance_hip_extension
6 | right_upper_leg | terminal_stance_hip_extension | max_Ldegree
7 | min_Ldegree | left_lower_leg | LFoot_period
8 | terminal_swing_hip_extension | max_Ldegree | main_foot
9 | min_Ldegree | left_lower_leg | LFoot_period
10 | terminal_stance_hip_extension | right_upper_leg | terminal_swing_hip_extension
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kwon, J.; Lee, Y.; Lee, J. Comparative Study of Markerless Vision-Based Gait Analyses for Person Re-Identification. Sensors 2021, 21, 8208. https://doi.org/10.3390/s21248208