Abstract
Introduction: Sign language is the primary means of communication for speech-impaired people, but most hearing people do not understand it, which creates a communication barrier. In this paper, we present a system that captures hand gestures with a Kinect camera and classifies each gesture into its corresponding symbol.
Methods: We used a Kinect camera rather than an ordinary web camera because an ordinary camera does not capture the 3D orientation or depth of an image, whereas the Kinect captures depth information, which makes the classification more accurate.
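As an illustration of how an RGB-depth image pair might be grabbed from the Kinect, the sketch below uses the OpenKinect libfreenect Python bindings; the choice of library, the image resolution, and the output file names are our assumptions and are not stated in the paper.

```python
# Minimal sketch: grab one RGB + depth frame from a Kinect v1 using the
# OpenKinect libfreenect Python bindings (an assumption; the paper does not
# state which driver or API was used).
import freenect
import cv2

def grab_rgb_depth_pair():
    rgb, _ = freenect.sync_get_video()    # HxWx3 uint8 RGB frame
    depth, _ = freenect.sync_get_depth()  # HxW uint16 depth map (11-bit values)
    return rgb, depth

if __name__ == "__main__":
    rgb, depth = grab_rgb_depth_pair()
    # Scale the 11-bit depth values to 8 bits purely for storage/visualization.
    depth_8bit = cv2.convertScaleAbs(depth, alpha=255.0 / 2047.0)
    cv2.imwrite("gesture_rgb.png", cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR))
    cv2.imwrite("gesture_depth.png", depth_8bit)
```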
Results: The Kinect camera produces distinct images for the hand gestures '2' and 'V', and similarly for '1' and 'I', whereas a simple web camera cannot distinguish between them. We used hand gestures from Indian Sign Language (ISL); our dataset contained 46,339 RGB images and 46,339 depth images. 80% of the images were used for training and the remaining 20% for testing. In total, 36 hand gestures were considered: 26 for the alphabets A-Z and 10 for the numerals.
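A minimal sketch of the 80/20 split described above, assuming the images have already been loaded into NumPy arrays with one integer label per gesture class; the loading step, array shapes, and variable names are ours, not the authors':

```python
# Hypothetical 80/20 train/test split of the gesture dataset using scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholders: in practice these would be the 46,339 depth (or RGB) images
# and their 36 class labels (26 letters + 10 numerals).
X = np.random.rand(1000, 64, 64)          # dummy depth images
y = np.random.randint(0, 36, size=1000)   # dummy labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```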
Conclusion: Along with a real-time implementation, we have compared the performance of various machine learning models and found that a CNN working on depth images achieves higher accuracy than the other models. All these results were obtained on the PYNQ Z2 board.
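The paper's abstract does not give the CNN architecture, so the block below is only a plausible small CNN for 36-class depth-image classification; the use of Keras, the layer sizes, and the 64x64 single-channel input are all assumptions.

```python
# Illustrative (not the authors') small CNN for 36-class depth-image
# classification; architecture and hyperparameters are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def build_depth_cnn(input_shape=(64, 64, 1), num_classes=36):
    return keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_depth_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train[..., None], y_train, validation_split=0.1, epochs=10)
```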
Discussion: We performed dataset labeling, training, and classification on the PYNQ Z2 FPGA board for static images using SVM, logistic regression, KNN, multilayer perceptron, and random forest algorithms. For this experiment, we used our own four different datasets of ISL alphabets prepared in our lab, and we analyzed both RGB and depth images.
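As a rough illustration of the model comparison described in the Discussion, the sketch below fits the five classical models with scikit-learn and reports test accuracy. The default hyperparameters, the flattening of images into feature vectors, and the reuse of X_train/X_test from the split sketch above are our assumptions, not the authors' configuration.

```python
# Hypothetical comparison of the five classical classifiers mentioned above,
# using scikit-learn defaults; not the authors' exact setup.
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

models = {
    "SVM": SVC(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "MLP": MLPClassifier(max_iter=500),
    "Random forest": RandomForestClassifier(),
}

# Flatten images into (n_samples, n_features) feature vectors.
Xtr = X_train.reshape(len(X_train), -1)
Xte = X_test.reshape(len(X_test), -1)

for name, clf in models.items():
    clf.fit(Xtr, y_train)
    acc = accuracy_score(y_test, clf.predict(Xte))
    print(f"{name}: {acc:.3f}")
```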
Keywords: Computer vision, Kinect camera, PYNQ-Z2, sign language, depth images, hand gestures
Graphical Abstract