International Journal of Scientific & Engineering Research, Volume 4, Issue 12, December-2013
ISSN 2229-5518
Sign Language Recognition System
Mayuresh Keni, Shireen Meher, Aniket Marathe.
Abstract— Sign language is the only means of communication for speech and hearing impaired (i.e. deaf and dumb) people. The main problem with this mode of communication is that people who do not understand sign language cannot communicate with the speech and hearing impaired, and vice versa. Our project aims to bridge this gap between speech and hearing impaired people and the rest of society. The basic idea of this project is to make a system with which dumb people can communicate with everyone else using their normal gestures. The system does not require the background to be perfectly black; it works on any background. The project uses image processing to identify English alphabet signs used by deaf people to communicate, and converts them into text so that normal people can understand.
Dumb people are usually deprived of normal communication with other people in society, and normal people find it difficult to understand and communicate with them. These people have to rely on an interpreter or on some form of visual communication. An interpreter is not always available, and visual communication is mostly difficult to understand.
Sign Language is the primary means of communication in the deaf and dumb community. As a normal person is unaware of the grammar and meaning of the various gestures that make up a sign language, its use is mostly limited to the deaf and dumb community and their families.
In this age of technology, it is essential to make these people feel part of society by helping them communicate smoothly. Hence, an intelligent computer system needs to be developed and taught to recognize sign language. Researchers have been attacking this problem for quite some time and the results are showing promise. Although interesting technologies have been developed for speech recognition, no real commercial product for sign language recognition is available in the current market.
Most research in this field has been done using glove-based systems. In a glove-based system, sensors such as potentiometers and accelerometers are attached to each finger; based on their readings, the corresponding alphabet is displayed. Christopher Lee and Yangsheng Xu developed a glove-based gesture recognition system that was able to recognize 14 letters of the hand alphabet, learn new gestures, and update the model of each gesture in the system online. Over the years, more advanced glove devices have been designed, such as the Sayre Glove, the Dexterous Hand Master and the Power Glove.
The main problem faced by these glove-based systems is that they have to be recalibrated every time a new user uses the system. The connecting wires also restrict the user's freedom of movement.

Sign language recognition has also been implemented using Image Processing instead of gloves. The problem with earlier image-processing systems, however, was that the background compulsorily had to be black, otherwise the system would not work. Some of these systems also required colored bands to be worn on the fingertips so that the fingertips could be identified by the Image Processing unit.
We are implementing our project using Image Processing. The main advantage of our project is that it is not restricted to a black background: it can be used with any background. Wearing colored bands is also not required in our system.
In this paper we present a robust and efficient method of sign language detection. Instead of using data gloves, we perform the detection by image processing. The main advantage of image processing over data gloves is that the system does not need to be recalibrated when a new user uses it. Also, by applying a threshold value while converting the image from Grayscale to Binary form, the system can be used against any background and is not restricted to a black or white background.
The algorithm section shows the overall architecture and idea of the system. The image is captured using a webcam mounted on the shoulder of the speech and hearing impaired person. The captured image is sent to the computer, which processes it as explained below and displays the corresponding text. The captured image is in RGB form. It is first converted into Grayscale, which is then converted into binary form. The X and Y coordinates of the image are calculated from the binary form of the image. These coordinates are then compared with the coordinates of the images in the database. The database is prepared beforehand by taking images of the gestures of the sign language, and a corresponding text is assigned to each gesture. When the coordinates of the captured image match those of a database image, the corresponding text is displayed.
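As a rough illustration of the coordinate step, the sketch below (in Python with the NumPy library, which the paper does not name) computes the centroid and bounding box of the white pixels in the binary image. The paper does not specify exactly which X and Y coordinates are used, so this particular choice is an assumption.

    import numpy as np

    def hand_coordinates(binary):
        # Row and column indices of all white (hand) pixels.
        ys, xs = np.nonzero(binary)
        if xs.size == 0:
            return None                      # no hand visible in the frame
        centroid = (xs.mean(), ys.mean())    # average X and Y position
        box = (xs.min(), ys.min(), xs.max(), ys.max())  # bounding box
        return centroid, box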
The most important part of the project is the orientation of the camera. If the camera is not oriented properly, gestures may be misinterpreted and the output will be wrong; hence the orientation must be done carefully. The camera is placed on the shoulder of the speech and hearing impaired (i.e. dumb and deaf) person, positioned so that it faces in the same direction as the user's view.
The image capturing section handles just capturing the image and sending it to the image processing section, which does the processing part of the project. The gesture captured through the webcam has to be properly processed so that it is ready to go through the pattern matching algorithm. The image processing is done as follows.
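A minimal capture step could look like the following sketch, written in Python with the OpenCV library (an assumption, since the paper does not name its tools):

    import cv2

    cap = cv2.VideoCapture(0)   # open the default webcam
    ok, frame = cap.read()      # 'frame' is a color (BGR) image array
    if not ok:
        raise RuntimeError("could not read a frame from the webcam")
    cap.release()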
Red, Green and Blue are the primary colors; all other colors are made by combining them. The gesture captured through the webcam is in color, i.e. RGB, form. This image cannot be used directly for comparison, as an algorithm comparing two RGB images would be very complex, and all the background must be removed from the captured image. For this, the image is first converted into Grayscale and then to binary, as explained below [3].

The RGB image is converted into Grayscale because Grayscale carries only intensity information, varying from black at the weakest intensity to white at the strongest. Applying a threshold to convert it into a binary image then becomes much easier [3].

A binary image consists of just two colors, white and black, i.e. just two gray levels. It is important to convert the image into binary so that comparing two images, the captured image and an image in the database, is easy. The Grayscale image is converted into binary by applying a threshold: gray levels below the threshold are converted into black, while the ones above it are converted into white. The threshold value is selected so that it corresponds to skin color, separating the hand from the background. Thresholding thus removes the background and keeps just the hand in the image, so the system is not restricted to a black or white background and can work against any background [3].
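A sketch of this Grayscale and binary conversion, again assuming Python and OpenCV, with an illustrative threshold value (the paper only states that the threshold is chosen to isolate skin color):

    import cv2

    def to_binary(frame, threshold=110):
        # Grayscale keeps only intensity information.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Pixels above the threshold become white (hand), the rest black.
        _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
        return binary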
In this section the input image, converted into binary form, is compared with the images present in the database. The binary images consist of just two gray levels, so the captured image and an image in the database can be compared easily. Images in the database are also binary images. A comparison algorithm compares the captured image with every image in the database. A single gesture is also captured from more than two angles so that the accuracy of the system is increased. The pixels of the captured image are compared with the pixels of each database image; if 90 percent of the pixel values match, the corresponding text is displayed on the LCD, otherwise the captured image is compared with the next image in the database. This process continues until a match is found. If no match is found, that image is discarded and the next image is considered for pattern matching.
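The comparison described above might be sketched as follows. The 90 percent figure comes from the paper, while the data layout (a list of template/text pairs, all the same size as the captured image) is an assumption:

    import numpy as np

    MATCH_RATIO = 0.90    # 90 percent of pixel values must agree

    def find_gesture(captured, database):
        # 'database' holds (binary_template, text) pairs.
        for template, text in database:
            agreement = np.mean(captured == template)  # fraction of matching pixels
            if agreement >= MATCH_RATIO:
                return text    # match found: this text is displayed
        return None            # no match: the image is discarded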
A proper database of the gestures of the sign language is required so that the images captured while communicating with this system can be compared against it. While building the database, each gesture is captured from more than two angles so that the accuracy of the system increases significantly. The more angles are captured, the better the accuracy, and the more memory is required. If the user's hand alignment differs from the one stored in the database for the same gesture, the system would produce an error; capturing the same gesture from more than two angles removes this error.
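One way to build such a database is sketched below. The 'to_binary' helper is the one sketched earlier, and the enrolment loop, gesture names and angle count are illustrative assumptions:

    import cv2

    def build_database(gestures, angles=3):
        # 'gestures' maps a gesture name to the text it should produce.
        cap = cv2.VideoCapture(0)
        database = []
        for name, text in gestures.items():
            for i in range(angles):
                input(f"Show gesture '{name}' from angle {i + 1}, then press Enter")
                ok, frame = cap.read()
                if ok:
                    database.append((to_binary(frame), text))
        cap.release()
        return database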
In order to detect hand gestures, data about the hand will have to be collected. A decision has to be made as to the nature and source of this data. Two possible technologies that can provide this information are:
- A glove with sensors attached that measure the position of the finger joints.
- An optical method.
An optical method has been chosen, since it is more practical (many modern computers come with a camera attached), cost effective, and has no moving parts, so it is less likely to be damaged through use.
The first step in any recognition system is collection of
relevant data. In this case the raw image information will have to be processed to differentiate the skin of the hand (and various markers) from the background.
Once the data has been collected it is then possible to use prior information about the hand (for example, the fingers are always separated from the wrist by the palm) to refine the data and remove as much noise as possible. This step is important because, as the number of gestures to be distinguished increases, the data collected has to be more and more accurate and noise free in order to permit recognition. The next step is to take the refined data and determine what gesture it represents. Any recognition system will have to simplify the data to allow calculation in a reasonable amount of time. Obvious ways to simplify the data include translating, rotating and scaling the hand so that it is always presented to the recognition system with the same position, orientation and effective hand-camera distance.
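A minimal sketch of the translation and scaling steps (rotation correction is omitted), under the same Python and OpenCV assumptions as above: the hand is cropped to its bounding box and rescaled to a fixed size so that position and hand-camera distance no longer affect the comparison.

    import cv2
    import numpy as np

    def normalise(binary, size=(64, 64)):
        ys, xs = np.nonzero(binary)          # white (hand) pixels
        if xs.size == 0:
            return cv2.resize(binary, size)  # nothing to crop
        # Crop to the hand's bounding box (translation), then rescale (scaling).
        crop = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        return cv2.resize(crop, size, interpolation=cv2.INTER_NEAREST)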
The output of the sign language recognition is displayed as text in real time. This makes the system more efficient and communication for the hearing and speech impaired easier. The images captured through the webcam are compared and the result of the comparison is displayed at the same time; this feature makes communication simple and delay free.

When the entire project is implemented on a Raspberry Pi, a very small yet powerful computer, the whole system becomes portable and can be taken anywhere. This overcomes the barrier of the user being restricted to communicating near a desktop or laptop.

In sign language recognition using sensors attached to the hands, the system needs to be recalibrated every time the user changes, to suit the hand of the new user. This is not the case when the system is implemented using Image Processing: the output depends on the angles of the fingers and the wrist rather than on the size of the hand. As no special sensors are used in this system, it is also less likely to get damaged.
In future work, the proposed system can be developed and implemented on a Raspberry Pi. The Image Processing part should be improved so that the system can communicate in both directions, i.e. it should be capable of converting normal language to sign language and vice versa. We will try to recognize signs that involve motion. Moreover, we will focus on converting sequences of gestures into text, i.e. words and sentences, and then converting them into audible speech.
Our project aims to make communication between deaf and dumb people and others simpler by introducing a computer into the communication path, so that sign language can be automatically captured, recognized, translated to text and displayed on an LCD. There are various methods for sign language conversion: some use a wired electronic glove, others a vision-based approach. Electronic gloves are costly and one person cannot use the glove of another. In the vision-based approach, different techniques are used to recognize the captured gestures and match them against the gestures in a database. Converting the RGB image to binary and matching it against the database with a comparison algorithm is a simple, efficient and robust technique, sufficiently accurate to convert sign language into text.
The authors would like to thank Mrs. Amruta Chintawar, Assistant Professor in the Electronics Department, Ramrao Adik Institute of Technology, for her spirited guidance and moral support. We thank all faculty members and staff of the Electronics Department and everyone who contributed directly or indirectly to this work. We are also thankful to Mr. Abhijeet Kadam, Assistant Professor in the Electronics Department, Ramrao Adik Institute of Technology, for his guidance in writing this research paper.
[1] Ms. Rashmi D. Kyatanavar, Prof. P. R. Futane, "Comparative Study of Sign Language Recognition Systems", International Journal of Scientific and Research Publications, Volume 2, Issue 6, June 2012, ISSN 2250-3153.
[2] Ravikiran J, Kavi Mahesh, Suhas Mahishi, Dheeraj R, Sudheender S, Nitin V Pujari, "Finger Detection for Sign Language Recognition", Proceedings of the International MultiConference of Engineers and Computer Scientists 2009, Vol I, IMECS 2009, March 18 - 20, 2009, Hong Kong.
[3] Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Pearson (2008).