HAND GESTURE RECOGNITION SYSTEM

By

Student's Name: Kainat Sajid, Registration # 022
Student's Name: Fatima Baig, Registration # 055
Student's Name: Hajira Bibi, Registration # 056

A project report submitted in partial fulfillment of the requirement for the degree of Bachelors in Computer Science

Department of Computer Science
Fatima Jinnah Women University
The Mall, Rawalpindi
2014-2018

CERTIFICATE

It is certified that the contents and form of the thesis entitled "Hand Gesture Recognition System" submitted by Fatima Baig (Reg# 05), Kainat Sajid (Reg# 022), Hajira Bibi (Reg# 07) have been found satisfactory for the requirement of the degree.

Supervisor: ______________________________ (Ms. Nousheen Saba)
Co-Supervisor: ______________________________ (Ms. Fakeeha Jafari)

DEDICATION

To Allah the Almighty & to my Parents and Faculty

ACKNOWLEDGEMENTS

I am deeply thankful to my supervisors Ms. Nousheen Saba and Ms. Fakeeha Jafari for helping me throughout the course in accomplishing my final project. Their guidance, support and motivation enabled me to achieve the objectives of the project.

Chapter # 1 INTRODUCTION

1.1 Project Overview

Human computer interaction (HCI) is an interesting and active area of research.
The basic goal of human computer interaction is to improve the interaction between the user and the computer. Researchers have made great progress in this field and have made human computer interaction more flexible. Human computer interaction is no longer limited to a personal computer with a keyboard and mouse. Interaction between computer and human can take place through different sensory modes such as speech, gesture, and facial and body expressions. Researchers are trying to make this interaction easier and more efficient.
Many well-known researchers have contributed a lot to this field. People with disabilities, such as those who cannot speak, can communicate easily through such recognition systems. Interaction with the computer is therefore not limited to a single mode. The interaction between computer and user can be made more efficient and flexible by systems such as a hand gesture recognition system. This project is a part of human computer interaction. In this project we have tried to show how a system can be made more flexible so that users can access it easily.
Although the current methods used to interact with computers are sufficient for most purposes, some of them are expensive and not everyone can use them easily. For this reason, the main goal of the project is to control some mouse functions with hand gestures rather than by pointing and clicking a mouse or touching a display directly. A laptop keypad or a computer mouse may stop working or fail. This problem can be avoided by using a hand gesture recognition system, for which either a normal webcam or a depth camera (a more expensive type of camera) can be used. With this system the interaction becomes easy and problems caused by a failed mouse or keypad are avoided. This desktop application uses hand gestures captured by the webcam to perform some of the operations of a mouse.
For example, in VLC media player it controls the volume. In a PDF document the cursor can be moved up, down, left and right with the movement of the fingers. In PowerPoint presentations the slideshow can be controlled by hand gestures. The camera is positioned so that the movement of the fingers can be recognized using the convexity defects algorithm. The application performs mouse operations, but it also aims to be cheaper than other software so that everyone can afford it. It provides an effective means of nonverbal communication between the human and the computer.
The system can be used to control the mouse in presentations, in a media player and in PDF documents.

1.2 Project Vision

Hand gesture recognition technology is one of the methods used in sign language for nonverbal communication. It is most commonly used by deaf people and people with speech impairments to communicate among themselves, and other people can also use it easily.
A laptop keypad is often used instead of a mouse; in this project a new technique is introduced which performs some of the operations of a mouse by using hand gestures. It can replace some keys, e.g. the up and down buttons of VLC media player, the PowerPoint slideshow controls and the movement of the cursor in a PDF document. A lot of work has been done in gesture recognition technology and different methods have been used to recognize gestures. In this project the convexity defects algorithm is selected to drive the operations.
In previous methods the gestures were first classified and stored in a database, and feature matching was then performed, which took considerable time. This project has the benefit that it takes microseconds to capture the hand gesture and detects it more easily.

Chapter 2 LITERATURE REVIEW

Many applications controlled by hand gestures have been developed, such as gaming, sign language recognition, mouse control, slide sharing and media players. Several researchers have introduced methods for this purpose, but most of them are not good enough for real-time implementation. Some of them are discussed here.
Ruize Xu, Shengli Zhou and Wen J. Li developed an approach in 2011 [1] which recognizes seven hand gestures such as up, down, right, left, cross and circle. For this purpose three different modules were built to recognize the hand gestures.

Kuan-Ching Li, Hwei-Jen Lin, Sheng-Yu Peng and Kanoksak Wattanachote presented an approach in 2011 [2] in which hand movements were used to store information from the internet in a convenient way.

Ginu Thomas presented an article in 2012 [3] analysing various hand gesture recognition methods, in which he compared the results obtained by the different methods.

N. Krishna Chaitanya and R. Janardhan Rao presented "Controlling of windows media player application using hand gesture recognition" in 2014 [4]. This system uses various hand gestures as input to operate the Windows Media Player application. It supports only Windows Media Player and not other applications such as Adobe Photoshop or VLC.

Viraj Shinde, Tushar Bacchav, Jitendra Pawar and Mangesh Sanap developed "Hand Gesture Recognition System Using Camera" in 2014 [5]. They focused on using pointing behaviours as a natural interface. To classify dynamic hand gestures, they developed a simple and fast method based on motion history images. It is applicable only to PowerPoint presentations.
Swapnil D. Badgujar proposed the system "Hand Gesture Recognition System" in 2014 [6], which recognizes unknown input gestures by using hand tracking and extraction methods. This system recognizes a single gesture. It assumes a stationary background so that the system has a smaller search region for tracking.
This system only controls the mouse with a finger shown to the webcam.

Mahmoud Elmezain, Ayoub Al-Hamadi, Jorg Appenrodt and Bernd Michaelis [7] proposed a hand gesture recognition system in 2008 that recognizes both isolated and continuous gestures for Arabic numbers (0-9) in real time from stereo colour image sequences, using the motion trajectory of a single hand and a hidden Markov model (HMM). The purpose was to improve gesture recognition in natural conversation. This requires techniques for skin segmentation and for handling occlusion between hands and face, in order to overcome the difficulties of overlapping regions.

Anupam Agrawal and Siddharth Swarup Rautaray proposed the system "A Vision based Hand Gestures Interface for Operating VLC Media Player Application" in 2010 [8], in which the K-nearest neighbour algorithm is used to recognize the various gestures. The VLC media player features operated by hand gestures include play, pause, full screen, stop, increase volume and decrease volume.

Chong Wang proposed the system "Super pixel-Based Hand Gesture Recognition with Kinect Depth Camera" in 2015 [9], which uses the Kinect depth camera.
It is based on a compact representation in the form of superpixels, which efficiently capture the shape, texture and depth features of the gestures. This system was not widely adopted because the Kinect depth camera it relies on is expensive.
Ram Rajesh J., Sudharshan R., Nagarjunan D. and Aarthi R. developed the system "Remotely controlled PowerPoint presentation navigation using hand gestures" in 2012 [10], in which the slides of a PowerPoint presentation are controlled by gestures. In this system the developers used a segmentation algorithm for hand detection. Researchers have improved these systems over time, but every system still has some defects. The method of Ram Rajesh et al. for controlling PowerPoint has a defect in capturing gestures: if the fingers are not stretched properly while making a gesture, the application does not work properly.
Pandit et al. [11] developed a hand gesture recognition system in 2011. This system requires data gloves with markers from which the hand sample can be extracted.

Chapter 3 SYSTEM REQUIREMENTS AND DESIGN

3.1.1.1 Software Requirements

This section describes the specific functional and non-functional requirements of the software, i.e. conceptual models, project limitations, use case diagrams and use case descriptions related to the hand gesture recognition system.
These statements follow the same format as given in the "Software Requirements and Specifications" document.

3.1.1.2 Functional Requirements (FR)

The functional requirements section discusses the principal technical functionalities and specifications that the system can and should incorporate.

REQ-1: Skin Detection Module
This software is capable of detecting skin colour and filtering out all objects that lack skin colour. Human skin detection deals with identifying skin-coloured pixels and regions in a given image.
Skin colour is regularly used in human skin detection because it is invariant to size and orientation and is quick to process. The three colour models commonly used for identifying a skin pixel are YCbCr (luminance, chrominance), HSV (hue, saturation, value) and RGB (red, green, blue). The skin detection module uses YCbCr because it is among the colour spaces that separate colour from intensity much more effectively than RGB or HSV.

REQ-2: Filtered Object Detection
Once a gesture has been filtered by removing the unnecessary parts of the picture with the skin detection module, the software can read and sense the "clusters" of skin-coloured objects in the picture, also called "blobs".

REQ-3: Hand Detection
This product uses a hand detection system to pick out hand motions from the video capturing device, i.e. the webcam. By applying this hand detection step, the system can sense the hand and recognize the hand gestures. Hand detection is done by the contour extraction method. Contours play an important role in image processing for object detection and recognition. The contour is drawn around the white blob, which is obtained by segmenting the grayscale image with a threshold value applied to the input image. That contour is then passed on for further processing as the contour of the hand. This is how the product detects the hand.
REQ-4: Mouse Movement Gesture Control Mode
The blobs are read and recognized, and the software uses the blob to position the mouse cursor. The cursor must follow the movement of the user on the screen promptly.

3.1.1.3 Software Constraints

Although the product uses different strategies for filtering and hand detection, the behaviour of the system varies under changing ambient light.
It cannot guarantee correct execution under a very bright light source, such as direct bright light or daylight coming through windows. These factors affect the computation because the light alters the skin colour required for detection, which makes the skin detector too imprecise for accurate detection. Even in consistent lighting conditions the system might fail, depending on the colour of the end user's hand. If the end user's hand is dark in colour, the system may not be able to separate the hand from a dark background.
Sometimes the background pixels close to the hand may also be included in the white blob. In addition, if the background is not uniformly dark, some lighter areas of the background may merge with the hand at certain threshold values and still produce a single white blob. In that case the condition that only one white blob is present is satisfied, yet the blob consists of the hand together with the lighter background areas connected to it.

3.1.1.4 Non-Functional Requirements

Non-functional requirements specify the criteria for the operation and the design of the system.

Efficiency in Computation
The system shall minimize the use of CPU and memory resources on the operating system. When HGR (Hand Gesture Recognition) is executing, the software shall use less than 80% of the system's CPU resources and less than 100 megabytes of system memory.

Extensibility
The software shall be extensible to support future developments and add-ons to the HGR software. The gesture control module of HGR shall be at least 50% extensible to allow new gesture recognition features to be added to the system.
Portability
The HGR software shall be 100% portable to all operating platforms. Therefore, this software should not depend on a particular operating system.

Performance
Communication between the end user and the system shall be quick enough for effective performance. This is possible only if the number of computations needed for image processing and hand recognition is kept small. Each captured video frame shall be processed within 350 milliseconds, which corresponds to roughly 3 frames per second. Detection and recognition shall take no longer than 34 microseconds per frame, otherwise the system will not be efficient enough. The time taken to recognize a gesture shall not exceed this predetermined range.
Reliability
The HGR software shall be operable in all lighting conditions. Regardless of the brightness level in the user's operating environment, the program shall dependably recognize the user's hands. The same threshold value shall not be used every time, as it may prove unreliable under varying lighting conditions. Dynamic colour-based thresholding should be used instead, as it gives more precise segmentation of the hand, provided the intensity of the hand does not change during use.

Usability
This software shall be easy to use for all users with minimal instructions. All of the language used in the graphical user interface (GUI) shall be intuitive and understandable by non-technical users.

3.2 Software Essentials
• 64-bit operating system, Windows 8 or 10 and above
• OpenCV 2.4.9
• Windows frameworks with Windows administrator permissions, which are required for the proper functioning of a few parts of the program

3.3 Hardware Requirements
• A webcam is necessary

3.4 Environment Specification
• A clear background gives better results
• There must not be objects (especially skin-coloured objects) in front of the webcam other than the palm

3.5 System Design

3.5.1 System Architecture
Fig 1: System Architecture

3.5.2 Sequence Diagram
Fig 2: Sequence Diagram 1
Fig 3: Sequence Diagram 2
Fig 4: Sequence Diagram 3

3.5.3 Activity Diagram
Fig 5: Activity Diagram

3.5.4 Use Case Diagram
Fig 6: System's Use Case Diagram

Use Case Descriptions

2.3.1 Use Case:

Use_Case 1: Start Webcam
Primary Actor: The user
Goal Level: To turn the camera on
Success End Condition: The user successfully turns the camera on
Precondition: None
Trigger: The user points at and clicks the start button

Use_Case 2: Capture Image
Actor: End user and camera
Goal Level: To give a gesture and to take a picture of the end user
Overview: The camera grabs a picture of the user, which is processed to extract useful information

Use_Case 3: Image Processing
Actor: Camera and system
Goal Level: To process the image captured by the webcam

Use_Case 4: Blob Detection & Recognition
Actor: System
Goal Level: To detect a blob and recognize gestures

Use_Case 5: Display Result
Goal Level: To perform the actions required by the user

3.5.5 Class Diagram
Fig 7: Class Diagram

Chapter 4 METHODOLOGY AND IMPLEMENTATION

4.1 Methodology:
In this project a webcam is used to capture the hand gestures. These gestures are then filtered and several important tasks are performed, which are discussed in the following sections. The main steps of the system are:
• Skin detection
• Contour extraction
• Gesture recognition and finger counting
These steps are described below.

4.1.2 Skin Detection:
In this project the OpenCV computer vision library and Emgu CV have been used.
OpenCV (Open Source Computer Vision) is the library used for image processing, and Emgu CV is a .NET wrapper for OpenCV. Some of the libraries of AForge.NET have also been used. The initial step is to capture the gesture of the hand through the webcam. After the image is captured, the skin colour is detected.
How is the skin colour of the user detected? The skin is detected in the YCrCb colour space. YCrCb is a family of colour spaces used in video and digital photography systems, and it is used here because of its advantages of high phosphor emission characteristics and a high signal-to-noise ratio. In this colour space Y is the luminance, Cr is the red-difference chroma component and Cb is the blue-difference chroma component. We apply the YCrCb conversion, in which each component has minimum and maximum values of 0 and 255.
The GUI has sliders through which the minimum and maximum YCrCb values can be lowered or raised. For example, if a person's skin colour is light, it is detected as long as it falls within the selected part of the 0-255 range. If the skin colour does not fall within the selected range, it is rejected. The benefit of YCrCb is that almost all skin colours lie within a compact range of values.
The function used in this project for skin colour detection works on YCrCb values, and skin colours have their own YCrCb colour codes. Through these codes the skin colour is detected. By default the YCrCb values are set to zero.
First the skin detector function is applied and the YCrCb minimum and maximum values are adjusted. This is how the values are obtained. We use the in-range function, which keeps the pixels whose YCrCb values lie between the chosen minimum and maximum within 0-255.
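The thresholding described above can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the project's Emgu CV code. The conversion uses the standard 8-bit RGB to YCrCb formula. The names `rgb_to_ycrcb`, `skin_mask`, `LOWER` and `UPPER` are hypothetical, and the bounds are an assumed, commonly quoted skin range standing in for the values that the user would set with the GUI sliders.

```python
# Illustrative bounds only (assumed, not taken from the project): a commonly
# quoted skin range of Cr 133-173 and Cb 77-127, with Y left unrestricted.
LOWER = (0, 133, 77)     # (Y, Cr, Cb) minimum
UPPER = (255, 173, 127)  # (Y, Cr, Cb) maximum


def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb); chroma is offset by 128."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128
    # Round and clamp every component to the 0-255 range.
    return tuple(max(0, min(255, int(round(v)))) for v in (y, cr, cb))


def in_range(pixel, lower, upper):
    """True when every component lies between its minimum and maximum."""
    return all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))


def skin_mask(image, lower=LOWER, upper=UPPER):
    """Return a binary mask (255 = skin, 0 = background) for an RGB image
    given as a list of rows of (r, g, b) tuples."""
    return [
        [255 if in_range(rgb_to_ycrcb(*px), lower, upper) else 0 for px in row]
        for row in image
    ]


# A skin-like tone next to pure blue, white and black pixels.
example = [[(224, 172, 138), (0, 0, 255)],
           [(255, 255, 255), (0, 0, 0)]]
example_mask = skin_mask(example)
```

In this sketch only the skin-like pixel survives the test, because blue, white and black all have a Cr value below the assumed minimum of 133.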
Within the selected range all the skin colour codes lie, and so the skin colour of the hand is detected. Then the erosion function (CvInvoke.cvErode) is applied, which removes noise from the image; it can be applied more than once.
Then the dilation function of CvInvoke is applied. If noise such as small gaps remains after erosion, dilation removes it and restores the detected skin region. After the skin colour has been detected, the biggest blob of the hand is found.
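As an illustration of the clean-up and blob selection steps, the following sketch applies one erosion and one dilation with a 3x3 window and then keeps the largest connected region. It is a plain Python approximation under stated assumptions, not the Emgu CV calls used in the project: the helper names `erode`, `dilate` and `largest_blob` are hypothetical, pixels outside the image are treated as background, and blobs are taken to be 8-connected.

```python
from collections import deque

NEIGHBOURS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                (0, 1), (1, -1), (1, 0), (1, 1)]


def _morph(mask, need_all):
    """Apply a 3x3 erosion (need_all=True) or dilation (need_all=False)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[y + dy][x + dx]
                if 0 <= y + dy < h and 0 <= x + dx < w else 0
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ]
            hit = all(window) if need_all else any(window)
            out[y][x] = 255 if hit else 0
    return out


def erode(mask):
    return _morph(mask, need_all=True)


def dilate(mask):
    return _morph(mask, need_all=False)


def largest_blob(mask):
    """Keep only the largest 8-connected foreground region of the mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue, blob = deque([(sy, sx)]), []
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                blob.append((y, x))
                for dy, dx in NEIGHBOURS_8:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 255
    return out


# A 4x4 "hand" block plus one isolated noise pixel.
noisy = [[0] * 9 for _ in range(7)]
for row in range(1, 5):
    for col in range(1, 5):
        noisy[row][col] = 255
noisy[5][7] = 255

cleaned = dilate(erode(noisy))   # erosion removes the speck, dilation restores the block
hand_only = largest_blob(noisy)  # the biggest blob is the 4x4 block
```

Erosion followed by dilation removes specks smaller than the window while leaving larger regions at about their original size. The largest-blob step then discards any remaining skin-coloured regions, such as a face, that are smaller than the hand.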
Although there may be many objects in the surroundings while the image is captured, e.g. the user's face, only the biggest blob is detected. The following code is used to extract the biggest blob.
[Listing: code for extracting the biggest blob]

4.1.3 Contour extraction:
Contours play an important role in image processing for object detection and recognition. A contour is the boundary of an object in an image. Here the object is the user's hand. In this project contours are used to detect the hand and separate it from the background. Contours are curves that connect continuous points of a digital image that have the same colour. A contour acts as an outline of the object.
The biggest blob of the hand lies within this contour. Contour extraction is important because the contour is used to find the convex hull and the convexity defects, which are discussed in the following section. The contour of the hand gesture is shown below.
[Figure: contour of the hand gesture]
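Separately from the project's own figure, the idea of contour extraction can be sketched with a simple boundary follower. The code below is a minimal Moore-neighbour tracing routine in plain Python, not the contour function of Emgu CV. The name `trace_contour` is hypothetical, and the sketch assumes a single blob without one-pixel-wide parts, since it stops as soon as the start pixel is reached again.

```python
# Clockwise ring of 8-neighbour offsets (dx, dy), starting at west, with y
# growing downwards as in image coordinates.
RING = [(-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1)]


def trace_contour(mask):
    """Return the outer boundary of the first blob as an ordered list of
    (x, y) pixels, walking clockwise from the top-left boundary pixel."""
    h, w = len(mask), len(mask[0])

    def is_fg(x, y):
        return 0 <= x < w and 0 <= y < h and mask[y][x] != 0

    # First foreground pixel in raster order: its west neighbour is background.
    start = next(((x, y) for y in range(h) for x in range(w) if mask[y][x]), None)
    if start is None:
        return []

    contour, current, back = [start], start, 0  # back = ring index of a known background neighbour
    for _ in range(4 * h * w):                  # safety bound on the walk
        for k in range(1, 8):
            d = (back + k) % 8
            nxt = (current[0] + RING[d][0], current[1] + RING[d][1])
            if is_fg(*nxt):
                # The neighbour examined just before nxt is background; express
                # its position relative to nxt for the next search.
                prev = RING[(d - 1) % 8]
                back = RING.index((prev[0] - RING[d][0], prev[1] - RING[d][1]))
                current = nxt
                break
        else:
            break                               # isolated single pixel
        if current == start:
            break                               # boundary closed
        contour.append(current)
    return contour


# A filled 4x4 square inside a 6x6 image.
square = [[255 if 1 <= x <= 4 and 1 <= y <= 4 else 0 for x in range(6)]
          for y in range(6)]
square_contour = trace_contour(square)
```

For the square the walk visits the twelve outline pixels in order and never enters the four interior pixels, which is the ordered outline that the later steps work on.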
The contour pixels are numbered sequentially. Suppose there are N points in the contour; the ith contour pixel is then (x_i, y_i), i = 1, 2, 3, ..., N, and the local contour sequence can be computed from these points.
The ith sample of the local contour sequence, h(i), is the Euclidean distance between the ith contour pixel and the chord connecting the two end points of a window of size w centred on that pixel, i.e. the contour pixels with indices i-(w-1)/2 and i+(w-1)/2:
h(i) = |u_i / v_i|
This is the local contour sequence (LCS), computed for the N points of the contour. The array of samples is denoted H and can be written as:
H = [h(1), h(2), ..., h(N)]
Why do we need the local contour sequence? It is computed because it has several advantages. It does not depend on the complexity of the image. It is suitable for images with concave and convex contours. Sometimes the user's hand is not parallel to the camera, but the partial gesture that is obtained is not affected.
It remains usable because the LCS can represent a partial contour. As the window size w increases, the amplitude of the local contour sequence also increases. There is a direct relation between the two.
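The computation of h(i) can be made concrete with a short sketch. It assumes that u_i and v_i are the numerator and denominator of the standard point-to-line distance formula, which matches the description of h(i) as the distance from the pixel to the chord. It also assumes a closed contour, so that window indices wrap around. The function name `local_contour_sequence` is hypothetical.

```python
import math


def local_contour_sequence(contour, w):
    """Return [h(1), ..., h(N)] for an ordered, closed contour of (x, y)
    points, using an odd window size w centred on each pixel."""
    if w < 3 or w % 2 == 0:
        raise ValueError("window size w must be an odd number >= 3")
    n = len(contour)
    half = (w - 1) // 2
    samples = []
    for i in range(n):
        x, y = contour[i]
        x1, y1 = contour[(i - half) % n]   # window start, wrapping around
        x2, y2 = contour[(i + half) % n]   # window end, wrapping around
        # u: numerator of the distance from (x, y) to the line through the
        # two window end points; v: length of the chord between them.
        u = x * (y1 - y2) + y * (x2 - x1) + (y2 * x1 - y1 * x2)
        v = math.hypot(x1 - x2, y1 - y2)
        samples.append(abs(u) / v if v else 0.0)
    return samples


# Outline of a 4x4 square, listed clockwise from its top-left corner.
outline = [(1, 1), (2, 1), (3, 1), (4, 1), (4, 2), (4, 3),
           (4, 4), (3, 4), (2, 4), (1, 4), (1, 3), (1, 2)]
lcs_w3 = local_contour_sequence(outline, 3)
lcs_w5 = local_contour_sequence(outline, 5)
```

On this outline the samples are zero along the straight edges and peak at the corners, and the corner value grows from about 0.71 with w = 3 to about 1.41 with w = 5, which agrees with the statement that the amplitude increases with the window size.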
For a fixed contour noise level, the signal-to-noise ratio can be increased by increasing the amplitude, which makes the sequence more robust to contour noise. The local contour sequence value for any pixel is calculated as the perpendicular distance from that pixel to the chord connecting the end points of the window of size w.

4.1.4 Gesture recognition and finger counting:
The contour is extracted as discussed above. After the contour has been extracted, the convexity defects are calculated. A convexity defect is an area that does not belong to the object but is located inside its outer boundary.
It is the difference between the convex hull and the contour. With the help of the convexity defects the finger count is computed, and the operations are performed according to the finger count. The given figure shows the convexity defects. In this figure p1 is the start contour point, p2 is the end contour point and p3 is the concave contour point. The depth of the convexity defect, d2, is the distance from the farthest contour point p3 to the line joining p1 and p2.
The white areas numbered 1-6 are all convexity defects. The convex hull plays an important role in computing the convexity defects. The convex hull can be defined as the convex polygon that encloses the vertices of the hand gesture contour, as shown in the figure given below. In that figure, the red curve is the convex hull of the hand gesture, and part (b) shows the hull extracted from part (a).
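The relationship between the contour, the convex hull and the convexity defects can be sketched as follows. The code is plain Python, not the Emgu CV routines used in the project. The hull is built with Andrew's monotone chain algorithm. The names `convex_hull_indices`, `convexity_defects` and `count_fingers` are hypothetical. The rule "fingers = deep defects + 1" and the `min_depth` threshold are assumed heuristics, since the exact counting rule is not stated in this report.

```python
import math


def cross(o, a, b):
    """Z component of the cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_indices(points):
    """Indices of the hull vertices of distinct points (monotone chain)."""
    order = sorted(range(len(points)), key=lambda i: points[i])

    def half_hull(sequence):
        chain = []
        for i in sequence:
            while (len(chain) >= 2 and
                   cross(points[chain[-2]], points[chain[-1]], points[i]) <= 0):
                chain.pop()
            chain.append(i)
        return chain

    lower = half_hull(order)
    upper = half_hull(reversed(order))
    return lower[:-1] + upper[:-1]


def convexity_defects(contour, min_depth):
    """For each hull edge p1-p2, find the contour point p3 lying between them
    that is farthest from the edge; keep it when the depth >= min_depth.
    Returns tuples (start index, end index, farthest index, depth)."""
    hull = sorted(convex_hull_indices(contour))
    if len(hull) < 3:
        return []
    n, defects = len(contour), []
    for k, start in enumerate(hull):
        end = hull[(k + 1) % len(hull)]
        p1, p2 = contour[start], contour[end]
        chord = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
        best_depth, best_index = 0.0, None
        i = (start + 1) % n
        while i != end:                       # contour points strictly between
            depth = abs(cross(p1, p2, contour[i])) / chord
            if depth > best_depth:
                best_depth, best_index = depth, i
            i = (i + 1) % n
        if best_index is not None and best_depth >= min_depth:
            defects.append((start, end, best_index, best_depth))
    return defects


def count_fingers(contour, min_depth=2.0):
    """Assumed heuristic: each deep defect is a valley between two fingers."""
    defects = convexity_defects(contour, min_depth)
    return len(defects) + 1 if defects else 0


# A schematic three-finger outline: tips at indices 2, 4, 6 and valleys at 3, 5.
hand = [(0, 0), (10, 0), (9, 10), (7, 4), (5, 12), (3, 4), (1, 10)]
hand_defects = convexity_defects(hand, min_depth=2.0)
fingers = count_fingers(hand)
```

In this schematic outline the two valleys between the three finger tips are the only contour points that lie away from the hull, so two defects are found and the assumed rule yields a count of three.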
Convexity defects are used to count the fingers, and the finger count is used to perform various operations in VLC, PDF documents and PowerPoint presentations. For example, a finger count of four is used to increase the volume of VLC media player.

Chapter 5 TESTING

GUI OF HAND GESTURE RECOGNITION SYSTEM
Fig 1.1: Apps
Fig 1.2: Setting
Fig 1.3: Launch

Chapter 6 CONCLUSION AND RECOMMENDATIONS

Conclusion and recommendations
A new technique has been proposed to increase the flexibility of the hand gesture recognition system.
We have implemented a real-time version using a simple camera. In this project we have not used any special hardware beyond a video camera input. The system is able to control mouse tasks by capturing and detecting the user's hand. The project works well under different conditions, but it has the constraint that in direct sunlight and in very bright environments the gestures cannot be recognized properly.
This system performs some of the functions of a mouse, e.g. the volume of VLC player can be increased or decreased by hand gestures. In a PDF document the cursor can be moved up, down, left and right, and in PowerPoint presentations the slideshow can be controlled. In future, the system can be improved so that it replaces more of the physical mouse functions with hand gestures.