ABSTRACT :
Vision impairment or blindness is one of the top ten disabilities in humans, and unfortunately, India is home to the world's largest visually impaired population. In this study, we present a novel framework to assist the visually impaired in object detection and recognition, so that they can navigate independently and be aware of their surroundings. The paper employs transfer learning on the Single-Shot Detection (SSD) mechanism for object detection and classification, followed by recognition of human faces and currency notes, if detected, using the Inception v3 model. The SSD detector is trained on a modified PASCAL VOC 2007 dataset, to which a new class is added to enable the detection of currency as well. Furthermore, separate Inception v3 models are trained to recognize human faces and currency notes, making the framework scalable and adaptable to user preferences. Ultimately, the output from the framework can be presented to the visually impaired person in audio format. The mean Average Precision (mAP) score of the standalone SSD detector on the added currency class was 67.8 percent, and the testing accuracies of the Inception v3 models for person and currency recognition were 92.5 and 90.2 percent, respectively.
EXISTING SYSTEM :
Visually impaired people face many difficulties in their lives. Statistics published by the World Health Organization (WHO) in 2019 reveal that, globally, around 2.2 billion individuals are affected by vision impairment. Detecting and recognizing common objects in the surroundings is a herculean task for visually impaired individuals. They rely either on other people, which makes them dependent, or on their senses of touch and smell to detect objects, which is highly inaccurate and can be hazardous in some cases. The white cane is the most popular blind navigation device. It was later improved by adding ultrasonic and IR sensors to detect obstacles in the vicinity of the visually impaired user and provide feedback in the form of vibration or sound. Though this approach aids the mobility of the visually impaired user, it provides little or no information about the surroundings. For the user to have a better understanding of the surroundings, object detection and classification, followed by recognition and audio feedback, is crucial.
PROPOSED SYSTEM :
The proposed system consists of four subparts, namely the camera for inputting the image into the framework, the object detection and classification module, the face and currency recognition modules, and finally the audio output to the visually impaired user, as shown in Fig. 1. In this work, we focus on the design of the framework, which involves object detection, classification and recognition. SSD has shown fast single-shot detection results across multiple categories [11]. The PASCAL VOC 2007 dataset [12] has 20 object classes, containing 9,963 images with 24,640 annotated objects. This dataset was modified by adding more images to the training, validation and testing sets, bringing the total to 21 classes, the new class being currency. The SSD model is trained on this modified PASCAL VOC 2007 dataset using transfer learning. The input image is resized to 300 by 300 pixels. The SSD300 model was initially loaded with weights from the VGG-16 model. The framework draws a bounding box around any of the 21 trained categories present in the input image. The Intersection over Union (IoU) method shown in Fig. 2 is used to evaluate detection performance. The name of any detected category other than human or currency is fed directly into the audio output. For the human and currency classes, one more stage is required, which is the recognition part.
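The IoU evaluation and the routing of detections described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the corner-coordinate box format (x_min, y_min, x_max, y_max) and the class labels "person" and "currency" are assumptions for the sake of the example.

```python
def iou(box_a, box_b):
    """Intersection over Union between two bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) in pixel coordinates
    (an assumed format, not taken from the paper).
    """
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Zero overlap if the boxes are disjoint.
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def route_detection(label):
    """Decide the next pipeline stage for a detected class.

    'person' and 'currency' are forwarded to their respective
    Inception v3 recognizers; every other class name goes straight
    to the audio output. Label strings are hypothetical.
    """
    if label in ("person", "currency"):
        return "recognize"
    return "speak"
```

For example, two identical boxes give an IoU of 1.0, while a 10x10 box overlapping a shifted copy by a 5x5 region gives 25/175 ≈ 0.143; a detection labelled "dog" is routed to "speak", whereas "person" is routed to "recognize".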
SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
• Programming Language : Python
• Front End Technologies : TKInter/Web (HTML, CSS, JS)
• IDE : Jupyter/Spyder/VS Code
• Operating System : Windows 8/10
HARDWARE REQUIREMENTS:
Processor : Intel Core i3
RAM Capacity : 2 GB
Hard Disk : 250 GB
Monitor : 15″ Color
Mouse : 2 or 3 Button Mouse
Key Board : Standard Keyboard