ABSTRACT :
Vision impairment or blindness is one of the top ten disabilities in humans, and unfortunately, India is home to the world's largest visually impaired population. In this study, we present a novel framework to assist the visually impaired in object detection and recognition, so that they can navigate independently and be aware of their surroundings. The paper employs transfer learning on the Single-Shot Detection (SSD) mechanism for object detection and classification, followed by recognition of human faces and currency notes, if detected, using the Inception v3 model. The SSD detector is trained on a modified PASCAL VOC 2007 dataset, to which a new class is added to enable the detection of currency as well. Furthermore, separate Inception v3 models are trained to recognize human faces and currency notes, making the framework scalable and adaptable to user preferences. Ultimately, the output from the framework can be presented to the visually impaired person in audio format. The mean Average Precision (mAP) score of the standalone SSD detector on the added currency class was 67.8 percent, and the testing accuracies of the Inception v3 models for person and currency recognition were 92.5 and 90.2 percent, respectively.
EXISTING SYSTEM :
Visually impaired people face many difficulties in their lives. Statistics published by the World Health Organization (WHO) in 2019 reveal that, globally, around 2.2 billion individuals are affected by vision impairment. Detecting and recognizing common objects in the surroundings is a herculean task for visually impaired individuals. They rely either on other people, which makes them dependent, or on their senses of touch and smell to detect objects, which is highly inaccurate and can be hazardous in some cases. The white cane is the most popular blind navigation device. It was later improved by adding ultrasonic and IR sensors to detect obstacles in the vicinity of the visually impaired user and provide feedback in the form of vibration or sound. Though this approach aids the mobility of the visually impaired user, it provides little or no information about the surroundings. For the user to have a better understanding of the surroundings, object detection and classification, followed by recognition and audio feedback, is crucial.
PROPOSED SYSTEM :
The proposed system consists of four subparts, namely the camera for inputting the image into the framework, the object detection and classification module, the face and currency recognition modules, and finally the audio output to the visually impaired user, as shown in Fig. 1. In this work, we focus on the design of the framework, which involves object detection, classification and recognition. SSD has shown fast single-shot detection results across multiple categories [11]. The PASCAL VOC 2007 dataset [12] has 20 object classes, containing 9,963 images with 24,640 annotated objects. This dataset was modified by adding more images to the training, validation and testing sets, bringing the total to 21 classes, the new class being currency. The SSD model is trained on this modified PASCAL VOC 2007 dataset using transfer learning. The input image is resized to 300 by 300 pixels. The SSD300 model was initially loaded with weights from the VGG-16 model. The framework draws a bounding box around any of the 21 trained categories present in the input image. The Intersection over Union (IoU) method shown in Fig. 2 is used to evaluate detection performance. The name of any detected category other than human or currency is fed directly into the audio output. For the human and currency classes, one more stage is required, which is the recognition part.
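The IoU evaluation and the routing of detections described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the corner-coordinate box format (x_min, y_min, x_max, y_max) and the class labels "person" and "currency" are assumptions for the sake of the example.

```python
def iou(box_a, box_b):
    """Intersection over Union between two bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) in pixel coordinates
    (an assumed format, not taken from the paper).
    """
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Zero overlap if the boxes are disjoint.
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def route_detection(label):
    """Decide the next pipeline stage for a detected class.

    'person' and 'currency' are forwarded to their respective
    Inception v3 recognizers; every other class name goes straight
    to the audio output. Label strings are hypothetical.
    """
    if label in ("person", "currency"):
        return "recognize"
    return "speak"
```

For example, two identical boxes give an IoU of 1.0, while a 10x10 box overlapping a shifted copy by a 5x5 region gives 25/175 ≈ 0.143; a detection labelled "dog" is routed to "speak", whereas "person" is routed to "recognize".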
SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
• Programming Language : Python
• Front End Technologies : TKInter/Web (HTML, CSS, JS)
• IDE : Jupyter/Spyder/VS Code
• Operating System : Windows 8/10
HARDWARE REQUIREMENTS:
Processor : Intel Core i3
RAM Capacity : 2 GB
Hard Disk : 250 GB
Monitor : 15″ Color
Mouse : 2 or 3 Button Mouse
Key Board : Standard Keyboard