Electronics Component Identification from Voice

ABSTRACT:

This paper describes the architecture of a door phone embedded system with interactive voice response. Because speech technology is not 100% reliable, the emphasis was on the parts that have the greatest impact on overall performance: audio capture, speech recognition and verification, and power consumption. An embedded microphone array increases speech recognition effectiveness in very noisy environments. To improve speech recognition performance, a null grammar with confidence measure support was used. The speaker verification module was also optimized for noisy environments (using the cepstral mean normalization technique and a universal background model).
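As an illustration of the cepstral mean normalization step mentioned above, the following is a minimal sketch in plain Python, not the system's actual code. The function name and the toy feature values are assumptions made for the example. It shows that subtracting the per-coefficient mean over an utterance removes a constant offset of the kind a fixed channel or microphone adds in the cepstral domain.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over the utterance from every frame.

    `frames` is a list of equal-length feature vectors, one per frame.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    means = [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Toy 2-coefficient "cepstra" (made-up values), plus the same frames
# with a constant offset added to every frame.
clean = [[1.0, -2.0], [3.0, 0.0], [5.0, 2.0]]
offset = [10.0, -4.0]
shifted = [[c + o for c, o in zip(frame, offset)] for frame in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_shifted = cepstral_mean_normalize(shifted)
```

After normalization the two feature sequences coincide, which is why the technique helps when the recording conditions differ between enrollment and verification.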

EXISTING SYSTEM :

A LAN switch with Power over Ethernet (PoE) function powers the door phone units. The PBX is controlled by the auto-attendant module, which conducts the dialog provided by the dialog manager, including answering rules and greetings. All voice-based greetings and invitations by the auto-attendant module are based on prerecorded speech. The voice-based user recognition procedure is also started by the PBX auto-attendant module, and the user is automatically guided through the identification and verification steps. The dialog's user-recognition procedure involves speaker identification and speaker verification modules, and their results are used by the auto-attendant module to take further actions in the dialog. The Session Initiation Protocol (SIP) was chosen for signaling and call setup, but other protocols such as H.323 could also be supported.
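The identification-then-verification flow described above can be sketched as follows. This is only an illustration: the function name, the action labels, and the stand-in `identify` and `verify` callables are hypothetical and do not come from the described system.

```python
def recognize_user(utterance, identify, verify):
    """Run speaker identification, then verification.

    Returns an (action, speaker_id) pair that a dialog controller could
    act on. The action labels are hypothetical.
    """
    speaker_id = identify(utterance)          # who does the caller claim to be?
    if speaker_id is None:
        return ("reject", None)               # no known speaker matched
    if verify(utterance, speaker_id):         # does the voice match the claim?
        return ("grant_access", speaker_id)
    return ("reject", speaker_id)


# Stand-in modules for the example (assumptions, not real recognizers).
known_users = {"alice-audio": "alice", "bob-audio": "bob"}
trusted_audio = {"alice-audio"}

result_ok = recognize_user(
    "alice-audio", known_users.get, lambda audio, sid: audio in trusted_audio
)
result_bad = recognize_user(
    "bob-audio", known_users.get, lambda audio, sid: audio in trusted_audio
)
result_unknown = recognize_user(
    "noise", known_users.get, lambda audio, sid: audio in trusted_audio
)
```

Keeping the two modules as separate callables mirrors the text, where the auto-attendant module only consumes their results and decides on the next dialog action.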

EXISTING SYSTEM DISADVANTAGES:

1. LESS ACCURACY

2. LOW EFFICIENCY

PROPOSED SYSTEM :

Here p(x|A) is the probability of hypothesis A and p(x|B) is the probability of hypothesis B. We implement a decision threshold, denoted Ф, to accept or reject the claimed speaker identity. For hypothesis B, the UBM is used to model a speaker other than the hypothesized speaker. In the GMM training process, the training data are converted from raw 16-bit/16 kHz speech to Mel-frequency cepstral coefficients (MFCCs), which are then used for acoustic modeling and adaptation. To enhance speaker verification accuracy, MFCC components are used together with their first-order derivatives, the energy derivative, and cepstral mean normalization (CMN). Individual speaker models are derived by maximum a posteriori (MAP) adaptation of the UBM using the particular speaker's speech data. Speech/non-speech segmentation is also used to limit the likelihood calculation to speech frames that carry speaker-specific information. This improves both speaker verification accuracy and calculation performance.
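The decision rule and the MAP adaptation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: it uses diagonal-covariance GMMs, adapts only the component means, uses a hypothetical relevance factor, and works on toy one-dimensional features in place of MFCC vectors. All function names and numeric values are made up for the example, and the frames are assumed to have already passed speech/non-speech segmentation.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """log p(x | model); gmm is a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def map_adapt_means(frames, ubm, relevance=16.0):
    """Mean-only MAP adaptation of a UBM towards one speaker's frames."""
    dim = len(ubm[0][1])
    counts = [0.0] * len(ubm)
    sums = [[0.0] * dim for _ in ubm]
    for x in frames:
        logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in ubm]
        peak = max(logs)
        post = [math.exp(value - peak) for value in logs]
        norm = sum(post)
        for i, p in enumerate(post):
            p /= norm  # posterior probability of component i for this frame
            counts[i] += p
            for d in range(dim):
                sums[i][d] += p * x[d]
    adapted = []
    for (w, mean, var), n, s in zip(ubm, counts, sums):
        # Equals alpha * E[x] + (1 - alpha) * mean with alpha = n / (n + relevance).
        new_mean = [(s[d] + relevance * mean[d]) / (n + relevance) for d in range(dim)]
        adapted.append((w, new_mean, var))
    return adapted


def average_llr(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio, speaker model vs. UBM."""
    total = sum(
        gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, ubm)
        for x in frames
    )
    return total / len(frames)


def verify(frames, speaker_gmm, ubm, threshold):
    """Accept the claimed identity when the average LLR reaches the threshold."""
    return average_llr(frames, speaker_gmm, ubm) >= threshold


# Toy data: a two-component UBM and enrollment frames clustered near 3.0.
ubm = [(0.5, [-1.0], [1.0]), (0.5, [1.0], [1.0])]
enrollment = [[2.8], [3.0], [3.2]] * 10
speaker_model = map_adapt_means(enrollment, ubm, relevance=4.0)

accepted = verify([[2.9], [3.1]], speaker_model, ubm, threshold=0.5)
rejected = verify([[-1.0], [-0.8]], speaker_model, ubm, threshold=0.5)
```

Here `threshold` plays the role of Ф. Raising it lowers the chance of accepting an impostor at the cost of rejecting more genuine speakers, so in practice it would be tuned on held-out data.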

PROPOSED SYSTEM ADVANTAGES:

1. HIGH ACCURACY

2. HIGH EFFICIENCY

SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
• Programming Language : Python
• Front End Technologies : Tkinter/Web (HTML, CSS, JS)
• IDE : Jupyter/Spyder/VS Code
• Operating System : Windows 8/10

HARDWARE REQUIREMENTS:

• Processor : Core i3
• RAM Capacity : 2 GB
• Hard Disk : 250 GB
• Monitor : 15″ Color
• Mouse : 2 or 3 Button Mouse
• Keyboard : Windows 8/10 compatible
