Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language

ABSTRACT:

NLP (Natural Language Processing) is a technology that enables computers to understand human languages. Deep grammatical and semantic analysis usually takes words as the basic unit, so word segmentation is usually the first task in NLP. In a multi-modal environment, the structure of the data differs greatly between modalities, and traditional machine learning methods cannot be applied directly. To address this problem, this paper introduces the feature extraction method of deep learning and applies the ideas of deep learning to multi-modal feature extraction. It proposes a multi-modal neural network in which each modality has its own multilayer sub-network with an independent structure, used to convert the features of the different modalities into same-modality features. For word segmentation, existing methods can hardly capture long-distance dependencies in text semantics and have long training and prediction times, so a hybrid-network English word segmentation method is proposed. This method applies BI-GRU (Bidirectional Gated Recurrent Unit) to English word segmentation and uses the CRF (Conditional Random Field) model to label sentences as sequences, which handles the long-distance dependency of text semantics and shortens network training and prediction time. Experiments show that the segmentation quality of this method is similar to that of the BI-LSTM-CRF (Bidirectional Long Short-Term Memory-Conditional Random Field) model, but its average prediction speed is 1.94 times that of BI-LSTM-CRF, which improves the efficiency of word segmentation.
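The CRF labelling step described in the abstract can be illustrated with Viterbi decoding, which picks the highest-scoring tag sequence given per-token scores and tag-to-tag transition scores. This is a minimal sketch, not the paper's implementation: the two-tag scheme ("B" for the first token of a unit, "I" for a continuation), the function name viterbi_decode, and all scores are assumptions, and in the described method the per-token scores would come from the BI-GRU layer.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the best tag path.

    emissions:   list of dicts {tag: score}, one dict per token
                 (assumed to be produced by an upstream network such as a BI-GRU).
    transitions: dict {(previous_tag, current_tag): score}.
    """
    # best[t][tag] is the best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag] is the previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token input.
TAGS = ["B", "I"]
EMISSIONS = [
    {"B": 2.0, "I": 0.0},
    {"B": 0.5, "I": 0.4},
    {"B": 0.0, "I": 1.0},
]
TRANSITIONS = {
    ("B", "B"): -1.0, ("B", "I"): 0.5,
    ("I", "B"): 0.0, ("I", "I"): 0.2,
}

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))  # ['B', 'I', 'I']
```

Picking the best tag per token from the emission scores alone would give B, B, I here; the transition scores change the middle tag, which is the reason for decoding the sentence as a whole sequence.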

EXISTING SYSTEM:

When processing English text, words are the most basic unit. Research on semantics, grammar and related topics generally takes words as the smallest unit, so words are the central problem of text information mining. With the help of a computer, the words in the text are segmented one by one so that the content can be analyzed. Automatic word segmentation is therefore the prerequisite for research and analysis such as automatic information retrieval, automatic information extraction, and natural language understanding. Since there are spaces between English words, the basic method of word segmentation in English text is introduce
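Because English words are separated by spaces, a baseline segmenter can be written in a few lines. The following is only a sketch of such a baseline, not the method referred to in the text above: the function name segment and the choice to split punctuation into separate tokens while keeping apostrophe contractions together are assumptions.

```python
import re

# A word is a run of letters or digits, optionally followed by an apostrophe
# suffix (as in "don't"); any other non-space character becomes its own token.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[^\sA-Za-z0-9]")


def segment(text):
    """Split English text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(segment("Don't stop, please."))  # ["Don't", 'stop', ',', 'please', '.']
```

A rule of this kind does not handle cases such as hyphenated compounds or abbreviations.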

EXISTING SYSTEM DISADVANTAGES:

1. LESS ACCURACY

2. LOW EFFICIENCY

PROPOSED SYSTEM:

As shown in Figure 1, the overall framework of the proposed model has a tree structure. It is divided into two main parts: the root network and the upper-layer network. The BP (Back Propagation) algorithm is used to train the parameters of the root network. The loss function defined in the auxiliary bridge layer is shared by all sub-networks and is used to adjust every sub-network simultaneously.
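The idea of one loss adjusting several sub-networks at once can be sketched with two linear sub-networks that map inputs of different sizes into a common space. Everything specific here is an assumption made for illustration: the linear layers, the squared-distance loss standing in for the bridge-layer loss, the names project, bridge_loss and train_step, and the sample numbers. The source does not specify the actual loss or layer types.

```python
def project(weights, features):
    """Linear sub-network: map one modality's feature vector into the shared space."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def bridge_loss(a, b):
    """Assumed shared loss: squared distance between the two projections."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_step(w_a, x_a, w_b, x_b, lr=0.05):
    """One gradient step in which the same loss updates both sub-networks."""
    z_a, z_b = project(w_a, x_a), project(w_b, x_b)
    diff = [ai - bi for ai, bi in zip(z_a, z_b)]
    # dL/dW_a[i][j] = 2 * diff[i] * x_a[j]  and  dL/dW_b[i][j] = -2 * diff[i] * x_b[j]
    new_a = [[w - lr * 2 * d * x for w, x in zip(row, x_a)] for row, d in zip(w_a, diff)]
    new_b = [[w + lr * 2 * d * x for w, x in zip(row, x_b)] for row, d in zip(w_b, diff)]
    return new_a, new_b, bridge_loss(z_a, z_b)


# Hypothetical inputs: a 3-dimensional feature from one modality and a
# 2-dimensional feature from another, both projected into a 2-dimensional space.
x_a = [1.0, 0.5, -0.5]
x_b = [0.2, 0.8]
w_a = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2]]
w_b = [[0.5, 0.5], [-0.3, 0.4]]

w_a, w_b, loss_before = train_step(w_a, x_a, w_b, x_b)
w_a, w_b, loss_after = train_step(w_a, x_a, w_b, x_b)
print(loss_after < loss_before)  # True
```

A distance-only loss can be minimized trivially by shrinking both projections toward zero, so a practical model would combine it with a task loss or another constraint.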

PROPOSED SYSTEM ADVANTAGES:

1. HIGH ACCURACY

2. HIGH EFFICIENCY

SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
• Programming Language : Python
• Front End Technologies : Tkinter/Web (HTML, CSS, JS)
• IDE : Jupyter/Spyder/VS Code
• Operating System : Windows 8/10

HARDWARE REQUIREMENTS:

• Processor : Core i3
• RAM Capacity : 2 GB
• Hard Disk : 250 GB
• Monitor : 15″ Color
• Mouse : 2 or 3 Button Mouse
• Key Board : Windows 08/10
