
BREAST CANCER DETECTION

ABSTRACT

Cancer is one of the most menacing and unpredictable diseases; if it is not detected at an early stage, it can endanger the patient's life. According to the Breast Cancer Institute (BCI), breast cancer is one of the most dangerous diseases affecting women worldwide. Machine learning techniques are most commonly used for detecting breast cancer. In this project we propose an adaptive ensemble voting technique for diagnosing breast cancer using the Wisconsin Breast Cancer database. The main objective of this work is to show how CNN, Logistic Regression, Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) algorithms provide a better solution when combined in an ensemble machine learning approach for predicting breast cancer. Compared to related work from the literature, the CNN approach achieves 94.12% accuracy, higher than the other machine learning algorithms.
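A minimal sketch of the voting-ensemble idea, using scikit-learn's built-in copy of the Wisconsin Breast Cancer dataset. This is not the project's exact pipeline; the base-learner settings and the train/test split are illustrative assumptions.

```python
# Hard-voting ensemble of Logistic Regression, SVM and KNN on the
# Wisconsin Breast Cancer dataset (illustrative settings only).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Each base learner is wrapped with feature scaling.
lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

# Hard voting: the predicted class is the majority vote of the base learners.
ensemble = VotingClassifier(
    estimators=[("lr", lr), ("svm", svm), ("knn", knn)], voting="hard")
ensemble.fit(X_train, y_train)
print("Ensemble accuracy:", ensemble.score(X_test, y_test))
```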

EXISTING SYSTEM

Generally there are two types of tumours: benign and malignant, where a benign tumour is non-cancerous and a malignant tumour is cancerous. Various methods and algorithms are available for detecting breast cancer, such as Support Vector Machine (SVM), Naïve Bayes, KNN and ANN. ANN is a deep learning technique that is generally used to predict both continuous and non-continuous data. Before the Artificial Neural Network (ANN) is applied, some pre-processing of the data is required in order to obtain good accuracy. First, feature selection using recursive feature elimination is carried out on the dataset and the top 16 features are selected; the ANN is then applied to this reduced feature set.
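A minimal sketch of this existing pipeline, assuming scikit-learn: recursive feature elimination (RFE) keeps the top 16 features and a small feed-forward ANN is trained on them. The choice of Logistic Regression as the RFE ranking estimator and the MLP layer sizes are assumptions for illustration, not the original settings.

```python
# RFE (top 16 features) followed by a small ANN, on the Wisconsin dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

pipeline = make_pipeline(
    StandardScaler(),                                          # pre-processing
    RFE(LogisticRegression(max_iter=1000),
        n_features_to_select=16),                              # keep top 16 features
    MLPClassifier(hidden_layer_sizes=(32, 16),
                  max_iter=500, random_state=42),              # feed-forward ANN
)
pipeline.fit(X_train, y_train)
print("ANN (with RFE) accuracy:", pipeline.score(X_test, y_test))
```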

DISADVANTAGES

  • Hardware dependence: Artificial neural networks require processors with parallel processing power, in accordance with their structure. Their realization is therefore dependent on suitable hardware.
  • Unexplained behaviour of the network: This is the most important problem of ANNs. When an ANN produces a solution, it gives no clue as to why or how that solution was reached. This reduces trust in the network.
  • Determination of proper network structure: There is no specific rule for determining the structure of artificial neural networks. An appropriate network structure is achieved through experience and trial and error.
  • Difficulty of showing the problem to the network: ANNs can only work with numerical information, so problems have to be translated into numerical values before being introduced to the ANN. The representation chosen here directly influences the performance of the network and depends on the user's ability.
  • Unknown training duration: Training is considered complete when the error on the sample set falls below a certain value, but reaching this value does not guarantee optimal results.

PROPOSED SYSTEM

The proposed system evaluates several algorithms and determines which one is best for predicting breast cancer. It uses Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Logistic Regression and a Convolutional Neural Network (CNN). CNN is a deep learning technique that processes images, learns their most useful features and can be used to predict categorical data. It is a powerful technique that can be applied in many domains.

Generally, neural networks consist of individual units called neurons, arranged in a series of groups called layers. Neurons in each layer are connected to neurons of the next layer, and data flows from the input layer to the output layer along these connections. Each node performs a simple mathematical calculation and then transmits its result to all the nodes it is connected to.

A convolutional neural network (CNN) is a special architecture of artificial neural network that borrows some features of the visual cortex. One of the most popular uses of this architecture is image classification; for example, Facebook uses CNNs for its automatic tagging algorithms. A computer sees an image as an array of pixels. For example, if the image size is 300 x 300, the size of the array will be 300 x 300 x 3, where 300 is the width, the next 300 is the height and 3 is the number of RGB channels. Each of these numbers takes a value from 0 to 255, describing the intensity of the pixel at that point. To classify the image, the computer looks for low-level characteristics. In human understanding such characteristics are, for example, a trunk or large ears; for the computer, they are boundaries or curvatures. Through groups of convolutional layers the computer then constructs more abstract concepts.

The convolution layer always comes first. The image (a matrix of pixel values) is fed into it, and reading of the input matrix begins at the top left of the image. The network selects a smaller matrix there, called a filter (also known as a kernel or neuron). The filter then performs the convolution: it moves along the input image, multiplying its values by the corresponding original pixel values. These multiplications are summed, yielding a single number. Since the filter has so far read only the upper left corner of the image, it moves one unit further to the right and repeats the operation. After the filter has passed over all positions, a matrix is obtained that is smaller than the input matrix.

The network consists of several convolutional layers interleaved with nonlinear and pooling layers. When the image passes through one convolution layer, the output of that layer becomes the input of the next, and this happens with every further convolutional layer. A nonlinear layer is added after each convolution operation; it applies an activation function, which introduces nonlinearity. Without this property the network would not be expressive enough to model the response variable (such as a class label). The pooling layer follows the nonlinear layer. It works on the width and height of the image and performs a down-sampling operation on them, reducing the image volume.
This means that if some features (for example boundaries) have already been identified by the previous convolution operation, a detailed image is no longer needed for further processing, and it is compressed into a less detailed representation. After a series of convolutional, nonlinear and pooling layers, it is necessary to attach a fully connected layer. This layer takes the output of the convolutional layers. Attaching a fully connected layer to the end of the network results in an N-dimensional vector, where N is the number of classes from which the model selects the desired class. A sketch of this layer stack is given below.
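A minimal Keras sketch of the layer stack described above: convolution, nonlinear activation and pooling repeated, then a fully connected layer ending in an N-dimensional class vector. The input size, filter counts and number of classes are illustrative assumptions, not the project's actual configuration.

```python
# Small CNN: conv -> ReLU -> pooling blocks, then fully connected layers.
import tensorflow as tf
from tensorflow.keras import layers, models

N_CLASSES = 2  # e.g. benign vs. malignant (assumption for illustration)

model = models.Sequential([
    layers.Input(shape=(300, 300, 3)),             # 300x300 RGB image
    layers.Conv2D(32, (3, 3), activation="relu"),  # convolution + nonlinearity
    layers.MaxPooling2D((2, 2)),                   # down-sampling (pooling)
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                              # flatten the feature maps
    layers.Dense(64, activation="relu"),           # fully connected layer
    layers.Dense(N_CLASSES, activation="softmax")  # one score per class
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```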

ADVANTAGES

  • The usage of CNNs is motivated by the fact that they can learn relevant features from an image or video at different levels, similar to the human brain. This is known as feature learning.
  • In terms of performance, CNNs outperform plain neural networks on conventional image recognition tasks and many other tasks.
  • For a completely new task or problem, CNNs are very good feature extractors. Useful attributes can be extracted from an already trained CNN with its trained weights by feeding the new data through each level and fine-tuning the CNN slightly for the specific task.
  • For example, a classifier can be added after the last layer with labels specific to the task. This is also called pre-training, and CNNs are very efficient at such tasks compared to plain neural networks (see the sketch below).
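As a rough illustration of the pre-training idea above, the following Keras sketch freezes an ImageNet-trained CNN and attaches a new classifier head with task-specific labels. MobileNetV2 and the 96 x 96 input size are assumptions chosen only for this example.

```python
# Reuse a pre-trained CNN as a fixed feature extractor and add a new classifier.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False,
    weights="imagenet", pooling="avg")
base.trainable = False                        # keep the learned features fixed

model = models.Sequential([
    base,                                     # reused convolutional features
    layers.Dense(2, activation="softmax")     # new head for 2 task-specific classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```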

SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
• Programming Language : Python
• Front End Technologies : TKInter / Web (HTML, CSS, JS)
• IDE : Jupyter / Spyder / VS Code
• Operating System : Windows 8/10

HARDWARE REQUIREMENTS:

• Processor : Intel Core i3
• RAM Capacity : 2 GB
• Hard Disk : 250 GB
• Monitor : 15″ Color Monitor
• Mouse : 2 or 3 Button Mouse
• Keyboard : Standard Keyboard
