SuryaPrasad-Vuppalapati/proj-cancer_detection

proj-cancer_detection

Abstract

Cancer has been characterized as a heterogeneous disease made up of many different subtypes. Timely screening and treatment of a cancer type has become a necessity in cancer research because it supports the clinical management of patients. Many research teams have studied the application of machine learning (ML) and deep learning (DL) methods in biomedicine and bioinformatics to classify cancer patients into high- or low-risk categories. These techniques have therefore been used to model the development and treatment of cancer. The ability of ML tools to detect key features in complex datasets is also important. Several of these methods, including artificial neural networks (ANNs), support vector machines (SVMs), and decision trees (DTs), are widely used to build predictive models in cancer research. Although ML methods can improve our understanding of cancer progression, an adequate level of validation is needed before they can be considered for everyday clinical practice. This study reviews the ML and DL approaches used in modeling cancer progression. The predictive models discussed are mostly tied to specific supervised ML techniques, input features, and data samples.

Topology of machine learning & deep learning algorithms

Different deep learning and machine learning algorithms are used to predict various types of disease. Support vector machines (SVM), neural networks (NN), LR, naive Bayes (NB), fuzzy logic, transfer learning, ensemble learning, transductive learning, k-nearest neighbors (KNN), and AdaBoost are the most commonly used across the surveyed contributions. SVM is further categorized into Boosted SVM and MLSVM, which earlier contributions used to predict distinct diseases. Similarly, NN is classified into Dynamic Neural Network (DNN) and Convolutional Neural Network (CNN), both employed for diagnosing different diseases. GBDT is a modified form of the decision tree (DT), and CVIFLR is a modified form of LR; both are used for detecting diseases. Random forest (RF) and fuzzy logic are extended into HRFLM and Fuzzy SVM, respectively, to predict particular diseases. Lung cancer can therefore be predicted efficiently with the help of improved machine learning techniques.

Case study 1

Of the 79 papers surveyed in this study, relatively few (only 3) apply machine learning to predicting cancer risk or susceptibility. One of the more interesting papers develops a retrospective methodology for predicting the presence of 'spontaneous' breast cancer from single nucleotide polymorphism (SNP) profiles of steroid-metabolizing enzymes (CYP450). Spontaneous, or non-familial, breast cancer accounts for 90% of breast cancers (Dumitrescu and Cotarla, 2015). The study was based on the hypothesis that environmental toxins or hormones accumulate in breast tissue and that certain SNP combinations carry an increased risk of breast cancer. The authors obtained SNP data (98 SNPs from 45 cancer-associated genes) for 63 breast cancer patients and 74 cancer-free controls. Key to the success of the study was that the researchers used several methods to improve the sample-per-feature ratio and evaluated several machine learning methods to find the best classifier. In particular, the authors quickly reduced the starting set of 98 SNPs to just 2–3 SNPs that appeared to be the most informative. Instead of a sample-per-feature ratio of almost 3:2 (with all 98 SNPs), the ratio became 45:1 (for 3 SNPs) and 68:1 (for 2 SNPs). This made it possible to avoid the "curse of dimensionality" (Bellman, 1961; Somorjai et al., 2013). Once the feature set had been reduced, several machine learning techniques were applied, including a naïve Bayes model, several decision tree models, and a sophisticated SVM. The SVM and naïve Bayes classifiers achieved their highest accuracy with a set of three SNPs, while the decision tree classifier achieved its highest accuracy with a set of two SNPs. The SVM performed best, with an accuracy of 69%, while the naïve Bayes and decision tree classifiers reached 67% and 68%, respectively.
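The sample-per-feature ratios quoted above follow directly from the cohort size. The short sketch below redoes that arithmetic; the function name `samples_per_feature` is only illustrative and does not come from the surveyed paper.

```python
def samples_per_feature(n_samples: int, n_features: int) -> float:
    """Return how many samples are available for each feature."""
    if n_features <= 0:
        raise ValueError("n_features must be positive")
    return n_samples / n_features


# Cohort size quoted in the text: 63 patients plus 74 controls.
N_SAMPLES = 63 + 74

for n_snps in (98, 3, 2):
    ratio = samples_per_feature(N_SAMPLES, n_snps)
    print(f"{n_snps} SNPs: {ratio:.1f} samples per feature")
# 98 SNPs: 1.4 samples per feature
# 3 SNPs: 45.7 samples per feature
# 2 SNPs: 68.5 samples per feature
```

With 137 samples, 98 SNPs give roughly 1.4 samples per feature (close to 3:2), while 3 and 2 SNPs give about 45 and 68 samples per feature, matching the 45:1 and 68:1 ratios in the text once the fractions are dropped.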
The reported results are about 23–25% better than the baseline. Another notable feature of this study was the extensive cross-validation and confirmation performed. The predictive power of each model was validated in at least three ways. First, model training was evaluated and monitored with 20-fold cross-validation. A bootstrap resampling approach was used in which the cross-validation was performed 5 times and the results were averaged, to keep the stochastic element in the partitioning of samples to a minimum. In addition, the feature selection process was carried out 100 times (5 times for each of the 20 folds) to reduce bias in feature selection (i.e. in selecting the most informative subset of SNPs). Finally, the results were compared with a permutation test, which had a predictive accuracy of 50%. Although the researchers tried to reduce the stochastic element in sample partitioning, it might have been better to use leave-one-out cross-validation, which would have removed this stochastic element completely. The study also points out how machine learning can reveal significant insights into the biology of spontaneous or non-familial breast cancer and its polygenic risk factors.
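The validation scheme described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The genotype data is synthetic, the nearest-centroid rule stands in for the SVM, naïve Bayes, and decision tree classifiers, and the class-mean gap is a hypothetical stand-in for the unspecified SNP selection criterion. The sketch repeats 20-fold cross-validation 5 times, reruns feature selection inside each of the 100 training folds, and compares the result with a single label permutation.

```python
import random


def k_fold_indices(n_samples, n_folds, rng):
    """Shuffle row indices and deal them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def class_mean(X, y, rows, feature, label):
    """Mean of one feature over the given rows that carry the given label."""
    values = [X[r][feature] for r in rows if y[r] == label]
    return sum(values) / len(values) if values else 0.0


def select_features(X, y, train, n_keep):
    """Keep the n_keep features with the largest gap between class means.

    Only training rows are used, so the held-out fold cannot influence the choice.
    """
    gaps = [
        (abs(class_mean(X, y, train, f, 0) - class_mean(X, y, train, f, 1)), f)
        for f in range(len(X[0]))
    ]
    return [f for _, f in sorted(gaps, reverse=True)[:n_keep]]


def fold_accuracy(X, y, train, test, n_keep):
    """Select features on the training rows, then score a nearest-centroid rule."""
    feats = select_features(X, y, train, n_keep)
    centroids = [[class_mean(X, y, train, f, label) for f in feats] for label in (0, 1)]
    correct = 0
    for r in test:
        dists = [
            sum((X[r][f] - c) ** 2 for f, c in zip(feats, centroids[label]))
            for label in (0, 1)
        ]
        predicted = 0 if dists[0] <= dists[1] else 1
        correct += predicted == y[r]
    return correct / len(test)


def repeated_cv_accuracy(X, y, n_folds=20, n_repeats=5, n_keep=3, seed=0):
    """Average per-fold accuracy over n_repeats reshuffled k-fold partitions."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        folds = k_fold_indices(len(X), n_folds, rng)
        for i, test in enumerate(folds):
            train = [r for j, fold in enumerate(folds) if j != i for r in fold]
            scores.append(fold_accuracy(X, y, train, test, n_keep))
    return sum(scores) / len(scores)


def make_synthetic_genotypes(n_cases=63, n_controls=74, n_snps=98, seed=1):
    """Synthetic 0/1/2 genotypes; SNPs 0 and 1 carry an invented case signal."""
    rng = random.Random(seed)
    y = [1] * n_cases + [0] * n_controls
    X = []
    for label in y:
        row = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        if label == 1:
            for f in (0, 1):
                if rng.random() < 0.5:
                    row[f] = 2
        X.append(row)
    return X, y


X, y = make_synthetic_genotypes()
observed = repeated_cv_accuracy(X, y)

# Permutation baseline: same pipeline, labels shuffled to break any real signal.
shuffled = y[:]
random.Random(2).shuffle(shuffled)
permuted = repeated_cv_accuracy(X, shuffled)

print(f"repeated 20-fold CV accuracy: {observed:.2f}")
print(f"permuted-label accuracy:      {permuted:.2f}")
```

Running feature selection inside each fold matters because choosing SNPs on the full dataset first would let the held-out samples influence the model. A fuller permutation test would repeat the shuffle many times rather than once.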
