Please use this address to cite this document: https://zone.biblio.laurentian.ca/handle/10219/3670
Title: Improved prediction of gene expression of epigenomics data of lung cancer using machine learning and deep learning models
Authors: Shi, ZhengXin
Keywords: Epigenomics;deep learning;histone modification;DNA methylation;RNA-sequencing;feature selection;classification
Date published: 26-Feb-2020
Abstract: Epigenetics is the study of the biological mechanisms that switch genes on and off; epigenetic alterations are deeply involved in changes in gene expression in various diseases, including cancers. Machine learning is frequently used in cancer diagnosis and detection. In this research, four types of data are used to predict lung cancer: DNA methylation data, histone data, human genome data, and RNA-Seq data. Four feature selection methods were implemented: ReliefF, Gain Ratio (GR), Principal Component Analysis (PCA), and Correlation-based Feature Selection (CFS). Seven classifiers were implemented: Random Forest (RF), Support Vector Machine (SVM) with a Gaussian kernel, SVM with a linear kernel, Logistic Regression (LR), Naive Bayes (NB), Artificial Neural Network, and Convolutional Neural Network (CNN). The data sets were processed using a custom R script. Weka 3 and Python were used for feature selection and classification. With the help of machine learning and deep learning methods, we improved the accuracy and area under the curve (AUC) of lung cancer prediction relative to an earlier published work. The CNN model outperformed the other six classification methods.
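The abstract reports results as accuracy and area under the curve (AUC). As a rough illustration of what those two metrics measure, here is a minimal pure-Python sketch. It is not the thesis code: the function names, the 0.5 decision threshold, and the labels and scores are all hypothetical, chosen only to show the calculation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical binary labels and classifier scores (not data from the thesis).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("AUC:", round(roc_auc(y_true, scores), 3))
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positives against negatives, which is one reason the two are often reported together.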
URI: https://zone.biblio.laurentian.ca/handle/10219/3670
Appears in collections: Computational Sciences - Master's theses
Master's Theses

Files in this item:
File: Zhengxin Shi - Thesis FINAL.pdf
Size: 1.56 MB
Format: Adobe PDF


All documents in DSpace are protected by copyright, with all rights reserved.