The use of Machine Learning in predicting the next character of a Genome sequence.
Bojang, Tetlo Morebodi
Cardiff Metropolitan University
The use of machine learning and other computational methods in the life sciences has grown tremendously in recent years. In genomics, classifiers are often used to analyse DNA sequences in order to identify gene-coding regions and protein binding sites. In this dissertation, machine learning classifiers are used to analyse a genome dataset of baker's yeast, Saccharomyces cerevisiae. The Python programming language is used to prepare the data and build the classifier models. The models are evaluated on how accurately they predict class labels from patterns in the given data. The evaluation of the findings shows that there is an opportunity to further improve feature selection in this area of study to obtain better results.
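The task described above, predicting the next character of a genome sequence, can be framed as classification: the preceding k bases form the features and the following base is the class label. The sketch below is only an illustration of that framing and is not the dissertation's method. The window length `k`, the toy sequence, the 80/20 split, and the simple majority-vote classifier (`ContextMajorityClassifier`, a hypothetical name) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def make_examples(sequence, k):
    """Slide a window over the sequence: k preceding bases -> next base."""
    return [(sequence[i:i + k], sequence[i + k])
            for i in range(len(sequence) - k)]


class ContextMajorityClassifier:
    """Predicts the base most often seen after each k-base context."""

    def fit(self, examples):
        self.counts = defaultdict(Counter)  # context -> next-base counts
        self.overall = Counter()            # next-base counts over all contexts
        for context, label in examples:
            self.counts[context][label] += 1
            self.overall[label] += 1
        return self

    def predict(self, context):
        seen = self.counts.get(context)
        # Fall back to the globally most common base for unseen contexts.
        source = seen if seen else self.overall
        return source.most_common(1)[0][0]


def accuracy(model, examples):
    """Fraction of examples whose class label is predicted correctly."""
    correct = sum(model.predict(context) == label for context, label in examples)
    return correct / len(examples)


# Toy periodic sequence for illustration only, NOT real S. cerevisiae data.
toy = "ACGT" * 50
examples = make_examples(toy, k=3)

# Assumed 80/20 split in sequence order (train on the first part, test on the rest).
split = int(0.8 * len(examples))
model = ContextMajorityClassifier().fit(examples[:split])
score = accuracy(model, examples[split:])
print(f"held-out accuracy: {score:.2f}")
```

On real genome data the choice of features (for example the window length, or the encoding of the bases) would be expected to matter much more than in this toy case, which is consistent with the abstract's remark about improving feature selection.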