A review of data mining in bioinformatics

Limo, Vincent

Limo, Vincent (2019)

The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-202001091160

Abstract

Since the beginning of the 21st century, commonly known as the information age, there has been a phenomenal growth in potentially deadly diseases such as cancer. As a result, there is a need for cancer and other biomedical research in transcriptomics, genomics, and genetics, fields that directly apply computer science methods such as data analysis and mathematics.

The aim of this bachelor’s thesis is to highlight and discuss in detail the application of data mining techniques in bioinformatics. It begins by discussing the interdisciplinary relationship between data mining, knowledge discovery, and bioinformatics, followed by a comprehensive descriptive study of data mining techniques and their application in bioinformatics. The results established that gene expression analysis and gene sequencing rely on clustering techniques such as hierarchical, fuzzy, graph, and distance clustering, while classification techniques such as machine vector learning, supervised learning, support vector machines, and random forests are fundamental in genomic and proteomic synthesis. The thesis recommends data transformation, cleaning, and scalable statistical models as solutions to the prominent data quality and computational challenges in data mining. It is divided into four main parts: Introduction, Data mining, Application of data mining in bioinformatics, and Conclusion.

Collections

Theses (Open collection)