In this study, we aim to develop an automated system
for robust and reliable cancer diagnosis based on gene microarray data.
Among the statistical classifiers evaluated, support vector machines
outperform other popular classifiers, such as k-nearest neighbours, naive
Bayes, neural networks and decision trees, often to a remarkable degree.
We use a set of nine publicly available benchmark microarray datasets
that encompass both binary and multi-class cancer problems. Results of
comparative studies are provided, demonstrating that effective feature
selection is essential to the development of classifiers intended for
use in gene-based cancer classification. In particular, across the
systematic experiments carried out, the best classification model is
obtained with a support vector machine classifier trained on a subset of
features selected via information gain feature ranking.
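
As a rough illustration of the kind of pipeline summarised above (information gain feature ranking followed by a support vector machine), the following is a minimal sketch and not the configuration used in this study. It assumes scikit-learn, uses synthetic data in place of the benchmark microarray datasets, and approximates information gain with the mutual information estimate from `mutual_info_classif`. The function names `build_model` and `evaluate`, the number of retained genes, the linear kernel and the fold count are all illustrative assumptions.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_model(n_genes=50, seed=0):
    """Rank features by estimated mutual information, keep the top n_genes, fit an SVM.

    n_genes, the linear kernel and C=1.0 are illustrative choices, not values
    taken from the study.
    """
    # Fix the random state so the mutual information estimate is reproducible.
    score_func = partial(mutual_info_classif, random_state=seed)
    return make_pipeline(
        StandardScaler(),
        SelectKBest(score_func=score_func, k=n_genes),
        SVC(kernel="linear", C=1.0),
    )


def evaluate(X, y, n_genes=50, n_splits=5, seed=0):
    """Return per-fold accuracy; feature selection is refit inside every fold."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(build_model(n_genes, seed), X, y, cv=cv)


# Synthetic stand-in for a microarray dataset: few samples, many features.
X, y = make_classification(
    n_samples=80,
    n_features=500,
    n_informative=20,
    n_redundant=0,
    n_classes=2,
    random_state=0,
)

scores = evaluate(X, y)
print("mean cross-validated accuracy:", scores.mean())
```

Placing the selector inside the pipeline means the ranking is recomputed on the training portion of each fold, so the held-out samples do not influence which features are kept.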