Current biomolecular technologies yield extremely high-dimensional data, often involving thousands or even millions of features (e.g., gene expression measurements or SNPs). By contrast, a common hypothesis is that many biological processes depend on only a very small number of markers. Feature selection techniques are therefore required to identify the biomarkers that are associated with a given phenotype.
Our research focuses on the application of feature selection techniques in clinical settings as well as on the development and evaluation of feature selection methods. In particular, we apply feature selection in combination with classifiers. This includes extensions of the Set Covering Machine as well as the visualization and evaluation of feature subset stability in resampling settings.
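To illustrate what evaluating feature subset stability in a resampling setting can look like, the following is a minimal sketch and not the method of the publications listed here. It assumes a simple univariate filter (absolute difference of class means), bootstrap resampling, and the mean pairwise Jaccard index as the stability score. All function names (`top_k_features`, `jaccard`, `selection_stability`, `make_toy_data`) and the toy data are hypothetical and chosen for illustration only.

```python
import random
from itertools import combinations
from statistics import mean


def top_k_features(X, y, k):
    """Select the k features with the largest absolute difference of class means.

    Assumption: a simple filter criterion used only for illustration.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        diff = abs(mean(r[j] for r in pos) - mean(r[j] for r in neg))
        scores.append((diff, j))
    scores.sort(reverse=True)
    return {j for _, j in scores[:k]}


def jaccard(a, b):
    """Jaccard index of two non-empty feature index sets."""
    return len(a & b) / len(a | b)


def selection_stability(X, y, k, n_resamples=20, seed=0):
    """Mean pairwise Jaccard index of feature subsets selected on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    while len(subsets) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            subsets.append(top_k_features([X[i] for i in idx], ys, k))
    score = mean(jaccard(a, b) for a, b in combinations(subsets, 2))
    return score, subsets


def make_toy_data(n_samples=40, n_features=50, seed=1):
    """Synthetic two-class data in which only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label
        row[1] += 3 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
stability, subsets = selection_stability(X, y, k=2)
print(f"mean pairwise Jaccard stability: {stability:.2f}")
```

A score close to 1 indicates that nearly the same features are selected on every resample, while a score close to 0 indicates that the selected subsets barely overlap. Other stability measures and selection methods can be substituted without changing the overall resampling scheme.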
L. Lausser, C. Müssel, M. Maucher, and H. A. Kestler. Measuring and visualizing the stability of biomarker selection techniques. Computational Statistics, 28(1):51–65, 2013.
H. A. Kestler, L. Lausser, W. Lindner, and G. Palm. On the fusion of threshold classifiers for categorization and dimensionality reduction. Computational Statistics, 26(2):321–340, 2011.
H. A. Kestler, W. Lindner, and A. Müller. Learning and feature selection using the set covering machine with data-dependent rays on gene expression profiles. In F. Schwenker and S. Marinai, editors, Artificial Neural Networks in Pattern Recognition (ANNPR 06), volume LNAI 4087, pages 286–297. Springer-Verlag, Heidelberg, 2006.