Medical Systems Biology

Clustering

When analyzing biomolecular data, researchers are often confronted with questions such as "How many groups are in my data?" or "How robust is the identified grouping?"

Figure: Typical cluster analysis workflow

Cluster analysis provides the mathematical and algorithmic foundations for identifying groups of similar objects. However, performance problems arise quickly with typical data sets, which comprise thousands of samples and up to millions of features. We investigate how clustering approaches can be adapted to high-dimensional biomolecular data, including parallel clustering algorithms. We also develop new methods for evaluating the stability of clusterings, for example by combining multiple cluster validation indices.
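
As an illustration of the question "How robust is the identified grouping?", the following minimal sketch estimates the stability of a clustering by resampling. It is not the method from the publications listed on this page: it assumes plain k-means as the clustering algorithm and the Rand index as the agreement measure, and the function names (kmeans, rand_index, stability) and default parameters are hypothetical choices made for this example.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Returns one cluster label (0..k-1) per point.
    """
    centers = rng.sample(points, k)  # k distinct data points as initial centers
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of item pairs on which two labelings agree.

    A pair agrees if both labelings put it in the same cluster or both
    put it in different clusters. Assumes at least two items.
    """
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def stability(points, k, runs=10, frac=0.8, seed=0):
    """Mean pairwise Rand index between clusterings of random subsamples.

    Assumption of this sketch: frac is large enough that any two
    subsamples share at least two points.
    """
    rng = random.Random(seed)
    n = len(points)
    results = []
    for _ in range(runs):
        idx = rng.sample(range(n), int(frac * n))
        labels = kmeans([points[i] for i in idx], k, rng)
        results.append(dict(zip(idx, labels)))  # original index -> label
    scores = []
    for r1, r2 in combinations(results, 2):
        shared = [i for i in r1 if i in r2]  # points present in both runs
        scores.append(rand_index([r1[i] for i in shared],
                                 [r2[i] for i in shared]))
    return sum(scores) / len(scores)
```

Calling stability(points, k) for several candidate values of k and comparing the scores gives a rough indication of which number of groups is reproduced most consistently under resampling; values close to 1 mean that subsamples are clustered in nearly the same way. A single index such as this one can be misleading, which is one motivation for combining several validation indices.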

Selected publications

J. Kraus, L. Lausser, and H. A. Kestler. Exhaustive k-nearest-neighbour subspace clustering. Journal of Statistical Computation and Simulation, 85(1):30–46, 2015.

J. M. Kraus, C. Müssel, G. Palm, and H. A. Kestler. Multi-objective selection for collecting cluster alternatives. Computational Statistics, 26(2):341–353, 2011.

J. M. Kraus and H. A. Kestler. A highly efficient multi-core algorithm for clustering extremely large datasets. BMC Bioinformatics, 11(1):169, 2010.

H. A. Kestler, J. Kraus, G. Palm, and F. Schwenker. On the effects of constraints in semi-supervised hierarchical clustering. In F. Schwenker and S. Marinai, editors, Artificial Neural Networks in Pattern Recognition (ANNPR 06), volume LNAI 4087, pages 57–66. Springer-Verlag, Heidelberg, 2006.

T. Mattfeldt, H. Wolter, R. Kemmerling, H.-W. Gottfried, and H. A. Kestler. Cluster analysis of comparative genomic hybridization (CGH) data using self-organizing maps: Application to prostate carcinomas. Analytical Cellular Pathology, 23(1):29–37, 2001.

Latest News

Our paper "A model of the onset of the Senescence Associated Secretory Phenotype after DNA damage induced Senescence" has been accepted for publication in PLOS Computational Biology.

Our paper "The influence of multi-class feature selection on the prediction of diagnostic phenotypes" has been accepted for publication in Neural Processing Letters.