The advent of high-throughput biomolecular technologies has made high-dimensional biological data available for the investigation of many clinical settings. The large number of features and small number of samples in such data sets pose many challenges for their analysis. Machine learning approaches and statistical methods are essential for interpreting the data. For example, clustering methods can detect groups of similar samples. Feature selection techniques are employed to identify features (e.g. marker genes) that are relevant for distinguishing certain phenotypes. Classification algorithms can predict the phenotype of a sample from its measurements.