Multi-metric and multi-substructure biclustering analysis for gene expression data
- 1 January 2005
- conference paper
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2005 IEEE Computational Systems Bioinformatics Conference (CSB'05)
- p. 123-134
- https://doi.org/10.1109/csb.2005.40
Abstract
A good number of biclustering algorithms have been proposed for grouping gene expression data. Many of them adopt matrix norms to define the similarity score of a bicluster. We shall show that almost all matrix metrics can be converted into vector norms while preserving rank equivalence. Vector norms provide a much more efficient vehicle for biclustering analysis and computation. The advantages are twofold: ease of analysis and savings in computation. Most existing biclustering algorithms have also implicitly assumed univariate (i.e., single-metric) evaluation for identifying biclusters. Such an approach, however, overlooks the fundamental principle that genes, even those belonging to the same gene group, (1) may be subdivided into different substructures and (2) may be co-expressed via a diversity of coherence models (a gene may participate in multiple pathways that may or may not be co-active under all conditions). The former leads to the adoption of multi-substructure analysis, the latter to multivariate analysis. This paper will show that the proposed multivariate and multi-substructure analysis is very effective in identifying and classifying biologically relevant groups of genes and conditions. For example, it has successfully yielded highly discriminant and accurate classification based on known ribosomal gene groups.
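As a rough illustration of the matrix-to-vector idea mentioned in the abstract, the sketch below uses the simplest case we can be sure of: the Frobenius norm of a matrix equals the Euclidean norm of its flattened entries, so ranking candidate biclusters by either score gives the same order. This is not the paper's construction, which the abstract does not spell out. The residue matrices and the helper names (`frobenius_norm`, `euclidean_norm`, `rank_by`) are hypothetical and chosen only for illustration.

```python
import math


def frobenius_norm(matrix):
    """Matrix (Frobenius) norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def flatten(matrix):
    """Concatenate the rows of a matrix into a single vector."""
    return [x for row in matrix for x in row]


def euclidean_norm(vector):
    """Vector (L2) norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in vector))


def rank_by(score, candidates):
    """Return candidate names ordered from lowest to highest score."""
    return sorted(candidates, key=lambda name: score(candidates[name]))


# Hypothetical residue matrices (deviation of each entry from some
# coherence model) for three candidate biclusters. Values are made up.
residues = {
    "A": [[0.1, -0.2], [0.0, 0.1]],
    "B": [[1.0, -1.5], [0.5, 2.0]],
    "C": [[0.3, 0.3], [-0.4, 0.2]],
}

# Score each candidate with the matrix norm and with the vector norm.
matrix_ranking = rank_by(frobenius_norm, residues)
vector_ranking = rank_by(lambda m: euclidean_norm(flatten(m)), residues)

# Both scores order the candidates identically (rank equivalence).
print(matrix_ranking, vector_ranking)
```

In this toy case the two scores are numerically identical, so rank equivalence is trivial; the abstract's claim is broader and covers other matrix metrics.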