Discovering Relevance-Dependent Bicluster Structure from Relational Data: A Model and Algorithm

1 November 2018

journal article
Published by Japanese Society for Artificial Intelligence in Transactions of the Japanese Society for Artificial Intelligence

Vol. 33 (6), B-I46_1-I46_1
https://doi.org/10.1527/tjsai.b-i46

Abstract

We propose a statistical model for relevance-dependent biclustering of relational data. The proposed model factorizes relational data into a bicluster structure with two features: (1) each object in a cluster has a relevance value, which indicates how strongly the object relates to the cluster, and (2) every cluster is related to at least one dense block. These features simplify the task of understanding the meaning of each cluster because only a few highly relevant objects need to be inspected. We introduce the Relevance-Dependent Bernoulli Distribution (R-BD) as a prior for relevance-dependent binary matrices and propose the novel Relevance-Dependent Infinite Biclustering (R-IB) model, which automatically estimates the number of clusters. Posterior inference can be performed efficiently using a collapsed Gibbs sampler because the parameters of the R-IB model can be fully marginalized out. Experimental results show that R-IB extracts more essential bicluster structure with better computational efficiency than conventional models. We further observe that the biclustering results obtained by R-IB facilitate interpretation of the meaning of each cluster.
