Identifying disease-specific genes based on their topological significance in protein networks

Abstract
Background
The identification of key target nodes within complex molecular networks remains a common objective in scientific research. The results of pathway analyses are usually sets of fairly complex networks or functional processes that are deemed relevant to the condition represented by the molecular profile. To be useful in a research or clinical laboratory, these results need to be translated into testable hypotheses about individual genes and proteins within the condition of interest.

Results
In this paper we describe a novel computational methodology capable of predicting key regulatory genes and proteins in disease- and condition-specific biological networks. The algorithm builds a shortest-path network connecting condition-specific genes (e.g., differentially expressed genes) using the global database of protein interactions from MetaCore. For each node, we compare the number of paths traversing it in the shortest-path network with the total number of paths passing through the same node in the global network. Using these numbers and the relative size of the initial data set, we determine the statistical significance of the network connectivity provided through each node. We applied this method to gene expression data from psoriasis patients, identified many confirmed biological targets of psoriasis, and suggested several new targets. Using the predicted regulatory nodes, we were able to reconstruct disease pathways that are in excellent agreement with current knowledge of the pathogenesis of psoriasis.

Conclusion
The systematic and automated approach described in this paper is readily applicable to uncovering high-quality therapeutic targets, and holds great promise for developing network-based combinational treatment strategies for a wide range of diseases.
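
The scoring procedure outlined in the Results can be illustrated with a minimal sketch. It is not the authors' implementation. The toy network `TOY_NETWORK`, the helper names, and the choice of a hypergeometric tail probability are all assumptions made for illustration; the abstract does not state which statistical test is used, and the real method draws its interactions from MetaCore rather than a hand-written dictionary. The sketch enumerates shortest directed paths between all node pairs (the global network) and between the condition-specific genes only, counts how many paths traverse each intermediate node in both settings, and asks how surprising the condition-specific count is given the global one.

```python
from collections import deque
from itertools import permutations
from math import comb

# Hypothetical toy interaction network (directed): node -> downstream nodes.
TOY_NETWORK = {
    "A": ["H"], "B": ["H"], "H": ["C", "D"],
    "C": [], "D": ["E"], "E": ["F"], "F": [],
}


def shortest_paths(graph, src, dst):
    """Return every shortest directed path from src to dst."""
    dist, preds, queue = {src: 0}, {src: []}, deque([src])
    while queue:  # breadth-first search that records all shortest-path predecessors
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v], preds[v] = dist[u] + 1, [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)
    if dst not in dist:
        return []
    paths, stack = [], [[dst]]
    while stack:  # walk predecessor links back from dst to src
        partial = stack.pop()
        if partial[0] == src:
            paths.append(partial)
        else:
            stack.extend([p] + partial for p in preds[partial[0]])
    return paths


def traversal_counts(graph, nodes):
    """Count shortest paths between ordered pairs of `nodes` per intermediate node."""
    counts, total = {}, 0
    for s, t in permutations(nodes, 2):
        for path in shortest_paths(graph, s, t):
            total += 1
            for mid in path[1:-1]:  # endpoints are not counted as traversed
                counts[mid] = counts.get(mid, 0) + 1
    return counts, total


def hypergeom_tail(k, M, K, n):
    """P(X >= k) when drawing n paths from M, of which K pass through the node."""
    return sum(comb(K, i) * comb(M - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(M, n)


def node_significance(graph, seeds):
    """Assumed scoring: hypergeometric p-value for each node on a seed-to-seed path."""
    global_counts, M = traversal_counts(graph, sorted(graph))
    seed_counts, n = traversal_counts(graph, sorted(seeds))
    return {node: hypergeom_tail(k, M, global_counts[node], n)
            for node, k in seed_counts.items()}


if __name__ == "__main__":
    # A, B and C play the role of differentially expressed genes.
    for node, p in node_significance(TOY_NETWORK, {"A", "B", "C"}).items():
        print(node, round(p, 4))
```

In this toy case only the node `H` lies on shortest paths between the seed genes, so it is the only node scored. Enumerating every shortest path is practical only for small graphs; on a network the size of a curated interaction database, a path-counting scheme such as the one used for betweenness centrality would be a more realistic choice.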