Identification of Significant Proteins in Coronavirus Disease 2019 Protein-Protein Interaction Using Principal Component Analysis and ClusterONE

Abstract
Coronavirus Disease 2019 (COVID-19) can cause disease complications and organ damage due to excessive inflammatory reactions if left untreated. Computational analysis of protein-protein interactions can be carried out in various ways, including topological analysis and clustering of protein-protein interaction networks. Topological analysis can identify significant proteins by ranking the nodes of the network with centrality measures. Using Principal Component Analysis (PCA), several types of centrality measures were combined into a single overall centrality value. This study aimed to find significant proteins in COVID-19 protein-protein interactions using PCA and ClusterONE. It used 57 proteins associated with COVID-19 to obtain the protein network; all of these proteins are from Homo sapiens. The network built from these 57 proteins contained 357 proteins and 1686 interactions. Clustering produced two clusters; the best cluster was the first, which had a lower p-value, although its average overall centrality value was close to that of the second cluster. There are twenty important proteins in that cluster, and all of them are related to COVID-19. These proteins are expected to be useful in the process of discovering medicinal compounds for COVID-19.
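
As an illustration of how several centrality measures might be combined into one overall centrality value with PCA, the following minimal sketch scores each node by its projection onto the first principal component of the standardized centrality measures. It is not the implementation used in this study. The abstract does not name the centrality measures, so degree and closeness centrality are assumed here; the toy network, its node labels, and all function names are hypothetical, and the ClusterONE clustering step is not covered.

```python
from collections import deque
import math


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nb) / (n - 1) for v, nb in adj.items()}


def closeness_centrality(adj):
    """Reciprocal of the mean shortest-path distance (assumes a connected graph)."""
    out = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        out[source] = (len(dist) - 1) / total if total else 0.0
    return out


def standardize(col):
    """Rescale a column to zero mean and unit variance."""
    mean = sum(col) / len(col)
    sd = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
    return [(x - mean) / sd if sd else 0.0 for x in col]


def first_principal_component(columns, iters=200):
    """Leading eigenvector of the covariance matrix, found by power iteration."""
    k, n = len(columns), len(columns[0])
    cov = [[sum(a * b for a, b in zip(columns[i], columns[j])) / n
            for j in range(k)] for i in range(k)]
    vec = [1.0] * k
    for _ in range(iters):
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    # Fix the sign so that larger centralities give larger scores.
    return vec if sum(vec) >= 0 else [-x for x in vec]


def overall_centrality(adj):
    """Project the standardized centrality measures onto the first component."""
    nodes = list(adj)
    measures = [degree_centrality(adj), closeness_centrality(adj)]
    columns = [standardize([m[v] for v in nodes]) for m in measures]
    loadings = first_principal_component(columns)
    return {v: sum(w * col[i] for w, col in zip(loadings, columns))
            for i, v in enumerate(nodes)}


# Hypothetical toy network (not data from the study): A is a hub, E is peripheral.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}
scores = overall_centrality(toy)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here `ranking` lists the nodes from the highest to the lowest overall centrality value. In practice a library such as NetworkX or scikit-learn would likely replace the hand-written routines; the standard library is used only to keep the sketch self-contained.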