Sparse Graph Regularization Non-Negative Matrix Factorization Based on Huber Loss Model for Cancer Data Analysis

Open Access

20 November 2019

journal article
research article
Published by Frontiers Media SA in Frontiers in Genetics

Vol. 10, 1054
https://doi.org/10.3389/fgene.2019.01054

Abstract

Non-negative matrix factorization (NMF) is a matrix decomposition method based on the square loss function. NMF is often applied to cancer gene expression data to reduce dimensionality and extract cancer-related information. Gene expression data usually contain noise and outliers, and the original NMF loss function is very sensitive to non-Gaussian noise. To improve the robustness and clustering performance of the algorithm, we propose a sparse graph regularization NMF based on the Huber loss model for cancer data analysis (Huber-SGNMF). Huber loss is a function intermediate between the L₁-norm and the L₂-norm that can effectively handle non-Gaussian noise and outliers. To account for matrix sparsity and the geometric structure of the data, sparse penalty and graph regularization terms are introduced into the model to enhance matrix sparsity and capture the data manifold structure. Before the experiments, we first analyzed the robustness of Huber-SGNMF and other models. Experiments on The Cancer Genome Atlas (TCGA) data show that Huber-SGNMF outperforms other state-of-the-art methods in sample clustering and differentially expressed gene selection.
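To make the abstract's description more concrete, the following minimal Python sketch shows the standard Huber loss and one way a Huber data-fit term could be combined with graph regularization and sparsity terms. It is an illustration under stated assumptions, not the authors' implementation: the function names `huber` and `objective`, the threshold `delta`, the weights `alpha` and `beta`, and the choice of an L1 penalty on the coefficient matrix are all hypothetical, since the abstract does not specify the exact objective.

```python
import numpy as np


def huber(residual, delta=1.0):
    """Elementwise Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))


def objective(X, U, V, L, delta=1.0, alpha=0.1, beta=0.1):
    """Illustrative objective (assumed form, not taken from the paper).

    X: (m, n) non-negative data matrix
    U: (m, k) basis matrix, V: (k, n) coefficient matrix
    L: (n, n) graph Laplacian built over the n samples
    """
    fit = huber(X - U @ V, delta).sum()      # robust reconstruction error
    graph = np.trace(V @ L @ V.T)            # graph (manifold) regularizer
    sparse = np.abs(V).sum()                 # assumed L1 sparsity penalty
    return fit + alpha * graph + beta * sparse


# Small residuals are penalized quadratically; the outlier (10.0) only linearly.
residuals = np.array([0.1, 0.5, 10.0])
huber_values = huber(residuals, delta=1.0)
half_squared = 0.5 * residuals**2
# With delta = 1, the outlier contributes 9.5 under Huber
# versus 50.0 under the half-squared loss.
```

The comparison at the end shows why a Huber-type data term is less sensitive to outliers than the square loss: a single large residual dominates the squared error far more than it dominates the Huber loss.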

Funding Information

National Natural Science Foundation of China (61572284, 61872220, 61873001)

