Results: 15

(searched for: doi:10.1155/2015/370640)
Journal of Statistics and Management Systems pp 1-10; https://doi.org/10.1080/09720510.2021.1960551

Abstract:
Housing demand arises under the influence of the social, cultural, economic and demographic characteristics of each country. Housing requirement refers not only to a numerical value but also to a structure that must provide healthy conditions together with its environment. Nowadays, both housing sales and property prices are on an increasing trend. In this study, the factors affecting housing sales in Turkey's 81 provinces in 2019 were examined. Among alternative regression methods, negative binomial regression and quantile regression models were used, and analyses were carried out on the basis of various quantile slices. The negative binomial regression model showed that the crude marriage rate, divorce rate, gross domestic product, unemployment rate and crude birth rate had a significant effect. In the quantile regression analysis, the quantile slices were evaluated separately. By means of CICOMP-type information criteria, the negative binomial regression model was found to give more effective results than the quantile regression model.
Tuba Koç
Mathematical Problems in Engineering, Volume 2021, pp 1-9; https://doi.org/10.1155/2021/6198317

Abstract:
High-dimensional data sets frequently occur in several scientific areas, and special techniques are required to analyze them. In particular, it becomes important to apply a suitable model in classification problems. In this study, a novel approach is proposed to estimate a statistical model for high-dimensional data sets. The proposed method uses the analytic hierarchy process (AHP) and information criteria to determine the optimal principal components (PCs) for the classification model. The high-dimensional "colon" and "gravier" datasets were used in the evaluation. The application results demonstrate that the proposed approach can be successfully used for modeling purposes.
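The information-criterion step can be roughly illustrated (leaving the AHP weighting aside) by scoring logistic models built on the first k principal components with BIC. The data below are a synthetic stand-in for the "colon"/"gravier" sets, and the BIC proxy is an assumption, not the authors' criterion.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Synthetic high-dimensional stand-in: 80 samples, 500 features.
X, y = make_classification(n_samples=80, n_features=500,
                           n_informative=10, random_state=0)
pcs = PCA(n_components=20, random_state=0).fit_transform(X)

def bic(k):
    """BIC of a logistic model on the first k PCs (a simple proxy for
    the information-criterion step of the paper's PC selection)."""
    clf = LogisticRegression(max_iter=1000).fit(pcs[:, :k], y)
    ll = -log_loss(y, clf.predict_proba(pcs[:, :k]), normalize=False)
    return -2 * ll + (k + 1) * np.log(len(y))

scores = {k: bic(k) for k in range(1, 21)}
best_k = min(scores, key=scores.get)   # PC count with minimal criterion
```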
Haydar Koç
Adıyaman University Journal of Science; https://doi.org/10.37094/adyujsci.755048

Abstract:
Renewable energy is a sustainable energy source that can be replenished repeatedly from resources existing in nature. Renewable energy sources occupy an important place in the world and in our country due to their renewability, minimal environmental impact, low operating and maintenance costs, domestic character and reliable energy supply. In this study, renewable energy efficiency levels for the BRICS countries and Turkey were examined. Covering the period 2006-2015, the study used the stochastic frontier analysis (SFA) method for the efficiency analysis and information complexity criteria to decide which input set best explains the renewable energy efficiency process. The selection results pointed to CO2 emissions and energy intensity as the most explanatory inputs, and we observed that the selected inputs have a significant effect on renewable energy efficiency. According to the results, the renewable energy efficiency values follow approximately the same pattern for each country and do not vary significantly across the years. Comparing the countries, Brazil has the best performance with an efficiency level of approximately 97%, and Russia has the worst. Turkey's efficiency level is rather weak but not the worst, and its average efficiency is very close to that of China.
Sulaiman Olaniyi Abdulsalam, Abubakar Adamu Mohammed, Jumoke Falilat Ajao, Ronke S. Babatunde, Roseline Oluwaseun Ogundokun, Chiebuka T. Nnodim
Lecture Notes in Business Information Processing pp 480-492; https://doi.org/10.1007/978-3-030-63396-7_32

The publisher has not yet granted permission to display this abstract.
Communications in Statistics - Theory and Methods, Volume 50, pp 2710-2721; https://doi.org/10.1080/03610926.2019.1708395

Abstract:
Information criteria are essential measures in data analysis, used primarily to choose among statistical models. Because of that role, the development of such criteria is a crucial issue. In this study, a modified version of the Fisher information criterion (FIC) is proposed to improve the classical FIC: shrinkage estimation is adopted within FIC, and an additional penalty term is added multiplicatively. The suggested criterion is tested on Lasso regression, and the performance of the modified FIC is illustrated on simulated and real data sets. Empirical evidence demonstrates the success of the modified FIC for model selection compared with traditional criteria.
Journal of Statistical Computation and Simulation, Volume 89, pp 2983-2996; https://doi.org/10.1080/00949655.2019.1647431

Abstract:
Elastic net (EN) is a regularization technique used for simultaneous modelling and variable selection with high-dimensional data. In the literature, it is claimed that EN modelling can be used for undersized samples with high dimensions (i.e. n << p). However, in that setting neither the model matrix nor the Gram matrix is of full rank p, and the inverse of the Gram matrix cannot be calculated: it degenerates and becomes singular. To overcome this problem in EN modelling, Mohebbi et al. [A new data adaptive elastic net predictive model using hybridized smoothed covariance estimators with information complexity. J Stat Comput Simul. 2019;89(6):1060–1089] proposed a new adaptive elastic net (AEN) modelling approach using hybrid covariance estimators (HCEs) and information complexity (ICOMP) criteria. Several forms of HCEs can be used in AEN regression modelling, so deciding which HCE is appropriate is an important problem. In this paper, we study the performance of the AEN models under several different HCEs using the ICOMP criterion, both on experimental data and in a Monte Carlo simulation study with different scenarios of the protocol.
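The n << p problem the abstract describes can be seen in a small sketch: the Gram matrix of an undersized design is rank-deficient, yet a penalized elastic net fit remains well-defined. The sketch uses plain scikit-learn `ElasticNet` on synthetic data, not the AEN/HCE method of the paper; the penalty settings are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
n, p = 50, 200                      # undersized: n << p
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]   # only 5 active predictors
y = X @ beta + rng.normal(scale=0.5, size=n)

# The Gram matrix X'X is rank-deficient (rank at most n < p),
# so ordinary least squares cannot invert it.
assert np.linalg.matrix_rank(X.T @ X) <= n

# The EN objective adds L1 (selection) and L2 (ridge) penalties,
# so the fit stays well-defined despite the singular Gram matrix.
en = ElasticNet(alpha=0.1, l1_ratio=0.7, max_iter=10000).fit(X, y)
selected = np.flatnonzero(en.coef_)   # indices of retained predictors
```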
Meng Li, Jianmei Zhao, Xuecang Li, Yang Chen, Chenchen Feng, Fengcui Qian, Yuejuan Liu, Jian Zhang, Jianzhong He, Bo Ai, et al.
Briefings in Bioinformatics, Volume 21, pp 1411-1424; https://doi.org/10.1093/bib/bbz078

Abstract:
With the increasing awareness of heterogeneity in cancers, better prediction of cancer prognosis is much needed for more personalized treatment. Recently, extensive efforts have been made to explore the variations in gene expression for better prognosis. However, the prognostic gene signatures predicted by most existing methods have little robustness among different datasets of the same cancer. To improve the robustness of the gene signatures, we propose a novel high-frequency sub-pathways mining approach (HiFreSP), integrating a randomization strategy with gene interaction pathways. We identified a six-gene signature (CCND1, CSF3R, E2F2, JUP, RARA and TCF7) in esophageal squamous cell carcinoma (ESCC) by HiFreSP. This signature displayed a strong ability to predict the clinical outcome of ESCC patients in two independent datasets (log-rank test, P = 0.0045 and 0.0087). To further show the predictive performance of HiFreSP, we applied it to two other cancers: pancreatic adenocarcinoma and breast cancer. The identified signatures show high predictive power in all testing datasets of the two cancers. Furthermore, compared with the two popular prognosis signature predicting methods, the least absolute shrinkage and selection operator penalized Cox proportional hazards model and the random survival forest, HiFreSP showed better predictive accuracy and generalization across all testing datasets of the above three cancers. Lastly, we applied HiFreSP to 8137 patients involving 20 cancer types in the TCGA database and found high-frequency prognosis-associated pathways in many cancers. Taken together, HiFreSP shows higher prognostic capability and greater robustness, and the identified signatures provide clinical guidance for cancer prognosis. HiFreSP is freely available via GitHub: https://github.com/chunquanlipathway/HiFreSP.
Tuba Koc, Haydar Koc, Emre Dünder
Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, Volume 23, pp 76-83; https://doi.org/10.19113/sdufenbed.436178

Abstract:
Most scientific studies use quantitative data taking non-negative integer values, called count data. Count data also appear frequently in regression analysis, one of the most basic methods of statistical analysis. Regression models in which the dependent variable is expressed by integers are defined as count models. In this study, model selection for count models was investigated using classical selection methods and the PSO algorithm. Applications were made on both simulated and real data. As a result, it was shown that the PSO algorithm can be used as an alternative to classical methods for model selection in count models when the number of model variables and the correlation between the independent variables increase.
Shima Mohebbi, Esra Pamukcu
Journal of Statistical Computation and Simulation, Volume 89, pp 1060-1089; https://doi.org/10.1080/00949655.2019.1576683

Abstract:
We develop a novel adaptive elastic net (AEN) modelling approach using a new covariance-regularization technique via hybridized smoothed covariance estimators (HSCEs) to identify and select the best subset of predictors in undersized high-dimensional data sets. We introduce and score, for the first time in AEN models, the Consistent and Misspecification Resistant Information Measure of Complexity (CICOMP-Misspec) criterion and the Extended Consistent Akaike's Information Criterion with Fisher Information (CAICFE). We carry out a large Monte Carlo simulation study using the median mean-squared error (MMSE) to demonstrate and compare MMSE prediction performance. This is done using cross-validated fit adaptive elastic net (CV-AEN) to avoid double shrinkage while varying both the error variance and the correlation structure of the model. The new AEN model is then applied to a real undersized benchmark data set to select the best subset of predictors of the production rate of riboflavin (vitamin B2) and to provide the best predictive model. The proposed approach enables simple and reliable identification of the best subset of genes predictive of riboflavin production without an exhaustive search over all possible subsets in undersized high-dimensional data, and it generalizes to other regularized generalized linear model (GLM) regressions for determining the best predictor space with undersized data.
Haydar Koç, Tuba Koç, Mehmet Ali Cengiz
Communications in Statistics - Theory and Methods, Volume 47, pp 5298-5306; https://doi.org/10.1080/03610926.2017.1390129

Abstract:
Modeling of count responses is widely performed via Poisson regression models. This paper covers the problem of variable selection in Poisson regression analysis. The basic emphasis of this paper is to present the usefulness of information complexity-based criteria for Poisson regression. Particle swarm optimization (PSO) algorithm was adopted to minimize the information criteria. A real dataset example and two simulation studies were conducted for highly collinear and lowly correlated datasets. Results demonstrate the capability of information complexity-type criteria. According to the results, information complexity-type criteria can be effectively used instead of classical criteria in count data modeling via the PSO algorithm.
Published: 1 September 2017
by MDPI
Applied Sciences, Volume 7; https://doi.org/10.3390/app7090900

Abstract:
An approach to distinguishing eight kinds of human cells by Raman spectroscopy is proposed and demonstrated in this paper. Original spectra of suspension cells in the frequency range 623–1783 cm−1 were acquired and pre-processed by baseline calibration, and principal component analysis (PCA) was employed to extract the useful spectral information. To develop a robust discrimination model, linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) were compared. The results showed that the QDA model is better than the LDA model; the optimal QDA model was generated with 12 principal components, and the classification rates are 100% in both the calibration and prediction sets. From the experimental results, it is concluded that Raman spectroscopy combined with appropriate discriminant analysis methods has significant potential in human cell detection.
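The PCA-then-discriminant-analysis pipeline can be sketched with scikit-learn. The data below are a synthetic stand-in for the baseline-corrected spectra, so the accuracies will not match the paper's; the 12-component choice is the only detail taken from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic 8-class stand-in for the cell spectra (300 "wavenumber" features).
X, y = make_classification(n_samples=400, n_features=300, n_informative=40,
                           n_classes=8, n_clusters_per_class=1, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

# Reduce to 12 PCs (the optimal count reported), then compare LDA vs QDA.
lda = make_pipeline(PCA(n_components=12), LinearDiscriminantAnalysis()).fit(Xtr, ytr)
qda = make_pipeline(PCA(n_components=12), QuadraticDiscriminantAnalysis()).fit(Xtr, ytr)
acc = {"LDA": lda.score(Xte, yte), "QDA": qda.score(Xte, yte)}
```

Wrapping PCA and the classifier in one pipeline ensures the PC projection is learned only from the calibration set, mirroring the calibration/prediction split in the study.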
Pugalendhi Ganesh Kumar, Muthu Subash Kavitha, Byeong-Cheol Ahn
Published: 9 December 2016
Abstract:
This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build a fruitful system for cancer diagnosis, we introduced two levels of gene selection, filtering and embedding, to select potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method that redefines the criterion function of f-information (FI) to identify potential genes without discretizing the continuous gene expression values. The embedded procedure was implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary cancer gene expression datasets. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared on all the datasets with existing methods for gene selection and optimization such as the genetic algorithm (GA), particle swarm optimization (PSO), and the artificial bee colony algorithm (ABC). In the global cancer map with repeated measurements (GCM_RM) dataset, FRFI-WSA found the smallest set of 16 most relevant cancer-associated genes, using a minimal set of 26 compact rules with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant cancer-associated genes, and their linguistics, detected by the proposed FRFI-WSA approach are better than those of the other methods.
The simple interpretable rules with most relevant genes and effectively classified samples suggest that the proposed FRFI-WSA approach is reliable for classification of an individual’s cancer gene expression data with high precision and therefore it could be helpful for clinicians as a clinical decision support system.
Hamparsum Bozdogan, Esra Pamukçu
Optimization Challenges in Complex, Networked and Risky Systems pp 140-170; https://doi.org/10.1287/educ.2016.0154

Abstract:
This tutorial introduces and develops two computationally feasible intelligent feature extraction techniques that address potentially daunting statistical and combinatorial problems. The first part of the tutorial employs a three-way hybrid of probabilistic principal component analysis (PPCA) to reduce the dimensionality of the dependent variables, multivariate regression (MVR) models that account for misspecification of the distributional assumption to determine a predictive operating model for glass composition for automobiles, and the genetic algorithm (GA) as the optimizer, along with the misspecification-resistant form of Bozdogan’s information measure of complexity (ICOMP) as the fitness function. The second part of the tutorial is devoted to dimension reduction via a novel adaptive elastic net regression model. We used the adaptive elastic net (AEN) model, with the Japanese stock index TOPIX as the response, to reduce the dimension and build the best predictive model when faced with a “large p, small n” problem. Our results show remarkable dimension reduction in both of these real-life examples of wide data sets, which demonstrates the versatility and utility of the two proposed novel statistical data modeling techniques.