Journal of Data Analysis and Information Processing

Journal Information
ISSN / EISSN : 2327-7211 / 2327-7203
Published by: Scientific Research Publishing, Inc. (10.4236)
Total articles ≅ 147
Latest articles in this journal

Sarah Nyanjara, Dina Machuve, Pirkko Nykanen
Journal of Data Analysis and Information Processing, Volume 10, pp 170-183;

High maternal and child deaths in developing countries are frequently linked to poor health services provided to pregnant women and children. To improve the quality of maternal, neonatal and child health (MNCH) services, the government and other stakeholders in MNCH emphasize the importance of quality assessment. However, effective quality assessment approaches are lacking in most developing countries, particularly in Tanzania. This study, therefore, aimed at developing a quality assessment approach that can effectively assess and report on the quality of MNCH services. Due to the need for a quality assessment approach that suits a resource-constrained environment, a machine learning-based approach was proposed and developed. The K-means algorithm was used to develop a clustering model that groups MNCH data and performs cluster summarization to discover the knowledge portrayed in each group on the quality of MNCH services. Results confirmed the clustering model’s ability to assign the data points into appropriate clusters; cluster analysis in collaboration with MNCH experts successfully discovered insights on the quality of services portrayed by each group.
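The clustering and cluster-summarization steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the two features (antenatal visits per woman, waiting time in hours), the six toy records, and the helper names `kmeans` and `summarize` are all hypothetical, and the code is a plain version of Lloyd's K-means written with the standard library only.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # start from k randomly chosen input points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:           # assignments are stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                    # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def summarize(points, labels, k):
    """Cluster summarization: size and per-feature mean of every cluster."""
    summary = {}
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        means = (tuple(round(sum(col) / len(members), 2) for col in zip(*members))
                 if members else None)
        summary[c] = {"size": len(members), "mean": means}
    return summary

# Hypothetical facility records: (antenatal visits per woman, waiting time in hours).
records = [(1, 4.0), (2, 3.5), (1, 3.0), (4, 1.0), (5, 0.5), (4, 1.5)]
labels, centroids = kmeans(records, k=2)
report = summarize(records, labels, k=2)
print(report)
```

In a setting like the one described, the per-cluster means would be the starting point for domain experts to interpret what each group says about service quality.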
Ihar Yeuseyenka, Ihar Melnikau, Ihar Yemelyanov
Journal of Data Analysis and Information Processing, Volume 10, pp 127-141;

The purpose of the article is to develop a methodology for automating the detection and selection of moving objects. The detection and separation of moving objects are based on the simulation of impulse and recurrence neural networks. The result of the work is a motion detector based on impulse and recurrence neural networks, together with an automated system built on this detector for detecting and separating moving objects, which is ready for practical application. The feasibility of integrating the developed motion detector with the Emgu CV (OpenCV) image processing package, multimedia framework functions, and the DirectShow application programming interface was investigated. The proposed approach and software for the detection and separation of moving objects in video images using neural networks can be integrated into more sophisticated specialized computer-aided video surveillance systems, IoT (Internet of Things), IoV (Internet of Vehicles), etc.
Tariq Saeed Mian, Fahad Ghabban
Journal of Data Analysis and Information Processing, Volume 10, pp 155-169;

It is very important for organizations to develop a competitive advantage for long-term survival in the market. The main objective of the study was therefore to assess the role of data mining and employee training and development in gaining a competitive advantage. The mediating role of personnel role and knowledge management is also assessed. The data were collected from the employees of SMEs in KSA using convenience sampling. The response rate of the study was 58.36%. For the analysis of the collected data, the study used PLS 3.2.9. The findings reveal that data mining and training and development play an important role for organizations in gaining a competitive advantage through knowledge management and personnel role. The findings fill the gap left by the limited number of studies on how SMEs in KSA gain a competitive advantage, and they are helpful for the policymakers of SMEs around the globe.
Busrat Jahan, Mahfuja Khatun, Zinat Ara Zabu, Afranul Hoque, Sayed Uddin Rayhan
Journal of Data Analysis and Information Processing, Volume 10, pp 43-57;

In our study, we chose Python as the programming platform for building an automatic Bengali document summarizer. English has sufficient tools to process documents and produce summaries. However, no such tool is specifically applicable to Bengali, since Bengali has a lot of ambiguity and differs from English in terms of grammar. Yet the language holds an important place because it is spoken by 26 crore people all over the world. We have therefore adopted a new method to summarize Bengali documents. The proposed system is designed using the following stages: pre-processing the sample/input document, word tagging, pronoun replacement, sentence ranking, and summary generation. Pronoun replacement is used to reduce the incidence of dangling pronouns in the summary. We ranked sentences based on sentence frequency, numerical figures, and pronoun replacement. We also checked the similarity between pairs of sentences so that one of two similar sentences can be excluded, which reduces duplication. We took 3000 data items from newspaper and book documents as input and learned the words to be appropriate with syntax. In addition, to evaluate the performance of the designed summarizer, the system was tested on different documents. According to the assessment method, the recall, precision, and F-score were 0.70, 0.82 and 0.74, respectively. Proper pronoun replacement was achieved in 72% of cases.
Iraklis M. Spiliotis, Alexandros S. Peppas, Nikolaos D. Karampasis, Yiannis S. Boutalis
Journal of Data Analysis and Information Processing, Volume 10, pp 91-109;

The identification of objects in binary images is a fundamental task in image analysis and pattern recognition. The Euler number of a binary image is an important topological measure which is used as a feature in image analysis. In this paper, a very fast algorithm for the detection and localization of the objects and the computation of the Euler number of a binary image is proposed. The proposed algorithm operates in one scan of the image and is based on the Image Block Representation (IBR) scheme. The proposed algorithm is more efficient than conventional pixel-based algorithms in terms of execution speed and representation of the extracted information.
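For readers unfamiliar with the Euler number (number of objects minus number of holes), the sketch below shows one conventional pixel-based way to compute it, the kind of baseline the abstract compares against. It is not the IBR algorithm of the paper: it uses the classical bit-quad counting formula over all 2 × 2 windows of the zero-padded image, and the function name `euler_number` is an assumption made for illustration.

```python
def euler_number(image, connectivity=8):
    """Euler number of a binary image (list of rows of 0/1) by bit-quad counting."""
    rows, cols = len(image), len(image[0])
    # Pad with a one-pixel border of zeros so every foreground pixel
    # is seen by all four 2x2 windows that contain it.
    padded = ([[0] * (cols + 2)]
              + [[0] + [1 if v else 0 for v in row] + [0] for row in image]
              + [[0] * (cols + 2)])
    q1 = q3 = qd = 0
    for r in range(rows + 1):
        for c in range(cols + 1):
            a, b = padded[r][c], padded[r][c + 1]
            d, e = padded[r + 1][c], padded[r + 1][c + 1]
            total = a + b + d + e
            if total == 1:
                q1 += 1                    # window with exactly one foreground pixel
            elif total == 3:
                q3 += 1                    # window with exactly three foreground pixels
            elif total == 2 and a == e:
                qd += 1                    # two foreground pixels on a diagonal
    # Diagonal windows join objects under 8-connectivity and split them under 4.
    sign = -1 if connectivity == 8 else 1
    return (q1 - q3 + sign * 2 * qd) // 4

ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]                         # one object with one hole
diagonal = [[1, 0],
            [0, 1]]                        # two pixels touching only at a corner

print(euler_number(ring))                  # 0
print(euler_number(diagonal, 8))           # 1
print(euler_number(diagonal, 4))           # 2
```

This baseline visits every pixel window; the abstract's claim is that working on image blocks instead of pixels reduces that cost.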
Daniel A. Abaye, Emmanuel B. Odoom, Ernest Y. Boateng, Irene A. Agbo, John-Bosco Diekuu, Samuel Agana
Journal of Data Analysis and Information Processing, Volume 10, pp 142-154;

Clinical assessment of fluid volume status in children during malaria can be taxing and often inaccurate. During malaria, changes in fluid volume are rather multifarious, and estimating this parameter, especially in sick children, is very challenging for clinicians, who frequently rely on indices such as long capillary refill times, tachycardia, central venous pressure and decreased urine volume as guides. Here, we present the UHAS-MIDA, an open-source software tool that calculates the red blood cell (RBC) concentration and blood volume during malaria in children, determined using a stable isotope of chromium (53Cr as the label) by gas chromatography-mass spectrometry in selective ion monitoring (GC/MS-SIM) analysis. A key component involves determining the compositions of the most abundant naturally occurring isotopes of Cr (50Cr, 52Cr, 53Cr) and converting the proportions into a 3 × 3 matrix. To estimate unknown proportions of chromium isotopic mixtures from the measured abundances of three ions, an inverse matrix was calculated. The inverse, together with several inputs, is then used to calculate the corrected MS ion abundances. We constructed the software tool UHAS-MIDA using HTML, CSS/Bootstrap, JavaScript, and PHP scripting languages. The tool enables the user to efficiently determine RBC concentration and fluid volume. The source code, binary packages and associated materials for UHAS-MIDA are freely available at
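The matrix step in this abstract (invert a 3 × 3 matrix, then apply the inverse to three measured ion abundances) can be sketched as follows. This is an illustration only, not the UHAS-MIDA code: the matrix entries, the mixture, and the function names `invert_3x3` and `mat_vec` are hypothetical, and the numbers are placeholders rather than real chromium isotope data.

```python
def invert_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular or nearly singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def mat_vec(m, v):
    """Product of a 3x3 matrix with a length-3 vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

# Hypothetical pattern matrix: column j is the normalized abundance that
# component j contributes to each of the three monitored ions.
pattern = [
    [0.90, 0.05, 0.02],
    [0.08, 0.90, 0.03],
    [0.02, 0.05, 0.95],
]

true_mix = [0.5, 0.3, 0.2]                 # made-up proportions for the round trip
measured = mat_vec(pattern, true_mix)      # what the instrument would report
recovered = mat_vec(invert_3x3(pattern), measured)
print([round(x, 6) for x in recovered])
```

For larger systems or noisy measurements, solving the linear system directly (or by least squares) is usually preferred over forming an explicit inverse.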
Asadi Srinivasulu, Tarkeshwar Barua, Srinivas Nowduri, Madhusudhana Subramanyam, Sivaram Rajeyyagari
Journal of Data Analysis and Information Processing, Volume 10, pp 78-89;

The COVID-19 virus is considered one of the most harmful viruses studied in biological science. COVID-19 symptoms include fever, cough, sore throat, and headache. The paper presents a novel approach to COVID-19 prediction that uses Convolutional Neural Networks (CNN) and Logistic Regression (LR), both supervised learning techniques, for COVID-19 detection. The proposed system uses 8-fold cross-validation to obtain reliable results. The COVID-19 analysis dataset is taken from the Microsoft Database, Kaggle, and the UCI machine learning repository. The study investigates CNN and LR using the UCI database, Kaggle, and Google Database datasets. The paper proposes a hybrid method for COVID-19 analysis that reduces the dimensionality of the features using LR and then applies the new reduced feature dataset to CNN and LR. The proposed method achieved an accuracy of 78.82%, a sensitivity of 97.41%, and a specificity of 98.73%. The overall performance of the proposed system is assessed in terms of performance, accuracy, error rate, sensitivity, specificity, and correlation coefficient. The proposed techniques achieved accuracies of 78.82% and 97.41% through CNN and LR, respectively.
Abdelkhalek I. Alastal, Ashraf Hassan Shaqfa
Journal of Data Analysis and Information Processing, Volume 10, pp 110-126;

Artificial intelligence has significantly altered many job workflows, expanding earlier notions of limitations, outcomes, size, and prices. GeoAI is a multidisciplinary field that encompasses computer science, engineering, statistics, and spatial science. Because this subject focuses on real-world issues, it has a significant impact on society and the economy. A broad context incorporating fundamental questions of theory, epistemology, and the scientific method is used to bring artificial intelligence (AI) and geography together. This connection has the potential to have far-reaching implications for geographic study. GeoAI, or the combination of geography with artificial intelligence, offers unique solutions to a variety of smart city issues. This paper provides an overview of GeoAI technology, including the definition of GeoAI and the differences between GeoAI and traditional AI. Key steps to successful geographic data analysis include integrating AI with GIS and using GeoAI tools and technologies. Key application areas and models in GeoAI are also presented, along with the challenges and benefits of adopting GeoAI methods and technology. The article also includes a case study on the use of GeoAI in Kuwait, as well as a number of recommendations.
Guanggong Ge, Quanlong Guan, Lusheng Wu, Weiqi Luo, Xingyu Zhu
Journal of Data Analysis and Information Processing, Volume 10, pp 22-42;

Online learning is a very important means of study and has been adopted in many countries worldwide. However, only recently, due to the COVID-19 epidemic, have researchers been able to collect and analyze massive online learning datasets. In this article, we analyze the difference between online learner groups by using an unsupervised machine learning technique, i.e., k-prototypes clustering. Specifically, we use a questionnaire designed by domain experts to collect various online learning data, and investigate students’ online learning behavior and learning outcomes by analyzing the collected questionnaire data. Our analysis results suggest that students with better learning media generally have better online learning behavior and learning results than those with poor online learning media. In addition, in both economically developed and undeveloped regions, the number of students with better learning media is less than the number of students with poor learning media. Finally, the results presented here show that, whether in an economically developed or an economically undeveloped region, the number of students who have rich learning media available is an important factor that affects online learning behavior and learning outcomes.
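k-prototypes suits questionnaire data because it handles numeric and categorical answers together. The sketch below shows only its mixed dissimilarity measure (squared Euclidean distance on numeric fields plus a weighted count of categorical mismatches), not the full clustering loop and not the authors' code. The field layout, the example record, and the weight `gamma` are hypothetical.

```python
def mixed_dissimilarity(record, prototype, numeric_idx, categorical_idx, gamma=1.0):
    """k-prototypes dissimilarity between one record and one cluster prototype."""
    # Numeric part: squared Euclidean distance over the numeric fields.
    numeric_part = sum((record[i] - prototype[i]) ** 2 for i in numeric_idx)
    # Categorical part: simple matching, one unit per differing category.
    categorical_part = sum(1 for i in categorical_idx if record[i] != prototype[i])
    # gamma balances the influence of the two parts.
    return numeric_part + gamma * categorical_part

# Hypothetical questionnaire record: (hours online per day, quiz score band,
# device used, region type).
student = (2.0, 5.0, "phone", "rural")
prototype = (1.0, 3.0, "laptop", "rural")

distance = mixed_dissimilarity(student, prototype,
                               numeric_idx=[0, 1], categorical_idx=[2, 3],
                               gamma=0.5)
print(distance)  # 5.5
```

Each record is assigned to the prototype with the smallest such dissimilarity; prototypes are then updated with the mean of numeric fields and the mode of categorical ones.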
Beilei He, Weiyi Han, Suet Ying Isabelle Hon
Journal of Data Analysis and Information Processing, Volume 10, pp 1-21;

Predicting stock price movement direction is a challenging problem influenced by different factors and capricious events. Conventional machine learning models for stock price prediction rely heavily on internal financial features, especially the stock price history. However, there are many outside-of-company features that deeply interact with the companies’ stock price performance, especially during the COVID period. In this study, we selected 9 COVID vaccine companies and collected their relevant features over the past 20 months. We added handcrafted external information, including COVID-related statistics and company-specific vaccine progress information. We implemented, evaluated, and compared several machine learning models, including Multilayer Perceptron Neural Networks with logistic regression and decision trees with boosting and bagging algorithms. The results suggest that the application of feature engineering and data mining techniques can effectively enhance the performance of models predicting stock price movement during the COVID period. The results show that COVID-related handcrafted features help to increase the model prediction accuracy by 7.3% and AUROC by 6.5% on average. Further exploration showed that, with data selection, the decision tree model with the gradient boosting algorithm achieved 70% in AUROC and 66% in accuracy.