Results in Journal IEICE Transactions on Information and Systems: 6,098

(searched for: journal_id:(711357))
Isana Funahashi, Taichi Yoshida, Xi Zhang, Masahiro Iwahashi
IEICE Transactions on Information and Systems, pp 123-133;

In this paper, we propose an image adjustment method for multi-exposure images based on convolutional neural networks (CNNs). We refer to image regions in multi-exposure images that lack information because of saturation or object motion as lacking areas. Lacking areas cause ghosting artifacts in images fused from sets of multi-exposure images by conventional fusion methods that tackle the artifact. To avoid this problem, the proposed method estimates the information of lacking areas via adaptive inpainting. The proposed CNN consists of three networks: a warp and refinement network, a detection network, and an inpainting network. The second and third networks detect lacking areas and estimate their pixel values, respectively. In the experiments, a simple fusion method combined with the proposed method outperforms state-of-the-art fusion methods in terms of peak signal-to-noise ratio. Moreover, when the proposed method is applied as pre-processing for various fusion methods, the results show a clear reduction in artifacts.
Wenjing Zhang, Peng Song, Wenming Zheng
IEICE Transactions on Information and Systems, pp 184-188;

In this letter, we propose a novel transferable sparse regression (TSR) method for cross-database facial expression recognition (FER). In TSR, we first present a novel regression function to regress the data into a latent representation space instead of a strict binary label space. To further alleviate the influence of outliers and overfitting, we impose a row sparsity constraint on the regression term, and a pairwise relation term is introduced to guide the feature transfer learning. Second, we design a global graph to transfer knowledge, which can well preserve the cross-database manifold structure. Moreover, we introduce a low-rank constraint on the graph regularization term to uncover additional structural information. Finally, several experiments are conducted on three popular facial expression databases, and the results validate that the proposed TSR method is superior to other non-deep and deep transfer learning methods.
Yukasa Murakami, Masateru Tsunoda
IEICE Transactions on Information and Systems, pp 21-25;

Although many software engineering studies have been conducted, it is not clear whether they meet the needs of software development practitioners. Some studies had practitioners evaluate the effectiveness of software engineering research, to clarify whether the research satisfies the needs of practitioners. We performed a replicated study of them, recruiting practitioners who mainly belong to SMEs (small and medium-sized enterprises) for the survey. We asked 16 practitioners to evaluate cutting-edge software engineering studies presented at ICSE 2016. In the survey, we set the viewpoint of the evaluation as the effectiveness for the respondent's own work. As a result, the ratio of positive answers (i.e., the answers were greater than 2 on a 5-point scale) was 33.3%, and the ratio was lower than in past studies. The result was not affected by the number of employees in the respondent's company, but would be affected by the viewpoint of the evaluation.
Keitaro Nakasai, Masateru Tsunoda, Kenichi Matsumoto
IEICE Transactions on Information and Systems, pp 31-36;

Software developers often use a web search engine to improve work efficiency. However, web search strategies (e.g., frequently changing web search keywords) may be different for each developer. In this study, we attempted to define a better web search strategy. Although many previous studies analyzed web search behavior in programming, they did not provide guidelines for web search strategies. To suggest guidelines for web search strategies, we asked 10 subjects to solve four programming questions, and analyzed their behavior. In the analysis, we focused on the subjects' task time and the web search metrics defined by us. Based on our experiment, to enhance the effectiveness of the search, we suggest that (1) one should not go on to subsequent search result pages, (2) the number of keywords in a query should be kept small, and (3) previously used keywords must be avoided when creating a new query.
Jie Tan, Jianmin Pang, Cong Liu
IEICE Transactions on Information and Systems, pp 26-30;

Due to the rapid development of different processors, e.g., x86 and Sunway, software porting between different platforms is becoming more frequent. However, the migrated software's execution efficiency on the target platform is different from that on the source platform, and most previous studies have investigated the improvement of the efficiency from the hardware perspective. To the best of our knowledge, this is the first paper to exclusively focus on studying what software factors can result in performance change after software migration. To perform our study, we used SonarQube to detect and measure five software factors, namely Duplicated Lines (DL), Code Smells Density (CSD), Big Functions (BF), Cyclomatic Complexity (CC), and Complex Functions (CF), from 13 selected projects of the SPEC CPU2006 benchmark suite. Then, we measured the change of software performance by calculating the acceleration ratio of execution time before (x86) and after (Sunway) software migration. Finally, we fitted a multiple linear regression model to analyze the relationship between the software performance change and the software factors. The results indicate that the performance change of software migration from the x86 platform to the Sunway platform is mainly affected by three software factors, i.e., Code Smells Density (CSD), Cyclomatic Complexity (CC), and Complex Functions (CF). The findings can benefit both researchers and practitioners.
Keigo Taga, Junjun Zheng, Koichi Mouri, Shoichi Saito, Eiji Takimoto
IEICE Transactions on Information and Systems, pp 105-115;

A wide range of communication protocols has recently been developed to address service diversification. At the same time, firewalls (FWs) are installed at the boundaries between internal networks, such as those owned by companies and homes, and the Internet. In general, FWs are configured as whitelists and release only the port corresponding to the service to be used and block communication from other ports. In a previous study, we proposed a method for traversing a FW and enabling communication by inserting a pseudo-transmission control protocol (TCP) header imitating HTTPS into a packet, which normally would be blocked by the FW. In that study, we confirmed the efficiency of the proposed method via its implementation and experiments. Even though common encapsulating techniques work on end-nodes, the previous implementation worked on the relay node assuming a router. Further, middleboxes, which overwrite L3 and L4 headers on the Internet, need to be taken into consideration. Accordingly, we re-implemented the proposed method into an end-node and added a feature countering a typical middlebox, i.e., NAPT, into our implementation. In this paper, we describe the functional confirmation and performance evaluations of both versions of the proposed method.
Huy H. Nguyen, Minoru Kuribayashi, Junichi Yamagishi, Isao Echizen
IEICE Transactions on Information and Systems, pp 65-77;

Deep neural networks (DNNs) have achieved excellent performance on several tasks and have been widely applied in both academia and industry. However, DNNs are vulnerable to adversarial machine learning attacks in which noise is added to the input to change the networks' output. Consequently, DNN-based mission-critical applications such as those used in self-driving vehicles have reduced reliability and could cause severe accidents and damage. Moreover, adversarial examples could be used to poison DNN training data, resulting in corruptions of trained models. Besides the need for detecting adversarial examples, correcting them is important for restoring data and system functionality to normal. We have developed methods for detecting and correcting adversarial images that use multiple image processing operations with multiple parameter values. For detection, we devised a statistical-based method that outperforms the feature squeezing method. For correction, we devised a method that uses for the first time two levels of correction. The first level is label correction, with the focus on restoring the adversarial images' original predicted labels (for use in the current task). The second level is image correction, with the focus on both the correctness and quality of the corrected images (for use in the current and other tasks). Our experiments demonstrated that the correction method could correct nearly 90% of the adversarial images created by classical adversarial attacks and affected only about 2% of the normal images.
Daniel Moritz Marutschke, Victor V. Kryssanov, Patricia Brockmann
IEICE Transactions on Information and Systems, pp 2-10;

Global software engineering education faces unique challenges in reflecting, as closely as possible, real-world distributed team development in its various forms. The complex nature of planning, collaborating, and upholding partnerships presents administrative difficulties on top of budgetary constraints. These lead to limited opportunities for students to gain international experience and for researchers to propagate educational and practical insights. This paper presents an empirical view on three different course structures conducted by the same research and educational team over a four-year time span. The courses were managed in Japan and Germany, facing cultural challenges, time-zone differences, language barriers, and heterogeneous and homogeneous team structures, amongst others. Three semesters were carried out before and one during the Covid-19 pandemic. Implications for a recent focus on online education for software engineering education and future directions are discussed. As administrative and institutional differences typically do not guarantee the same number of students on all sides, distributed teams can be (1) balanced, where the number of students on one side is less than double the other, (2) unbalanced, where the number of students on one side is significantly larger than double the other, or (3) one-sided, where one side lacks students altogether. An approach for each of these three course structures is presented and discussed. Empirical analyses and recurring patterns in global software engineering education are reported. In the most recent three global software engineering classes, students were surveyed at the beginning and the end of the semester. The questionnaires ask students to rank how impactful they perceive factors related to global software development such as cultural aspects, team structure, language, and interaction. Results of the shift in mean perception are compared and discussed for each of the three team structures.
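
The three team structures named in the abstract above can be expressed as a small decision rule. The following is only an illustrative sketch: the function name `classify_team_structure`, the `ratio` parameter, and the treatment of the boundary case (exactly double counts as unbalanced) are assumptions made here, since the abstract does not define "significantly larger than double" precisely.

```python
def classify_team_structure(n_side_a: int, n_side_b: int, ratio: float = 2.0) -> str:
    """Classify a two-site distributed course by student counts.

    Hypothetical helper: the threshold `ratio` and the handling of the
    exact-double boundary are assumptions, not taken from the paper.
    """
    if n_side_a < 0 or n_side_b < 0:
        raise ValueError("student counts must be non-negative")
    small, large = sorted((n_side_a, n_side_b))
    if large == 0:
        raise ValueError("at least one side must have students")
    if small == 0:
        # One side lacks students altogether.
        return "one-sided"
    if large < ratio * small:
        # The larger side is less than double the smaller side.
        return "balanced"
    # The larger side is at least double the smaller side (assumed cutoff).
    return "unbalanced"
```

For example, under these assumptions a course with 10 and 15 students would be treated as balanced, and one with 5 and 20 students as unbalanced.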
Jiayi Li, Lin Yang, Junyan Yi, Haichuan Yang, Yuki Todo, Shangce Gao
IEICE Transactions on Information and Systems, pp 189-192;

The differential evolution (DE) algorithm is simple and effective. Since DE was proposed, it has been widely used to solve various complex optimization problems. To further exploit the advantages of DE, we propose a new variant of DE, termed ranking-based differential evolution (RDE), which performs ranking on the population. Progressively better individuals in the population are used for the mutation operation, thus improving the algorithm's exploitation and exploration capability. Experimental results on a number of benchmark optimization functions show that RDE significantly outperforms the original DE and performs competitively in comparison with two other state-of-the-art DE variants.
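
To make the idea of rank-guided mutation concrete, here is a minimal sketch of a DE/rand/1/bin loop in which the individuals used for mutation are drawn with probability proportional to their rank. This is not the authors' RDE: the linear rank weighting, the function name `ranking_de`, and all parameter defaults are assumptions chosen for illustration.

```python
import numpy as np

def sphere(x):
    """Simple benchmark objective: sum of squares, minimum 0 at the origin."""
    return float(np.sum(x ** 2))

def ranking_de(objective, dim=5, pop_size=20, generations=200,
               f_scale=0.5, cr=0.9, bounds=(-5.0, 5.0), seed=0):
    """DE/rand/1/bin with rank-biased parent selection (illustrative only)."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(ind) for ind in pop])
    # Assumed scheme: linear rank weights, the best individual gets the largest.
    weights = np.arange(pop_size, 0, -1, dtype=float)
    for _ in range(generations):
        order = np.argsort(fitness)          # order[0] is the current best
        probs = np.empty(pop_size)
        probs[order] = weights / weights.sum()
        for i in range(pop_size):
            # Draw three distinct parents other than i, favouring good ranks.
            p = probs.copy()
            p[i] = 0.0
            p /= p.sum()
            r1, r2, r3 = rng.choice(pop_size, size=3, replace=False, p=p)
            mutant = np.clip(pop[r1] + f_scale * (pop[r2] - pop[r3]), low, high)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < cr
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            trial_fit = objective(trial)
            # Greedy one-to-one selection, as in classical DE.
            if trial_fit <= fitness[i]:
                pop[i] = trial
                fitness[i] = trial_fit
    best = int(np.argmin(fitness))
    return pop[best], float(fitness[best])
```

Calling `ranking_de(sphere)` returns the best vector found and its objective value. Biasing parent selection toward better ranks raises selection pressure, which is the general trade-off such variants have to balance against diversity.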
Rio Kurokawa, Kazuki Yamato, Madoka Hasegawa
IEICE Transactions on Information and Systems, pp 54-64;

In recent years, several reversible contrast-enhancement methods for color images using digital watermarking have been proposed. These methods can restore an original image from a contrast-enhanced image, in which the information required to recover the original image is embedded with other payloads. In these methods, the hue component after enhancement is similar to that of the original image. However, the saturation of the image after enhancement is significantly lower than that of the original image, and the obtained image exhibits a pale color tone. Herein, we propose a method for enhancing the contrast and saturation of color images while nearly preserving the hue component in a reversible manner. Our method integrates red, green, and blue histograms and preserves the median value of the integrated components. Consequently, the contrast and saturation improved, and the subjective image quality also improved. In addition, we confirmed that the hue component of the enhanced image is similar to that of the original image. We also confirmed that the original image was perfectly restored from the enhanced image. Our method can contribute to the use of digital photography as legal evidence. The required storage space for color images and issues pertaining to evidence management can be reduced because our method enables the images before and after enhancement to be obtained from a single image.
Hyun Kwon
IEICE Transactions on Information and Systems, pp 170-174;

Deep neural networks show good performance in image recognition, speech recognition, and pattern analysis. However, deep neural networks show weaknesses, one of which is vulnerability to backdoor attacks. A backdoor attack performs additional training of the target model on backdoor samples that contain a specific trigger so that normal data without the trigger will be correctly classified by the model, but the backdoor samples with the specific trigger will be incorrectly classified by the model. Various studies on such backdoor attacks have been conducted. However, the existing backdoor attack causes misclassification by one classifier. In certain situations, it may be necessary to carry out a selective backdoor attack on a specific model in an environment with multiple models. In this paper, we propose a multi-model selective backdoor attack method that misleads each model to misclassify samples into a different class according to the position of the trigger. The experiment for this study used MNIST and Fashion-MNIST as datasets and TensorFlow as the machine learning library. The results show that the proposed scheme has a 100% average attack success rate for each model while maintaining 97.1% and 90.9% accuracy on the original samples for MNIST and Fashion-MNIST, respectively.
Weiwei Luo, Wenpeng Zhou, Jinglong Fang, Lingyan Fan
IEICE Transactions on Information and Systems, pp 180-183;

Recently, channel-aware steganography has been presented for high security. The corresponding selection-channel-aware (SCA) detecting algorithms have also been proposed for improving the detection performance. In this paper, we propose a novel detecting algorithm for JPEG steganography, where the embedding probability and block evaluation are integrated into a new probability. This probability can embody the change due to data embedding. We choose the same high-pass filters as maximum diversity cascade filter residual (MD-CFR) to obtain different image residuals, and a weighted histogram method is used to extract detection features. Experimental results on detecting two typical steganographic methods show that the proposed method can improve the performance compared with the state-of-the-art methods.
Tetsuya Kojima, Kento Akimoto
IEICE Transactions on Information and Systems, pp 46-53;

Data hiding techniques are usually applied to digital watermarking or digital fingerprinting, which is used to protect intellectual property rights or to prevent illegal copies of the original works. It has been pointed out that data hiding can be utilized as a communication medium. In conventional digital watermarking frameworks, it is required that the difference between the cover objects and the stego objects is quite small, such that the difference cannot be recognized by human sensory systems. On the other hand, the authors have proposed a ‘hearable’ data hiding technique for audio signals that can carry secret messages and can be naturally recognized as a musical piece by human ears. In this study, we extend the idea of hearable data hiding to video signals by utilizing visual effects. As visual effects, we employ fade-in and fade-out effects, which can be used as a kind of visual rendering for scene transitions. In the proposed schemes, secret messages are generated as one-dimensional barcodes which are used for fade-in or fade-out effects. The present paper shows that the proposed schemes achieve high accuracy in extracting the embedded messages even from video signals captured by smartphones or tablets. It is also shown through subjective evaluation experiments that the video signals conveying the embedded messages can be naturally recognized by human visual systems.
Kenshiro Tamata, Tomohiro Mashita
IEICE Transactions on Information and Systems, pp 134-140;

A typical approach to reconstructing a 3D environment model is scanning the environment with a depth sensor and fitting the accumulated point cloud to 3D models. In this kind of scenario, a general 3D environment reconstruction application assumes temporally continuous scanning. However in some practical uses, this assumption is unacceptable. Thus, a point cloud matching method for stitching several non-continuous 3D scans is required. Point cloud matching often includes errors in the feature point detection because a point cloud is basically a sparse sampling of the real environment, and it may include quantization errors that cannot be ignored. Moreover, depth sensors tend to have errors due to the reflective properties of the observed surface. We therefore make the assumption that feature point pairs between two point clouds will include errors. In this work, we propose a feature description method robust to the feature point registration error described above. To achieve this goal, we designed a deep learning based feature description model that consists of a local feature description around the feature points and a global feature description of the entire point cloud. To obtain a feature description robust to feature point registration error, we input feature point pairs with errors and train the models with metric learning. Experimental results show that our feature description model can correctly estimate whether the feature point pair is close enough to be considered a match or not even when the feature point registration errors are large, and our model can estimate with higher accuracy in comparison to methods such as FPFH or 3DMatch. In addition, we conducted experiments for combinations of input point clouds, including local or global point clouds, both types of point cloud, and encoders.
Akira Tanaka, Masanari Nakamura, Hideyuki Imai
IEICE Transactions on Information and Systems, pp 116-122;

The solution of the ordinary kernel ridge regression, based on the squared loss function and the squared norm-based regularizer, can be easily interpreted as a stochastic linear estimator by considering the autocorrelation prior for an unknown true function. As is well known, a stochastic affine estimator is one of the simplest extensions of the stochastic linear estimator. However, its corresponding kernel regression problem has not been revealed so far. In this paper, we give a formulation of the kernel regression problem whose solution reduces to a stochastic affine estimator, and also give interpretations of the formulation.
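
For orientation, the ordinary kernel ridge regression mentioned above has the well-known closed-form solution alpha = (K + lambda I)^(-1) y, with predictions f(x) = sum_i alpha_i k(x, x_i). The sketch below shows only this standard linear case, not the affine formulation the paper proposes; the Gaussian kernel, the function names, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_krr(x, y, lam=1e-3, gamma=1.0):
    """Solve (K + lam * I) alpha = y, the minimizer of squared loss plus lam * ||f||^2."""
    k = gaussian_kernel(x, x, gamma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def predict_krr(x_train, alpha, x_new, gamma=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) at the rows of x_new."""
    return gaussian_kernel(x_new, x_train, gamma) @ alpha

# Usage on a toy 1-D problem (synthetic inputs, chosen only for illustration).
x = np.linspace(0.0, 1.0, 20)[:, None]
y = np.sin(2.0 * np.pi * x[:, 0])
lam = 1e-3
alpha = fit_krr(x, y, lam=lam, gamma=10.0)
fitted = predict_krr(x, alpha, x, gamma=10.0)
```

Because the prediction is a fixed linear map applied to `y`, the estimator is linear in the observations, which is the property the abstract's "linear estimator" reading relies on.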
Isamu Hasegawa, Tomoyuki Yokogawa
IEICE Transactions on Information and Systems, pp 78-91;

Visual script languages with a node-based interface have commonly been used in the video game industry. We examined the bug database obtained in the development of FINAL FANTASY XV (FFXV), and noticed that several types of bugs were caused by simple mis-descriptions of visual scripts and could therefore be mechanically detected. We propose a method for the automatic verification of visual scripts in order to improve the productivity of video game development. Our method can automatically detect those bugs by using symbolic model checking. We show a translation algorithm which can automatically convert a visual script to an input model for NuSMV, an implementation of symbolic model checking. For a preliminary evaluation, we applied our method to visual scripts used in the production of FFXV. The evaluation results demonstrate that our method can detect bugs in scripts and runs in a reasonable time.
Koji Kamma, Sarimu Inoue, Toshikazu Wada
IEICE Transactions on Information and Systems, pp 161-169;

Pruning is an effective technique to reduce the computational complexity of Convolutional Neural Networks (CNNs) by removing redundant neurons (or weights). There are two types of pruning methods: holistic pruning and layer-wise pruning. The former selects the least important neuron from the entire model and prunes it. The latter conducts pruning layer by layer. Recently, it has turned out that some layer-wise methods are effective for reducing the computational complexity of pruned models while preserving their accuracy. The difficulty of layer-wise pruning is how to adjust the pruning ratio (the ratio of neurons to be pruned) in each layer. Because CNNs typically have many layers composed of many neurons, it is inefficient to tune pruning ratios by hand. In this paper, we present Pruning Ratio Optimizer (PRO), a method that can be combined with layer-wise pruning methods for optimizing pruning ratios. The idea of PRO is to adjust pruning ratios based on how much pruning in each layer affects the outputs of the final layer. In the experiments, we verified the effectiveness of PRO.
Syful Islam, Dong Wang, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto
IEICE Transactions on Information and Systems, pp 11-18;

Third-party package usage has become a common practice in contemporary software development. Developers often face different challenges during software development, including choosing the right libraries, installation errors, discrepancies, setting up the environment, and build failures. The risks of maintaining a third-party package are well known, but it is unclear how information from Stack Overflow (SO) can be useful. This paper reports an empirical study that explores npm package co-usage examples from SO. From over 30,000 SO question posts, we extracted 2,100 posts with package usage information and matched them against the 217,934 npm library packages. We find that popular and highly used libraries are not discussed as often in SO. However, we can see that the accepted answers may prove useful, as we believe that the usage examples and executable commands could be reused for tool support.
Zihao Song, Peng Song, Chao Sheng, Wenming Zheng, Wenjing Zhang, Shaokai Li
IEICE Transactions on Information and Systems, pp 175-179;

Unsupervised feature selection is an important dimensionality reduction technique to cope with high-dimensional data. It does not require prior label information, and has recently attracted much attention. However, it cannot fully utilize the discriminative information of samples, which may affect the feature selection performance. To tackle this problem, in this letter, we propose a novel discriminative virtual label regression method (DVLR) for unsupervised feature selection. In DVLR, we develop a virtual label regression function to guide the subspace learning based feature selection, which can select more discriminative features. Moreover, a linear discriminant analysis (LDA) term is used to make the model more discriminative. To further make the model more robust and select more representative features, we impose the ℓ2,1-norm on the regression and feature selection terms. Finally, extensive experiments are carried out on several public datasets, and the results demonstrate that our proposed DVLR achieves better performance than several state-of-the-art unsupervised feature selection methods.
Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Hiroto Ashikawa, Tetsunori Kobayashi, Tetsuji Ogawa
IEICE Transactions on Information and Systems, pp 150-160;

Most conventional multi-source domain adaptation techniques for recurrent neural network language models (RNNLMs) are domain-centric. In these approaches, each domain is considered independently, and this makes it difficult to apply the models to completely unseen target domains that are unobservable during training. Instead, our study exploits domain attributes, which represent common knowledge among such different domains as dialects, types of wordings, styles, and topics, to achieve domain generalization that can robustly represent unseen target domains by combining the domain attributes. To achieve an attribute-based domain generalization system in language modeling, we introduce domain attribute-based experts to a multi-stream RNNLM called recurrent adaptive mixture model (RADMM) instead of domain-based experts. In the proposed system, a long short-term memory is independently trained on each domain attribute as an expert model. Then, by integrating the outputs from all the experts in response to the context-dependent weight of the domain attributes of the current input, we predict the subsequent words in the unseen target domain and exploit the specific knowledge of each domain attribute. To demonstrate the effectiveness of our proposed domain attribute-centric language model, we experimentally compared the proposed model with a conventional domain-centric language model by using texts taken from multiple domains including different writing styles, topics, dialects, and types of wordings. The experimental results demonstrated that lower perplexity can be achieved using domain attributes.
Kangbo Sun, Jie Zhu
IEICE Transactions on Information and Systems, pp 141-149;

Local discriminative regions play important roles in fine-grained image analysis tasks. How to locate local discriminative regions using only category labels and how to learn discriminative representations from these regions have become active research topics. In our work, we propose the Searching Discriminative Regions (SDR) and Learning Discriminative Regions (LDR) methods to search and learn local discriminative regions in images. The SDR method adopts an attention mechanism to iteratively search for high-response regions in images, and uses this as a clue to locate local discriminative regions. Moreover, the LDR method is proposed to learn, from the raw image and local images, representations that are compact within a category and sparse between categories. Experimental results show that our proposed approach achieves excellent performance in both fine-grained image retrieval and classification tasks, which demonstrates its effectiveness.
Kaiho Fukuchi, Hiroshi Yamada
IEICE Transactions on Information and Systems, pp 92-104;

In infrastructure-as-a-service platforms, cloud users can adjust their database (DB) service scale to dynamic workloads by changing the number of virtual machines running a DB management system (DBMS), called DBMS instances. Replicating a DBMS instance is a non-trivial task since DBMS replication is time-consuming due to the trend that cloud vendors offer high-spec DBMS instances. This paper presents BalenaDB, which performs urgent DBMS replication for handling sudden workload increases. Unlike conventional replication schemes that implicitly assume DBMS replicas are generated on remote machines, BalenaDB generates a warmed-up DBMS replica on an instance running on the local machine where the master DBMS instance runs, by leveraging the master DBMS resources. We prototyped BalenaDB on MySQL 5.6.21, Linux 3.17.2, and Xen 4.4.1. The experimental results show that the time for generating the warmed-up DBMS replica instance on BalenaDB is up to 30× shorter than an existing DBMS instance replication scheme, achieving significantly efficient memory utilization.
Kiyoharu Aizawa
IEICE Transactions on Information and Systems, pp 38-45;

This paper introduces our work on a Movie Map, which will enable users to explore a given city area using 360° videos. Visual exploration of a city is always needed. Nowadays, we are familiar with Google Street View (GSV), which is an interactive visual map. Despite the wide use of GSV, it provides sparse images of streets, which often confuses users and lowers user satisfaction. Forty years ago, a video-based interactive map was created; it is well known as the Aspen Movie Map. A Movie Map uses videos instead of sparse images and seems to improve the user experience dramatically. However, the Aspen Movie Map was based on analog technology, required a huge effort, and was never built again. Thus, we renovate the Movie Map using state-of-the-art technology. We build a new Movie Map system with an interface for exploring cities. The system consists of four stages: acquisition, analysis, management, and interaction. After acquiring 360° videos along streets in target areas, the analysis of videos is almost automatic. Frames of the video are localized on the map, intersections are detected, and videos are segmented. Turning views at intersections are synthesized. By connecting the video segments following the specified movement in an area, we can watch a walking view along a street. The interface allows for easy exploration of a target area. It can also show virtual billboards in the view.
Bodin Chinthanet, Raula Gaikovina Kula, Rodrigo Eliza Zapata, Takashi Ishio, Kenichi Matsumoto, Akinori Ihara
IEICE Transactions on Information and Systems, pp 19-20;

It has become common practice for software projects to adopt third-party dependencies. Developers are encouraged to update any outdated dependency to remain safe from potential threats of vulnerabilities. In this study, we present an approach to help developers determine whether or not vulnerable code is reachable in JavaScript projects. Our prototype, SōjiTantei, is evaluated in two ways: (i) its accuracy when compared to a manual approach and (ii) a larger-scale analysis of 780 clients from 78 security vulnerability cases. The first evaluation shows that SōjiTantei has a high accuracy of 83.3%, with an analysis time of less than a second per client. The second evaluation reveals that 68 of the 78 studied vulnerabilities had at least one clean client. The study shows that automation is promising, with the potential for further improvement.
Miho Yamakura, Ryousei Takano, Akram BEN Ahmed, Midori Sugaya, Hideharu Amano
IEICE Transactions on Information and Systems, pp 2078-2088;

FPGA (Field Programmable Gate Array) based accelerators are attracting significant interest in cloud computing systems. Combining multi-FPGA systems with cloud computing brings a new perspective to reconfigurable computing research. However, the multi-tenancy of a multi-FPGA system has not been fully discussed in previous research. In this paper, we propose a multi-tenant resource management system, named FiC-RM, for a multi-FPGA cloud system. FiC-RM provides users with a set of FPGA resources according to their requirements and allows them to exclusively access FPGA boards and the interconnection network. To achieve this, we propose a placement algorithm, which is key to efficiently sharing the limited resources. We demonstrate that FiC-RM controls a practical-scale multi-FPGA system. Moreover, our simulation study shows that our placement algorithm achieved a 3 to 4% improvement in the average resource usage and a 20-second reduction in the response time, compared to other existing naive algorithms.
Hiroki Okada, Masato Yoshimi, Celimuge Wu, Tsutomu Yoshinaga
IEICE Transactions on Information and Systems, pp 2121-2130;

In this study, we propose a mechanism called adaptive failsoft control to address peak traffic in mobile live streaming, using a chasing playback function. Although a cache system is available to support the chasing playback function for live streaming in a base station and device-to-device communication, the request concentration by highlight scenes influences the traffic load owing to data unavailability. To avoid data unavailability, we adapted two live streaming features: (1) streaming data while switching the video quality, and (2) time variability of the number of requests. The second feature enables a fallback mechanism for the cache system by prioritizing cache eviction and terminating the transfer of cache-missed requests. This paper discusses the simulation results of the proposed mechanism, which adopts a request model appropriate for (a) avoiding peak traffic and (b) maintaining continuity of service.
Tomoya Itsubo, Michihiro Koibuchi, Hideharu Amano, Hiroki Matsutani
IEICE Transactions on Information and Systems, pp 2057-2067;

Since deep learning workloads perform a large number of matrix operations on training data, GPUs (Graphics Processing Units) are efficient especially for the training phase. A cluster of computers each of which equips multiple GPUs can significantly accelerate the deep learning workloads. More specifically, a back-propagation algorithm following a gradient descent approach is used for the training. Although the gradient computation is still a major bottleneck of the training, gradient aggregation and optimization impose both communication and computation overheads, which should also be reduced for further shortening the training time. To address this issue, in this paper, multiple GPUs are interconnected with a PCI Express (PCIe) over 10Gbit Ethernet (10GbE) technology. Since these remote GPUs are interconnected with network switches, gradient aggregation and optimizers (e.g., SGD, AdaGrad, Adam, and SMORMS3) are offloaded to FPGA-based 10GbE switches between remote GPUs; thus, the gradient aggregation and parameter optimization are completed in the network. The proposed FPGA-based 10GbE switches with the four optimizers are implemented on NetFPGA-SUME board. Their resource utilizations are increased by PEs for the optimizers, and they consume up to 56% of the resources. Evaluation results using four remote GPUs connected via the proposed FPGA-based switch demonstrate that these optimizers are accelerated by up to 3.0x and 1.25x compared to CPU and GPU implementations, respectively. Also, the gradient aggregation throughput by the FPGA-based switch achieves up to 98.3% of the 10GbE line rate.
Yuki Kajiwara, Junjun Zheng, Koichi Mouri
IEICE Transactions on Information and Systems, pp 2173-2183;

The number of malware, including variants and new types, is dramatically increasing over the years, posing one of the greatest cybersecurity threats nowadays. To counteract such security threats, it is crucial to detect malware accurately and early enough. The recent advances in machine learning technology have brought increasing interest in malware detection. A number of research studies have been conducted in the field. It is well known that malware detection accuracy largely depends on the training dataset used. Creating a suitable training dataset for efficient malware detection is thus crucial. Different works usually use their own dataset; therefore, a dataset is only effective for one detection method, and strictly comparing several methods using a common training dataset is difficult. In this paper, we focus on how to create a training dataset for efficiently detecting malware. To achieve our goal, the first step is to clarify the information that can accurately characterize malware. This paper concentrates on threads, by treating them as important information for characterizing malware. Specifically, on the basis of the dynamic analysis log from the Alkanet, a system call tracer, we obtain the thread information and classify the thread information processing into four patterns. Then the malware detection is performed using the number of transitions of system calls appearing in the thread as a feature. Our comparative experimental results showed that the primary thread information is important and useful for detecting malware with high accuracy.
Ran Li, Huibiao Zhu, Jiaqi Yin
IEICE Transactions on Information and Systems, pp 2154-2163;

Ceph is an object-based parallel distributed file system that provides excellent performance, reliability, and scalability. Additionally, Ceph provides its Cephx authentication system to authenticate users, so that it can identify users and realize authentication. In this paper, we first model the basic architecture of Ceph using process algebra CSP (Communicating Sequential Processes). With the help of the model checker PAT (Process Analysis Toolkit), we feed the constructed model to PAT and then verify several related properties, including Deadlock Freedom, Data Reachability, Data Write Integrity, Data Consistency and Authentication. The verification results show that the original model cannot cater to the Authentication property. Therefore, we formalize a new model of Ceph where Cephx is adopted. In the light of the new verification results, it can be found that Cephx satisfies all these properties.
Zifen He, Shouye Zhu, Ying Huang, Yinhui Zhang
IEICE Transactions on Information and Systems, pp 2237-2243;

This paper presents a novel method for weakly supervised semantic segmentation of 3D point clouds using a novel graph and edge convolutional neural network (GECNN), targeting point clouds in which only 1% or 10% of the points are labeled. Our general framework facilitates semantic segmentation by encoding both global and local scale features via a parallel graph and edge aggregation scheme. More specifically, global scale graph structure cues of point clouds are captured by a graph convolutional neural network, which is propagated from pairwise affinity representation over the whole graph established in a d-dimensional feature embedding space. We integrate local scale features derived from a dynamic edge feature aggregation convolutional neural network, which allows us to fuse both global and local cues of 3D point clouds. The proposed GECNN model is trained by using a comprehensive objective which consists of incomplete, inexact, self-supervision and smoothness constraints based on partially labeled points. The proposed approach enforces global and local consistency constraints directly on the objective losses. It inherently handles the challenges of segmenting sparse 3D point clouds with limited annotations in a large scale point cloud space. Our experiments on the ShapeNet and S3DIS benchmarks demonstrate the effectiveness of the proposed approach for efficient (within 20 epochs) learning of large scale point cloud semantics despite very limited labels.
Hongcui Wang, Pierre Roussel, Bruce Denby
IEICE Transactions on Information and Systems, pp 2209-2217;

A Silent Speech Interface (SSI) is a sensor-based, Artificial Intelligence (AI) enabled system in which articulation is performed without the use of the vocal cords, resulting in a voice interface that conserves the ambient audio environment, protects private data, and also functions in noisy environments. Though portable SSIs based on ultrasound imaging of the tongue have obtained Word Error Rates rivaling those of acoustic speech recognition, SSIs remain relegated to the laboratory due to stability issues. Indeed, reliable extraction of acoustic features from ultrasound tongue images in real-life situations has proven elusive. Recently, Representation Learning has shown considerable success in learning underlying structure in noisy, high-dimensional raw data. In its unsupervised form, Representation Learning is able to reveal structure in unlabeled data, thus greatly simplifying the data preparation task. In the present article, a 3D Convolutional Neural Network architecture is applied to unlabeled ultrasound images, and is shown to reliably predict future tongue configurations. By comparing the 3DCNN to a simple previous-frame predictor, it is possible to recognize tongue trajectories comprising transitions between regions of stability that correlate with formant trajectories in a spectrogram of the signal. Prospects for using the underlying structural representation to provide features for subsequent speech processing tasks are presented.
Ryoma Senda, Yoshiaki Takata, Hiroyuki Seki
IEICE Transactions on Information and Systems, pp 2131-2144;

A pushdown system (PDS) is known as an abstract model of recursive programs. For PDS, model checking methods have been studied and applied to various software verification such as interprocedural data flow analysis and malware detection. However, PDS cannot manipulate data values from an infinite domain. A register PDS (RPDS) is an extension of PDS by adding registers to deal with data values in a restricted way. This paper proposes algorithms for LTL model checking problems for RPDS with simple and regular valuations, which are labelings of atomic propositions to configurations with reasonable restriction. First, we introduce RPDS and related models, and then define the LTL model checking problems for RPDS. Second, we give algorithms for solving these problems and also show that the problems are EXPTIME-complete. As practical examples, we show solutions of a malware detection and an XML schema checking in the proposed framework.
Wenyi Ge, Yi Lin, Zhitao Wang, Guigui Wang, Shihan Tan
IEICE Transactions on Information and Systems, pp 2218-2225;

In this paper, we present a simple yet powerful deep neural network for natural image dehazing. The proposed method is based on the U-Net architecture, with several design changes to improve it. First, we use Group Normalization in place of Batch Normalization to solve the problem of insufficient batch size due to hardware limitations. Second, we introduce FReLU activation into the U-Net block, which can capture complicated visual layouts with regular convolutions. Experimental results on public benchmarks demonstrate the effectiveness of the modified components. On the SOTS Indoor and Outdoor datasets, it obtains PSNR of 32.23 and 31.64 respectively, which is comparable to state-of-the-art methods. The code will be made publicly available online soon.
Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura
IEICE Transactions on Information and Systems, pp 2195-2208;

Real-time machine speech translation systems mimic human interpreters and translate incoming speech from a source language to the target language in real-time. Such systems can be achieved by performing low-latency processing in ASR (automatic speech recognition) module before passing the output to MT (machine translation) and TTS (text-to-speech synthesis) modules. Although several studies recently proposed sequence mechanisms for neural incremental ASR (ISR), these frameworks have a more complicated training mechanism than the standard attention-based ASR because they have to decide the incremental step and learn the alignment between speech and text. In this paper, we propose attention-transfer ISR (AT-ISR) that learns the knowledge from attention-based non-incremental ASR for a low delay end-to-end speech recognition. ISR comes with a trade-off between delay and performance, so we investigate how to reduce AT-ISR delay without a significant performance drop. Our experiment shows that AT-ISR achieves a comparable performance to the non-incremental ASR when the incremental recognition begins after the speech utterance reaches 25% of the complete utterance length. Additional experiments to investigate the effect of ISR on translation tasks are also performed. The focus is to find the optimum granularity of the output unit. The results reveal that our end-to-end subword-level ISR resulted in the best translation quality with the lowest WER and the lowest uncovered-word rate.
Weijun Liu
IEICE Transactions on Information and Systems, pp 2145-2153;

Computing the Lempel-Ziv Factorization (LZ77) of a string is one of the most important problems in computer science. It is now widely used in many applications such as data compression, text indexing and pattern discovery, and has already become the heart of many file compressors like gzip and 7zip. In this paper, we show a linear time algorithm called Xone for computing the LZ77, which has the same space requirement as BGone, the previous most space-efficient linear time LZ77 factorization algorithm. Xone greatly improves the efficiency of BGone. Experiments show that the two versions of Xone, XoneT and XoneSA, are about 27% and 31% faster than BGoneT and BGoneSA, respectively.
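For readers unfamiliar with the problem, the following is a minimal sketch of what an LZ77 factorization computes. It is a naive quadratic-time illustration, not Xone or BGone, and the function names `lz77_factorize` and `lz77_decode` are hypothetical names chosen for this example.

```python
def lz77_factorize(text):
    """Naive LZ77 factorization (illustrative only; not Xone or BGone).

    Each factor is either a single character (its first occurrence) or a
    (position, length) pair referring to the longest match that starts
    earlier in the text.
    """
    factors = []
    i = 0
    n = len(text)
    while i < n:
        best_len = 0
        best_pos = -1
        # Try every earlier start position and keep the longest match.
        for j in range(i):
            length = 0
            # The match may run past position i (self-overlapping copy).
            while i + length < n and text[j + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len = length
                best_pos = j
        if best_len == 0:
            factors.append(text[i])
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors


def lz77_decode(factors):
    """Rebuild the original string from the factors, copying one character at a time."""
    out = []
    for factor in factors:
        if isinstance(factor, str):
            out.append(factor)
        else:
            pos, length = factor
            for k in range(length):
                out.append(out[pos + k])
    return "".join(out)
```

Linear-time algorithms such as those compared in the abstract produce the same kind of factorization while avoiding the nested scan over earlier positions.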
Ryosuke Kuramochi, Hiroki Nakahara
IEICE Transactions on Information and Systems, pp 2068-2077;

Convolutional neural networks (CNNs) are widely used for image processing tasks in both embedded systems and data centers. In data centers, high accuracy and low latency are desired for various tasks such as image processing of streaming videos. We propose an FPGA-based low-latency CNN inference for randomly wired convolutional neural networks (RWCNNs), whose layer structures are based on random graph models. Because RWCNNs have several convolution layers that have no direct dependencies between them, our architecture can process them efficiently using a pipeline method. At each layer, we need to use the calculation results of multiple layers as the input. We use an FPGA with HBM2 to enable parallel access to the input data with multiple HBM2 channels. We schedule the order of execution of the layers to improve the pipeline efficiency. We build a conflict graph using the scheduling results. Then, we allocate the calculation results of each layer to the HBM2 channels by coloring the graph. Because the pipeline execution needs to be properly controlled, we developed an automatic generation tool for hardware functions. We implemented the proposed architecture on the Alveo U50 FPGA. We investigated a trade-off between latency and recognition accuracy for the ImageNet classification task by comparing the inference performances for different input image sizes. We compared our accelerator with a conventional accelerator for ResNet-50. The results show that our accelerator reduces the latency by 2.21 times. We also obtained 12.6 and 4.93 times better efficiency than CPU and GPU, respectively. Thus, our accelerator for RWCNNs is suitable for low-latency inference.
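The channel allocation step described above can be pictured with a small greedy graph coloring sketch. This is only an illustration under assumptions: the abstract does not state which coloring algorithm the authors use, and the function name `assign_channels`, the layer names, and the conflict sets are hypothetical.

```python
def assign_channels(conflicts, num_channels):
    """Greedy graph coloring: give each layer a channel index so that
    layers joined by a conflict edge never share a channel.

    conflicts maps every layer name to the set of layers it conflicts with.
    Raises ValueError if this greedy order runs out of channels.
    """
    channel_of = {}
    # Color high-degree layers first; break ties by name for determinism.
    order = sorted(conflicts, key=lambda v: (-len(conflicts[v]), v))
    for layer in order:
        used = {channel_of[n] for n in conflicts[layer] if n in channel_of}
        free = [c for c in range(num_channels) if c not in used]
        if not free:
            raise ValueError(f"no free channel for {layer}")
        channel_of[layer] = free[0]
    return channel_of


# Hypothetical conflict graph: conv1's output is read concurrently with
# the outputs of conv2 and conv3, so those pairs need different channels.
example_conflicts = {
    "conv1": {"conv2", "conv3"},
    "conv2": {"conv1"},
    "conv3": {"conv1"},
    "conv4": set(),
}
example_allocation = assign_channels(example_conflicts, num_channels=2)
```

Greedy coloring does not always find the minimum number of colors, so a real allocator may need a different ordering or an exact method when channels are scarce.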
Ruicong Zhi, Caixia Zhou, Junwei Yu, Tingting Li, Ghada Zamzmi
IEICE Transactions on Information and Systems, pp 2184-2194;

Pain is an essential physiological phenomenon of human beings. Accurate assessment of pain is important to develop proper treatment. Although self-report method is the gold standard in pain assessment, it is not applicable to individuals with communicative impairment. Non-verbal pain indicators such as pain related facial expressions and changes in physiological parameters could provide valuable insights for pain assessment. In this paper, we propose a multimodal-based Stream Integrated Neural Network with Different Frame Rates (SINN) that combines facial expression and biomedical signals for automatic pain assessment. The main contributions of this research are threefold. (1) There are four-stream inputs of the SINN for facial expression feature extraction. The variant facial features are integrated with biomedical features, and the joint features are utilized for pain assessment. (2) The dynamic facial features are learned in both implicit and explicit manners to better represent the facial changes that occur during pain experience. (3) Multiple modalities are utilized to identify various pain states, including facial expression and biomedical signals. The experiments are conducted on publicly available pain datasets, and the performance is compared with several deep learning models. The experimental results illustrate the superiority of the proposed model, and it achieves the highest accuracy of 68.2%, which is up to 5% higher than the basic deep learning models on pain assessment with binary classification.
Yuto Jumonji, Hiroshi Yamada
IEICE Transactions on Information and Systems, pp 2164-2172;

Reboot-based recovery is a simple but powerful method to recover applications from failures and unstable states. However, it is challenging to apply reboot-based recovery to a new type of application, in-memory databases (DBs). Unlike with legacy applications, rebooting in-memory DBs loses memory objects including key-value pairs and DB blocks, so they must be restored, causing severe performance degradation after the reboot. This paper presents an approach that allows us to perform reboot-based recovery of in-memory DBs with lower performance degradation. Our key insight is to decouple data content objects from all the memory objects. Our approach treats data items as data content objects, preserves data content objects in memory across reboots, and forces restarted in-memory DBs to attach them. To show the effectiveness of our approach, we implement the idea in two real-world DBs, MyRocks and memcached. The prototypes successfully mitigate performance degradation after their reboot-based recovery.
Akira Jinguji, Shimpei Sato, Hiroki Nakahara
IEICE Transactions on Information and Systems, pp 2040-2047;

Convolutional neural networks (CNNs) have a high recognition rate in image recognition and are used in embedded systems such as smartphones, robots and self-driving cars. Low-end FPGAs are candidates for embedded image recognition platforms because they achieve real-time performance at a low cost. However, a CNN has a large number of parameters called weights and internal data called feature maps, which pose a challenge for FPGAs in terms of performance and memory capacity. To solve these problems, we exploit a split-CNN and weight sparseness. The split-CNN reduces the memory footprint by splitting the feature map into smaller patches and allows the feature map to be stored in the FPGA's high-throughput on-chip memory. Weight sparseness reduces computational costs and achieves even higher performance. We designed a dedicated architecture of a sparse CNN and a memory buffering scheduling for a split-CNN and implemented this on the PYNQ-Z1 FPGA board with a low-end FPGA. An experiment on classification using VGG16 shows that our implementation is 3.1 times faster than the GPU, and 5.4 times faster than an existing FPGA implementation.
Koki Higashi, Yoichi Ishiwata, Takeshi Ohkawa, Midori Sugaya
IEICE Transactions on Information and Systems, pp 2097-2108;

Recently, edge servers located closer than the cloud have become expected for the purpose of processing the large amount of sensor data generated by IoT devices such as robots. Research has been proposed to improve responsiveness as a cache server by applying KVS (Key-Value Store) to the edge as a method for obtaining high responsiveness. Above all, a hybrid-KVS server that uses both DRAM and NVMM (Non-Volatile Main Memory) devices is expected to achieve both responsiveness and reliability. However, its effectiveness has not been verified in actual applications, and its effectiveness is not clear in terms of its relationship with the cloud. The purpose of this study is to evaluate the effectiveness of hybrid-KVS servers using the SLAM (Simultaneous Localization and Mapping), which is a widely used application in robots and autonomous driving. It is appropriate for applying an edge server and requires responsiveness and reliability. SLAM is generally implemented on ROS (Robot Operating System) middleware and communicates with the server through ROS middleware. However, if we use hybrid-KVS on the edge with the SLAM and ROS, the communication could not be achieved since the message objects are different from the format expected by KVS. Therefore, in this research, we propose a mechanism to apply the ROS memory object to hybrid-KVS by designing and implementing the data serialization function to extend ROS. As a result of the proposed fogcached-ros and evaluation, we confirm the effectiveness of low API overhead, support for data used by SLAM, and low latency difference between the edge and cloud.
Uuganbayar Ganbold, Junya Sato, Takuya Akashi
IEICE Transactions on Information and Systems, pp 2226-2236;

Horizon detection is useful in maritime image processing for various purposes, such as estimation of camera orientation, registration of consecutive frames, and restriction of the object search region. Existing horizon detection methods are based on edge extraction. For accuracy, they use multiple images, which are filtered with different filter sizes. However, this increases the processing time. In addition, these methods are not robust to blurring. Therefore, we developed a horizon detection method without extracting the candidates from the edge information by formulating the horizon detection problem as a global optimization problem. A horizon line in an image plane was represented by two parameters, which were optimized by an evolutionary algorithm (genetic algorithm). Thus, the local and global features of a horizon were concurrently utilized in the optimization process, which was accelerated by applying a coarse-to-fine strategy. As a result, we could detect the horizon line on high-resolution maritime images in about 50 ms. The performance of the proposed method was tested on 49 videos of the Singapore marine dataset and the Buoy dataset, which contain over 16000 frames under different scenarios. Experimental results show that the proposed method can achieve higher accuracy than state-of-the-art methods.
Kazuichi Oe, Takeshi Nanri
IEICE Transactions on Information and Systems, pp 2109-2120;

Hybrid storage techniques are useful methods to improve the cost performance for input-output (IO) intensive workloads. These techniques choose areas of concentrated IO accesses and migrate them to an upper tier to extract as much performance as possible through greater use of upper tier areas. Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system situated between non-volatile memories (NVMs) and solid-state drives (SSDs). ATSMF aims to reduce the average response time for IO accesses by migrating areas of concentrated IO access from an SSD to an NVM. When a concentrated IO access finishes, the system migrates these areas from the NVM back to the SSD. Unfortunately, the published ATSMF implementation temporarily consumes much NVM capacity upon migrating concentrated IO access areas to NVM, because its algorithm executes NVM migration with high priority. As a result, it often delays evicting areas in which IO concentrations have ended to the SSD. Therefore, to reduce the consumption of NVM while maintaining the average response time, we developed new techniques for making ATSMF more practical. The first is a queue handling technique based on the number of IO accesses for NVM migration and eviction. The second is an eviction method that selects only write-accessed partial regions in finished areas. The third is a technique for variable eviction timing to balance the NVM consumption and average response time. Experimental results indicate that the average response times of the proposed ATSMF are almost the same as those of the published ATSMF, while the NVM consumption is three times lower in the best case.
Kouki Ozawa, Takahiro Hirofuchi, Ryousei Takano, Midori Sugaya
IEICE Transactions on Information and Systems, pp 2089-2096;

With the development of IoT devices and sensors, edge computing is leading towards new services like autonomous cars and smart cities. Low-latency data access is an essential requirement for such services, and a large-capacity cache server is needed on the edge side. However, it is not realistic to build a large capacity cache server using only DRAM because DRAM is expensive and consumes substantially large power. A hybrid main memory system is promising to address this issue, in which main memory consists of DRAM and non-volatile memory. It achieves a large capacity of main memory within the power supply capabilities of current servers. In this paper, we propose Fogcached, that is, the extension of a widely-used KVS (Key-Value Store) server program (i.e., Memcached) to exploit both DRAM and non-volatile main memory (NVMM). We used Intel Optane DCPM as NVMM for its prototype. Fogcached implements a Dual-LRU (Least Recently Used) mechanism that seamlessly extends the memory management of Memcached to hybrid main memory. Fogcached reuses the segmented LRU of Memcached to manage cached objects in DRAM, adds another segmented LRU for those in DCPM and bridges the LRUs by a mechanism to automatically replace cached objects between DRAM and DCPM. Cached objects are autonomously moved between the two memory devices according to their access frequencies. Through experiments, we confirmed that Fogcached improved the peak value of a latency distribution by about 40% compared to Memcached.
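As a rough illustration of the two-tier idea described above, the sketch below keeps one LRU list per memory tier and moves objects between them. It is a toy model under assumptions, not Fogcached's implementation: the class name `TwoTierLRU` is hypothetical, plain (non-segmented) LRU lists stand in for Memcached's segmented LRU, and objects are promoted on every hit rather than according to access frequency.

```python
from collections import OrderedDict


class TwoTierLRU:
    """Toy two-tier LRU cache: a small fast tier backed by a larger slow tier."""

    def __init__(self, fast_capacity, slow_capacity):
        self.fast = OrderedDict()  # stands in for DRAM
        self.slow = OrderedDict()  # stands in for NVMM
        self.fast_capacity = fast_capacity
        self.slow_capacity = slow_capacity

    def _insert_fast(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        # Demote the least recently used fast-tier entry when over capacity.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value
            self.slow.move_to_end(old_key)
            # Drop the oldest slow-tier entry once that tier is also full.
            while len(self.slow) > self.slow_capacity:
                self.slow.popitem(last=False)

    def set(self, key, value):
        self.slow.pop(key, None)  # avoid keeping a stale copy in the slow tier
        self._insert_fast(key, value)

    def get(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.slow:
            # Promote to the fast tier on access (a simplification).
            value = self.slow.pop(key)
            self._insert_fast(key, value)
            return value
        return None
```

In this model an object leaves the cache only after falling out of both tiers, which is how the slow tier extends the effective capacity of the fast one.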
Koki Honda, Kaijie Wei, Masatoshi Arai, Hideharu Amano
IEICE Transactions on Information and Systems, pp 2048-2056;

Automobile companies have been trying to replace side mirrors of cars with small cameras for reducing air resistance. It enables us to apply some image processing to improve the quality of the image. Contrast Limited Adaptive Histogram Equalization (CLAHE) is one such technique to improve the quality of the image for the side mirror camera, but it requires high computational performance. Here, an implementation method of CLAHE on a low-end FPGA board by high-level synthesis is proposed. CLAHE has two main processing parts: cumulative distribution function (CDF) generation, and bilinear interpolation. During the CDF generation, the effect of increasing loop initiation interval can be greatly reduced by placing multiple Processing Elements (PEs), and during the interpolation, latency and BRAM usage were reduced by revising how the CDF is held and the calculation method. Finally, by connecting each module with streaming interfaces, using data flow pragmas, overlapping processing, and hiding data transfer, our HLS implementation achieved a comparable result to that of HDL. We parameterized the components of the algorithm so that the number of tiles and the size of the image can be easily changed. The source code for this research can be downloaded from
Sang-Hoon Kim
IEICE Transactions on Information and Systems, pp 2244-2247;

There have been increasing demands for distributed operating systems to better utilize scattered resources over multiple nodes. This paper highlights the challenges and requirements for the communication layers for distributed operating systems, and makes a case for a versatile, high-performance communication layer over an InfiniBand network.
Kohei Ito, Kensuke Iizuka, Kazuei Hironaka, Yao Hu, Michihiro Koibuchi, Hideharu Amano
IEICE Transactions on Information and Systems, pp 2029-2039;

Multi-FPGA systems have gained attention because of their high performance and power efficiency. A multi-FPGA system called Flow-in-Cloud (FiC) is currently being developed as an accelerator of multi-access edge computing (MEC). FiC consists of multiple mid-range FPGAs tightly connected by high-speed serial links. Since time-critical jobs are assumed in MEC, a circuit-switched network with static time-division multiplexing (STDM) switches has been implemented on FiC. This paper investigates techniques of enhancing the interconnection performance of FiC. Unlike switching fabrics for Network on Chips or parallel machines, economical multi-FPGA systems, such as FiC, use Xilinx Aurora IP and FireFly cables with multiple lanes. We adopted the link aggregation and the slot distribution for using multiple lanes. To mitigate the bottleneck between an STDM switch and user logic, we also propose a multi-ejection STDM switch. We evaluated various combinations of our techniques by using three practical applications on an FiC prototype with 24 boards. When the number of slots is large and transferred data size is small, the slot distribution was sometimes more effective, while the link aggregation was superior in most other cases. Our multi-ejection STDM switch mitigated the bottleneck in ejection ports and successfully reduced the number of time slots. As a result, by combining the link aggregation and multi-ejection STDM switch, communication performance improved up to 7.50 times with few additional resources. Although the performance of the fast Fourier transform with the highest communication ratio could not be enhanced by using multiple boards when a single lane was used, 1.99 times performance improvement was achieved by using 8 boards with four lanes and our multi-ejection switch compared with a single board.
Kang Woo Cho, Byeong-Gyu Jeong, Sang Uk Shin
IEICE Transactions on Information and Systems, pp 1857-1868;

The continuous development of the mobile computing environment has led to the emergence of fintech to enable convenient financial transactions in this environment. Previously proposed financial identity services mostly adopted centralized servers that are prone to single-point-of-failure problems and performance bottlenecks. Blockchain-based self-sovereign identity (SSI), which emerged to address this problem, is a technology that solves centralized problems and allows decentralized identification. However, the verifiable credential (VC), a unit of SSI data transactions, guarantees unlimited right to erasure for self-sovereignty. This does not suit the specificity of the financial transaction network, which requires the restriction of the right to erasure for credit evaluation. This paper proposes a model for VC generation and revocation verification for credit scoring data. The proposed model includes double zero knowledge - succinct non-interactive argument of knowledge (zk-SNARK) proof in the VC generation process between the holder and the issuer. In addition, cross-revocation verification takes place between the holder and the verifier. As a result, the proposed model builds a trust platform among the holder, issuer, and verifier while maintaining the decentralized SSI attributes and focusing on the VC life cycle. The model also improves the way in which credit evaluation data are processed as VCs by granting opt-in and the special right to erasure.