Latest articles in this journal
AI Open, Volume 2, pp 186-196; https://doi.org/10.1016/j.aiopen.2021.09.003
Knowledge graph (KG) embedding models suffer from the incompleteness of observed facts. Unlike existing solutions that incorporate additional information or employ expressive and complex embedding techniques, we propose to augment KGs by iteratively mining logical rules from the observed facts and then using the rules to generate new relational triples. We incrementally train KG embeddings as new augmented triples arrive, and leverage the embeddings to validate these new triples. To guarantee the quality of the augmented data, we filter out noisy triples based on a propagation mechanism during validation. The mined rules and rule groundings are human-understandable and make the augmentation procedure reliable. Our KG augmentation framework is applicable to any KG embedding model with no need to modify its embedding technique. Our experiments on two popular embedding-based tasks (i.e., entity alignment and link prediction) show that the proposed framework brings significant improvement to existing KG embedding models on most benchmark datasets.
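The rule-based augmentation step described in this abstract can be illustrated with a minimal sketch. It assumes a single chain-shaped rule of the form body1(x, y) AND body2(y, z) => head(x, z); the abstract does not state which rule forms the paper mines, and the function name `apply_chain_rule`, the relation names, and the toy triples are hypothetical. The embedding-based validation and noise filtering are not shown.

```python
from collections import defaultdict


def apply_chain_rule(triples, body1, body2, head):
    """Ground the rule body1(x, y) AND body2(y, z) => head(x, z).

    Returns the inferred triples that are not already observed.
    The rule shape is an assumption made for illustration only.
    """
    observed = set(triples)

    # Index body2 facts by subject so the join on y is a dictionary lookup.
    by_subject = defaultdict(list)
    for s, r, o in observed:
        if r == body2:
            by_subject[s].append(o)

    inferred = set()
    for x, r, y in observed:
        if r != body1:
            continue
        for z in by_subject.get(y, []):
            candidate = (x, head, z)
            if candidate not in observed:
                inferred.add(candidate)
    return inferred


# Toy knowledge graph (hypothetical data).
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "part_of", "Europe"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "part_of", "Europe"),
    ("Berlin", "located_in", "Europe"),  # already observed, so not re-added
]

new_triples = apply_chain_rule(triples, "capital_of", "part_of", "located_in")
print(sorted(new_triples))
```

In a full pipeline of the kind the abstract describes, triples produced this way would still be candidates: they would be scored by the KG embeddings and filtered before being added to the training data.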
AI Open, Volume 2, pp 175-185; https://doi.org/10.1016/j.aiopen.2021.09.002
Jointly extracting entities and relations is often complicated because relational triplets may overlap. In this paper, we propose a novel unified joint extraction model that considers the significant information useful for relation extraction between a pair of entities. We also consider the bidirectional interaction between named entity recognition and relation extraction. To this end, in the encoding part we apply a Bi-LSTM to capture sequential information and a Graph Convolutional Network to capture significant regional information. In the decoding part we use a multi-layer structure, consisting of a first decode layer, an interactive layer, and a final decode layer, to fuse bidirectional interactive information between named entity recognition and relation extraction. In this way, our method can simultaneously extract all entities and their relations, including overlapping relations. Experimental results show that our model outperforms the baseline models on this task and achieves state-of-the-art performance on two public datasets.
We focus on the task of stock market prediction based on financial text, which contains information that could influence the movement of the stock market. Previous works mainly utilize a single semantic unit of financial text, such as words, events, or sentences, to predict the tendency of the stock market. However, the interaction of information at different granularities within financial text can be useful for supplementing context knowledge and selecting predictive information, and can thereby improve the performance of stock market prediction. To facilitate this, we propose constructing a heterogeneous graph whose nodes carry information at different granularities from financial text. A novel heterogeneous neural network is presented to aggregate the multi-grained information. Experimental results demonstrate that our proposed approach achieves higher performance than the baselines.
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI). Owing to sophisticated pre-training objectives and huge model parameters, large-scale PTMs can effectively capture knowledge from massive labeled and unlabeled data. By storing knowledge into huge parameters and fine-tuning on specific tasks, the rich knowledge implicitly encoded in huge parameters can benefit a variety of downstream tasks, which has been extensively demonstrated via experimental verification and empirical analysis. It is now the consensus of the AI community to adopt PTMs as backbone for downstream tasks rather than learning models from scratch. In this paper, we take a deep look into the history of pre-training, especially its special relation with transfer learning and self-supervised learning, to reveal the crucial position of PTMs in the AI development spectrum. Further, we comprehensively review the latest breakthroughs of PTMs. These breakthroughs are driven by the surge of computational power and the increasing availability of data, towards four important directions: designing effective architectures, utilizing rich contexts, improving computational efficiency, and conducting interpretation and theoretical analysis. Finally, we discuss a series of open problems and research directions of PTMs, and hope our view can inspire and advance the future study of PTMs.
AI Open, Volume 2, pp 160-167; https://doi.org/10.1016/j.aiopen.2021.08.001
Graph classification is a highly impactful task that plays a crucial role in a myriad of real-world applications such as molecular property prediction and protein function prediction. Aiming to handle new classes with limited labeled graphs, few-shot graph classification has become a bridge between existing graph classification solutions and practical usage. This work explores the potential of metric-based meta-learning for solving few-shot graph classification. We highlight the importance of considering structural characteristics in the solution and propose a novel framework that explicitly considers the global structure and local structure of the input graph. An implementation based on GIN, named SMF-GIN, is tested on two datasets, Chembl and TRIANGLES, where extensive experiments validate the effectiveness of the proposed method. The Chembl dataset is constructed to fill the gap left by the lack of a large-scale benchmark for few-shot graph classification evaluation, and it is released together with the implementation of SMF-GIN at: https://github.com/jiangshunyu/SMF-GIN.
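As background for the metric-based meta-learning mentioned in this abstract, the sketch below shows one common instance of the idea: nearest-prototype classification over fixed embeddings. This is an assumption chosen for illustration and is not claimed to be the method used in SMF-GIN. The two-dimensional vectors stand in for graph embeddings that a graph encoder would produce, and the names `prototypes` and `classify` are hypothetical.

```python
import math


def prototypes(support):
    """Average the support embeddings of each class into one prototype."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# A 2-way, 2-shot episode with made-up embeddings.
support = {
    "class_a": [[0.0, 0.0], [0.0, 2.0]],
    "class_b": [[4.0, 4.0], [6.0, 4.0]],
}

protos = prototypes(support)
prediction = classify([1.0, 1.5], protos)
print(protos, prediction)
```

Because classification reduces to a distance comparison, a new class can be handled by computing one more prototype from its few labeled examples, without retraining the encoder.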
AI Open, Volume 2, pp 135-142; https://doi.org/10.1016/j.aiopen.2021.07.003
Information propagation models in the Weibo network play a primary role in analyzing user behaviors, obtaining the propagation paths, determining the opinion leaders, and discovering the hot spots of public opinion. Existing research recognizes the critical role played by information propagation models from different aspects. However, few studies have systematically investigated the specific details of information propagation. Spiking neural P (SNP, for short) systems are among the most promising research carriers for information propagation because of their concurrent structures and asynchronous firing rules. This paper proposes a simple and intuitive SNP variant, namely DWIP-SNP, for user behavior analysis in Weibo. The fundamental objects of information propagation in Weibo are represented by a similar SNP formalization. Forwarding, commenting, deleting, and other user behaviors in the Weibo network can thus be observed and processed more intuitively. Then, the DWIP-SNP systems are combined with time delays to indicate the dynamic information diffusion from the perspective of bio-computing systems. Finally, a real-world example of information propagation with a Weibo data set is used to verify the effectiveness and feasibility of the model. The insights into the DWIP-SNP based propagation model gained from this study may assist in understanding user behavior and information propagation in other complex networks.
AI Open, Volume 2, pp 100-126; https://doi.org/10.1016/j.aiopen.2021.06.002
Recommender systems exploit interaction history to estimate user preference and have been heavily used in a wide range of industry applications. However, static recommendation models struggle to answer two important questions well due to inherent shortcomings: (a) What exactly does a user like? (b) Why does a user like an item? The shortcomings are due to the way that static models learn user preference, i.e., without explicit instructions and active feedback from users. The recent rise of conversational recommender systems (CRSs) changes this situation fundamentally. In a CRS, users and the system can dynamically communicate through natural language interactions, which provide unprecedented opportunities to explicitly obtain the exact preference of users. Considerable efforts, spread across disparate settings and applications, have been put into developing CRSs. Yet existing models, technologies, and evaluation methods for CRSs are far from mature. In this paper, we provide a systematic review of the techniques used in current CRSs. We summarize the key challenges of developing CRSs in five directions: (1) question-based user preference elicitation; (2) multi-turn conversational recommendation strategies; (3) dialogue understanding and generation; (4) exploitation-exploration trade-offs; (5) evaluation and user simulation. These research directions involve multiple research fields such as information retrieval (IR), natural language processing (NLP), and human-computer interaction (HCI). Based on these research directions, we discuss some future challenges and opportunities. We provide a road map for researchers from multiple communities to get started in this area. We hope this survey can help to identify and address challenges in CRSs and inspire future research.
AI Open, Volume 2, pp 93-99; https://doi.org/10.1016/j.aiopen.2021.07.001
Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB training data, drew a lot of attention due to the capacity of few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100 GB Chinese training data, is the largest Chinese pre-trained language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in the settings of few-shot (even zero-shot) learning. The code and parameters are available at https://github.com/TsinghuaAI/CPM.
AI Open, Volume 2, pp 127-134; https://doi.org/10.1016/j.aiopen.2021.06.004
Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) by utilizing extra heterogeneous knowledge in knowledge graphs (KGs), and have achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs (“knowledge context”), even though the knowledge required by PLMs may change dynamically according to the specific text (“textual context”). In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Besides the performance improvements, the dynamically selected knowledge in Coke can describe the semantics of text-related knowledge in a more interpretable form than conventional PLMs. Our implementation and datasets are publicly available.
AI Open, Volume 2, pp 69-78; https://doi.org/10.1016/j.aiopen.2021.05.002
Machine learning (ML) technologies have achieved significant success in various downstream tasks, e.g., node classification, link prediction, community detection, graph classification, and graph clustering. However, many studies have shown that models built upon ML technologies are vulnerable to noise and adversarial attacks. A number of works have studied models that are robust against noise or adversarial examples in the image and text processing domains, but it is more challenging to learn robust models in graph domains. Noise or perturbations added to graph data make robustness even harder to achieve, because the noise and perturbations of edges or node attributes easily propagate to neighbors via the relational information on a graph. In this paper, we investigate and summarize the existing works that study deep learning models robust against adversarial attacks or noise on graphs, namely robust learning (models) on graphs. Specifically, we first provide some evaluation metrics of model robustness on graphs. Then, we provide a comprehensive taxonomy that groups robust models on graphs into five categories: anomaly detection, adversarial training, pre-processing, attention mechanism, and certifiable robustness. In addition, we highlight some promising future directions in learning robust models on graphs. We hope that our work can offer insights to researchers in this area and assist their studies.
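The remark in this abstract that perturbations of node attributes propagate to neighbors can be made concrete with a small sketch. It assumes a generic mean-aggregation message-passing step, which is not taken from the survey, and uses a hypothetical three-node path graph a-b-c with scalar features.

```python
def mean_aggregate(features, adjacency):
    """One message-passing step: each node averages itself and its neighbors."""
    updated = {}
    for node, neighbors in adjacency.items():
        group = [node] + neighbors
        updated[node] = sum(features[n] for n in group) / len(group)
    return updated


# Path graph a - b - c (hypothetical toy example).
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
clean = {"a": 1.0, "b": 1.0, "c": 1.0}
noisy = dict(clean, a=4.0)  # perturb only the attribute of node "a"

clean_out = mean_aggregate(clean, adjacency)
noisy_out = mean_aggregate(noisy, adjacency)

# Nodes whose representation differs from the clean run after one step.
changed = sorted(n for n in adjacency if noisy_out[n] != clean_out[n])
print(changed)

# A second step carries the perturbation one hop further.
noisy_out_2 = mean_aggregate(noisy_out, adjacency)
clean_out_2 = mean_aggregate(clean_out, adjacency)
```

After one step the perturbation has reached the direct neighbor of the perturbed node, and each further step extends it by one hop, which is why a single corrupted attribute can affect predictions well beyond the node it was applied to.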