Refine Search

New Search

Results: 1,353

(searched for: publisher_group_id:10089)
Save to Scifeed
Page of 28
Articles per Page
Show export options
  Select all
Journal of Artificial Intelligence Research, Volume 32, pp 663-704;

Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
Journal of Artificial Intelligence Research, Volume 37, pp 279-328;

The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl’s belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), that combines both iteration and bounded inference. Algorithm IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that allowed connections with approximate algorithms from statistical physics and is shown empirically to surpass the performance of mini-clustering and belief propagation, as well as a number of other state-of-the-art algorithms on several classes of networks. We also provide insight into the accuracy of iterative BP and IJGP by relating these algorithms to well known classes of constraint propagation schemes.
Georgios Amanatidis, Georgios Birmpas, Aris Filos-Ratsikas, Alexandros A. Voudouris
Journal of Artificial Intelligence Research, Volume 74, pp 227–261-227–261;

We consider the One-Sided Matching problem, where n agents have preferences over n items, and these preferences are induced by underlying cardinal valuation functions. The goal is to match every agent to a single item so as to maximize the social welfare. Most of the related literature, however, assumes that the values of the agents are not a priori known, and only access to the ordinal preferences of the agents over the items is provided. Consequently, this incomplete information leads to loss of efficiency, which is measured by the notion of distortion. In this paper, we further assume that the agents can answer a small number of queries, allowing us partial access to their values. We study the interplay between elicited cardinal information (measured by the number of queries per agent) and distortion for One-Sided Matching, as well as a wide range of well-studied related problems. Qualitatively, our results show that with a limited number of queries, it is possible to obtain significant improvements over the classic setting, where only access to ordinal information is given.
Sandhya Saisubramanian, Ece Kamar, Shlomo Zilberstein
Journal of Artificial Intelligence Research, Volume 74, pp 143-177;

Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent’s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent’s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent’s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.
Khoi D. Hoang, Ferdinando Fioretto, Ping Hou, William Yeoh, Makoto Yokoo, Roie Zivan
Journal of Artificial Intelligence Research, Volume 74, pp 179-225;

The Distributed Constraint Optimization Problem (DCOP) formulation is a powerful tool for modeling multi-agent coordination problems. To solve DCOPs in a dynamic environment, Dynamic DCOPs (D-DCOPs) have been proposed to model the inherent dynamism present in many coordination problems. D-DCOPs solve a sequence of static problems by reacting to changes in the environment as the agents observe them. Such reactive approaches ignore knowledge about future changes of the problem. To overcome this limitation, we introduce Proactive Dynamic DCOPs (PD-DCOPs), a novel formalism to model D-DCOPs in the presence of exogenous uncertainty. In contrast to reactive approaches, PD-DCOPs are able to explicitly model possible changes of the problem and take such information into account when solving the dynamically changing problem in a proactive manner. The additional expressivity of this formalism allows it to model a wider variety of distributed optimization problems. Our work presents both theoretical and practical contributions that advance current dynamic DCOP models: (i) We introduce Proactive Dynamic DCOPs (PD-DCOPs), which explicitly model how the DCOP will change over time; (ii) We develop exact and heuristic algorithms to solve PD-DCOPs in a proactive manner; (iii) We provide theoretical results about the complexity of this new class of DCOPs; and (iv) We empirically evaluate both proactive and reactive algorithms to determine the trade-offs between the two classes. The final contribution is important as our results are the first that identify the characteristics of the problems that the two classes of algorithms excel in.
Bhaskar Ray Chaudhury, Yun Kuen Cheung, Jugal Garg, Naveen Garg, Martin Hoefer, Kurt Mehlhorn
Journal of Artificial Intelligence Research, Volume 74, pp 111-142;

We study the fair and efficient allocation of a set of indivisible goods among agents, where each good has several copies, and each agent has an additively separable concave valuation function with a threshold. These valuations capture the property of diminishing marginal returns, and they are more general than the well-studied case of additive valuations. We present a polynomial-time algorithm that approximates the optimal Nash social welfare (NSW) up to a factor of e1/e ≈ 1.445. This matches with the state-of-the-art approximation factor for additive valuations. The computed allocation also satisfies the popular fairness guarantee of envy-freeness up to one good (EF1) up to a factor of 2 + ε. For instances without thresholds, it is also approximately Pareto-optimal. For instances satisfying a large market property, we show an improved approximation factor. Lastly, we show that the upper bounds on the optimal NSW introduced in Cole and Gkatzelis (2018) and Barman et al. (2018) have the same value.
Lindsay Weinberg
Journal of Artificial Intelligence Research, Volume 74, pp 75-109;

This survey article assesses and compares existing critiques of current fairness-enhancing technical interventions in machine learning (ML) that draw from a range of non-computing disciplines, including philosophy, feminist studies, critical race and ethnic studies, legal studies, anthropology, and science and technology studies. It bridges epistemic divides in order to offer an interdisciplinary understanding of the possibilities and limits of hegemonic computational approaches to ML fairness for producing just outcomes for society’s most marginalized. The article is organized according to nine major themes of critique wherein these different fields intersect: 1) how "fairness" in AI fairness research gets defined; 2) how problems for AI systems to address get formulated; 3) the impacts of abstraction on how AI tools function and its propensity to lead to technological solutionism; 4) how racial classification operates within AI fairness research; 5) the use of AI fairness measures to avoid regulation and engage in ethics washing; 6) an absence of participatory design and democratic deliberation in AI fairness considerations; 7) data collection practices that entrench “bias,” are non-consensual, and lack transparency; 8) the predatory inclusion of marginalized groups into AI systems; and 9) a lack of engagement with AI’s long-term social and ethical outcomes. Drawing from these critiques, the article concludes by imagining future ML fairness research directions that actively disrupt entrenched power dynamics and structural injustices in society.
Sriram Ganapathi Subramanian, Matthew E. Taylor, Kate Larson, Mark Crowley
Journal of Artificial Intelligence Research, Volume 74, pp 1-74;

In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable policies, that need to be overcome before wide-spread deployment is possible. However, many real-world environments already, in practice, deploy sub-optimal or heuristic approaches for generating policies. An interesting question that arises is how to best use such approaches as advisors to help improve reinforcement learning in multi-agent domains. In this paper, we provide a principled framework for incorporating action recommendations from online suboptimal advisors in multi-agent settings. We describe the problem of ADvising Multiple Intelligent Reinforcement Agents (ADMIRAL) in nonrestrictive general-sum stochastic game environments and present two novel Q-learning based algorithms: ADMIRAL - Decision Making (ADMIRAL-DM) and ADMIRAL - Advisor Evaluation (ADMIRAL-AE), which allow us to improve learning by appropriately incorporating advice from an advisor (ADMIRAL-DM), and evaluate the effectiveness of an advisor (ADMIRAL-AE). We analyze the algorithms theoretically and provide fixed point guarantees regarding their learning in general-sum stochastic games. Furthermore, extensive experiments illustrate that these algorithms: can be used in a variety of environments, have performances that compare favourably to other related baselines, can scale to large state-action spaces, and are robust to poor advice from advisors.
Carlos Escolano, Marta Ruiz Costa-Jussà, José A. R. Fonollosa
Journal of Artificial Intelligence Research, Volume 73, pp 1535-1552;

State-of-the-art multilingual machine translation relies on a shared encoder-decoder. In this paper, we propose an alternative approach based on language-specific encoder-decoders, which can be easily extended to new languages by learning their corresponding modules. To establish a common interlingua representation, we simultaneously train N initial languages. Our experiments show that the proposed approach improves over the shared encoder-decoder for the initial languages and when adding new languages, without the need to retrain the remaining modules. All in all, our work closes the gap between shared and language-specific encoder-decoders, advancing toward modular multilingual machine translation systems that can be flexibly extended in lifelong learning settings.
Stylianos Loukas Vasileiou, William Yeoh, Tran Cao Son, Ashwin Kumar, Michael Cashmore, Dianele Magazzeni
Journal of Artificial Intelligence Research, Volume 73, pp 1473-1534;

In human-aware planning systems, a planning agent might need to explain its plan to a human user when that plan appears to be non-feasible or sub-optimal. A popular approach, called model reconciliation, has been proposed as a way to bring the model of the human user closer to the agent’s model. To do so, the agent provides an explanation that can be used to update the model of human such that the agent’s plan is feasible or optimal to the human user. Existing approaches to solve this problem have been based on automated planning methods and have been limited to classical planning problems only. In this paper, we approach the model reconciliation problem from a different perspective, that of knowledge representation and reasoning, and demonstrate that our approach can be applied not only to classical planning problems but also hybrid systems planning problems with durative actions and events/processes. In particular, we propose a logic-based framework for explanation generation, where given a knowledge base KBa (of an agent) and a knowledge base KBh (of a human user), each encoding their knowledge of a planning problem, and that KBa entails a query q (e.g., that a proposed plan of the agent is valid), the goal is to identify an explanation ε ⊆ KBa such that when it is used to update KBh, then the updated KBh also entails q. More specifically, we make the following contributions in this paper: (1) We formally define the notion of logic-based explanations in the context of model reconciliation problems; (2) We introduce a number of cost functions that can be used to reflect preferences between explanations; (3) We present algorithms to compute explanations for both classical planning and hybrid systems planning problems; and (4) We empirically evaluate their performance on such problems. Our empirical results demonstrate that, on classical planning problems, our approach is faster than the state of the art when the explanations are long or when the size of the knowledge base is small (e.g., the plans to be explained are short). They also demonstrate that our approach is efficient for hybrid systems planning problems. Finally, we evaluate the real-world efficacy of explanations generated by our algorithms through a controlled human user study, where we develop a proof-of-concept visualization system and use it as a medium for explanation communication.
Fajri Koto, Timothy Baldwin,
Journal of Artificial Intelligence Research, Volume 73;

In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences). We construct a novel dataset for focus, coverage, and inter-sentential coherence, and develop automatic methods for evaluating each of the four dimensions of FFCI based on cross-comparison of evaluation metrics and model-based evaluation methods, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores derived from 19 pre-trained language models. We then apply the developed metrics in evaluating a broad range of summarization models across two datasets, with some surprising findings.
Taha Belkhouja, Janardhan Rao Doppa
Journal of Artificial Intelligence Research, Volume 73, pp 1435-1471;

Time-series data arises in many real-world applications (e.g., mobile health) and deep neural networks (DNNs) have shown great success in solving them. Despite their success, little is known about their robustness to adversarial attacks. In this paper, we propose a novel adversarial framework referred to as Time-Series Attacks via STATistical Features (TSA-STAT). To address the unique challenges of time-series domain, TSA-STAT employs constraints on statistical features of the time-series data to construct adversarial examples. Optimized polynomial transformations are used to create attacks that are more effective (in terms of successfully fooling DNNs) than those based on additive perturbations. We also provide certified bounds on the norm of the statistical features for constructing adversarial examples. Our experiments on diverse real-world benchmark datasets show the effectiveness of TSA-STAT in fooling DNNs for time-series domain and in improving their robustness.
Manon Prédhumeau, Lyuba Mancheva, Julie Dugdale, Anne Spalanzani
Journal of Artificial Intelligence Research, Volume 73, pp 1385-1433;

This paper addresses modeling and simulating pedestrian trajectories when interacting with an autonomous vehicle in a shared space. Most pedestrian–vehicle interaction models are not suitable for predicting individual trajectories. Data-driven models yield accurate predictions but lack generalizability to new scenarios, usually do not run in real time and produce results that are poorly explainable. Current expert models do not deal with the diversity of possible pedestrian interactions with the vehicle in a shared space and lack microscopic validation. We propose an expert pedestrian model that combines the social force model and a new decision model for anticipating pedestrian–vehicle interactions. The proposed model integrates different observed pedestrian behaviors, as well as the behaviors of the social groups of pedestrians, in diverse interaction scenarios with a car. We calibrate the model by fitting the parameters values on a training set. We validate the model and evaluate its predictive potential through qualitative and quantitative comparisons with ground truth trajectories. The proposed model reproduces observed behaviors that have not been replicated by the social force model and outperforms the social force model at predicting pedestrian behavior around the vehicle on the used dataset. The model generates explainable and real-time trajectory predictions. Additional evaluation on a new dataset shows that the model generalizes well to new scenarios and can be applied to an autonomous vehicle embedded prediction.
Myrl G. Marmarelis, Greg Ver Steeg, Aram Galstyan
Journal of Artificial Intelligence Research, Volume 73;

A multivariate Hawkes process enables self- and cross-excitations through a triggering matrix that behaves like an asymmetrical covariance structure, characterizing pairwise interactions between the event types. Full-rank estimation of all interactions is often infeasible in empirical settings. Models that specialize on a spatiotemporal application alleviate this obstacle by exploiting spatial locality, allowing the dyadic relationships between events to depend only on separation in time and relative distances in real Euclidean space. Here we generalize this framework to any multivariate Hawkes process, and harness it as a vessel for embedding arbitrary event types in a hidden metric space. Specifically, we propose a Hidden Hawkes Geometry (HHG) model to uncover the hidden geometry between event excitations in a multivariate point process. The low dimensionality of the embedding regularizes the structure of the inferred interactions. We develop a number of estimators and validate the model by conducting several experiments. In particular, we investigate regional infectivity dynamics of COVID-19 in an early South Korean record and recent Los Angeles confirmed cases. By additionally performing synthetic experiments on short records as well as explorations into options markets and the Ebola epidemic, we demonstrate that learning the embedding alongside a point process uncovers salient interactions in a broad range of applications.
Mirosław Krzyśko, Łukasz Smaga, Piotr Kokoszka
Journal of Artificial Intelligence Research, Volume 73, pp 1355-1384;

We study the pairwise and mutual independence testing problem for multivariate functional data. Using a basis representation of functional data, we reduce this problem to testing the independence of multivariate data, which may be high-dimensional. For pairwise independence, we apply tests based on distance and Hilbert-Schmidt covariances as well as their marginal versions, which aggregate these covariances for coordinates of random processes. In the case of mutual independence, we study asymmetric and symmetric aggregating measures of pairwise dependence. A theoretical justification of the test procedures is established. In extensive simulation studies and examples based on a real economic data set, we investigate and compare the performance of the tests in terms of size control and power. An important finding is that tests based on distance and Hilbert-Schmidt covariances are usually more powerful than their marginal versions under linear dependence, while the reverse is true under non-linear dependence.
Rory Bunker, Teo Susnjak
Journal of Artificial Intelligence Research, Volume 73, pp 1285-1322;

Predicting the results of matches in sport is a challenging and interesting task. In this paper, we review a selection of studies from 1996 to 2019 that used machine learning for predicting match results in team sport. Considering both invasion sports and striking/fielding sports, we discuss commonly applied machine learning algorithms, as well as common approaches related to data and evaluation. Our study considers accuracies that have been achieved across different sports, and explores whether evidence exists to support the notion that outcomes of some sports may be inherently more difficult to predict. We also uncover common themes of future research directions and propose recommendations for future researchers. Although there remains a lack of benchmark datasets (apart from in soccer), and the differences between sports, datasets and features makes between-study comparisons difficult, as we discuss, it is possible to evaluate accuracy performance in other ways. Artificial Neural Networks were commonly applied in early studies, however, our findings suggest that a range of models should instead be compared. Selecting and engineering an appropriate feature set appears to be more important than having a large number of instances. For feature selection, we see potential for greater inter-disciplinary collaboration between sport performance analysis, a sub-discipline of sport science, and machine learning.
Ruben Becker, Gianlorenzo D’Angelo, Sajjad Ghobadi, Hugo Gilbert
Journal of Artificial Intelligence Research, Volume 73, pp 1251-1283;

The influence maximization paradigm has been used by researchers in various fields in order to study how information spreads in social networks. While previously the attention was mostly on efficiency, more recently fairness issues have been taken into account in this scope. In the present paper, we propose to use randomization as a mean for achieving fairness. While this general idea is not new, it has not been applied in this area. Similar to previous works like Fish et al. (WWW ’19) and Tsang et al. (IJCAI ’19), we study the maximin criterion for (group) fairness. In contrast to their work however, we model the problem in such a way that, when choosing the seed sets, probabilistic strategies are possible rather than only deterministic ones. We introduce two different variants of this probabilistic problem, one that entails probabilistic strategies over nodes (node-based problem) and a second one that entails probabilistic strategies over sets of nodes (set-based problem). After analyzing the relation between the two probabilistic problems, we show that, while the original deterministic maximin problem was inapproximable, both probabilistic variants permit approximation algorithms that achieve a constant multiplicative factor of 1 − 1/e minus an additive arbitrarily small error that is due to the simulation of the information spread. For the node-based problem, the approximation is achieved by observing that a polynomial-sized linear program approximates the problem well. For the set-based problem, we show that a multiplicative-weight routine can yield the approximation result. For an experimental study, we provide implementations of multiplicative-weight routines for both the set-based and the node-based problems and compare the achieved fairness values to existing methods. Maybe non-surprisingly, we show that the ex-ante values, i.e., minimum expected value of an individual (or group) to obtain the information, of the computed probabilistic strategies are significantly larger than the (ex-post) fairness values of previous methods. This indicates that studying fairness via randomization is a worthwhile path to follow. Interestingly and maybe more surprisingly, we observe that even the ex-post fairness values, i.e., fairness values of sets sampled according to the probabilistic strategies computed by our routines, dominate over the fairness achieved by previous methods on many of the instances tested.
Yoshihiko Ozaki, Yuki Tanigaki, Shuhei Watanabe, Masahiro Nomura, Masaki Onishi
Journal of Artificial Intelligence Research, Volume 73, pp 1209-1250;

Practitioners often encounter challenging real-world problems that involve a simultaneous optimization of multiple objectives in a complex search space. To address these problems, we propose a practical multiobjective Bayesian optimization algorithm. It is an extension of the widely used Tree-structured Parzen Estimator (TPE) algorithm, called Multiobjective Tree-structured Parzen Estimator (MOTPE). We demonstrate that MOTPE approximates the Pareto fronts of a variety of benchmark problems and a convolutional neural network design problem better than existing methods through the numerical results. We also investigate how the configuration of MOTPE affects the behavior and the performance of the method and the effectiveness of asynchronous parallelization of the method based on the empirical results.
Erkut Erdem, Menekse Kuyu, Semih Yagcioglu, Anette Frank, Letitia Parcalabescu, Barbara Plank, Andrii Babii, Oleksii Turuta, Aykut Erdem, Iacer Calixto, et al.
Journal of Artificial Intelligence Research, Volume 73, pp 1131-1207;

Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed an impressive progress on both of these problems, giving rise to a new family of approaches. Especially, the advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-networks based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG in its full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.
Reut Apel, Ido Erev, Roi Reichart, Moshe Tennenholtz
Journal of Artificial Intelligence Research, Volume 73, pp 1025-1091;

Sender-receiver interactions, and specifically persuasion games, are widely researched in economic modeling and artificial intelligence, and serve as a solid foundation for powerful applications. However, in the classic persuasion games setting, the messages sent from the expert to the decision-maker are abstract or well-structured application-specific signals rather than natural (human) language messages, although natural language is a very common communication signal in real-world persuasion setups. This paper addresses the use of natural language in persuasion games, exploring its impact on the decisions made by the players and aiming to construct effective models for the prediction of these decisions. For this purpose, we conduct an online repeated interaction experiment. At each trial of the interaction, an informed expert aims to sell an uninformed decision-maker a vacation in a hotel, by sending her a review that describes the hotel. While the expert is exposed to several scored reviews, the decision-maker observes only the single review sent by the expert, and her payoff in case she chooses to take the hotel is a random draw from the review score distribution available to the expert only. The expert’s payoff, in turn, depends on the number of times the decision-maker chooses the hotel. We also compare the behavioral patterns in this experiment to the equivalent patterns in similar experiments where the communication is based on the numerical values of the reviews rather than the reviews’ text, and observe substantial differences which can be explained through an equilibrium analysis of the game. We consider a number of modeling approaches for our verbal communication setup, differing from each other in the model type (deep neural network (DNN) vs. linear classifier), the type of features used by the model (textual, behavioral or both) and the source of the textual features (DNN-based vs. hand-crafted). Our results demonstrate that given a prefix of the interaction sequence, our models can predict the future decisions of the decision-maker, particularly when a sequential modeling approach and hand-crafted textual features are applied. Further analysis of the hand-crafted textual features allows us to make initial observations about the aspects of text that drive decision making in our setup.
Felix Brandt, Martin Bullinger, Patrick Lederer
Journal of Artificial Intelligence Research, Volume 73, pp 1093-1130;

Social choice functions (SCFs) map the preferences of a group of agents over some set of alternatives to a non-empty subset of alternatives. The Gibbard-Satterthwaite theorem has shown that only extremely restrictive SCFs are strategyproof when there are more than two alternatives. For set-valued SCFs, or so-called social choice correspondences, the situation is less clear. There are miscellaneous -- mostly negative -- results using a variety of strategyproofness notions and additional requirements. The simple and intuitive notion of Kelly-strategyproofness has turned out to be particularly compelling because it is weak enough to still allow for positive results. For example, the Pareto rule is strategyproof even when preferences are weak, and a number of attractive SCFs (such as the top cycle, the uncovered set, and the essential set) are strategyproof for strict preferences. In this paper, we show that, for weak preferences, only indecisive SCFs can satisfy strategyproofness. In particular, (i) every strategyproof rank-based SCF violates Pareto-optimality, (ii) every strategyproof support-based SCF (which generalize Fishburn's C2 SCFs) that satisfies Pareto-optimality returns at least one most preferred alternative of every voter, and (iii) every strategyproof non-imposing SCF returns the Condorcet loser in at least one profile. We also discuss the consequences of these results for randomized social choice.
Rana Tallal Javed, Osama Nasir, Melania Borit, Lois Vanhee, Elias Zea, Shivam Gupta, Ricardo Vinuesa, Junaid Qadir
Journal of Artificial Intelligence Research, Volume 73, pp 933-965;

The domain of Artificial Intelligence (AI) ethics is not new, with discussions going back at least 40 years. Teaching the principles and requirements of ethical AI to students is considered an essential part of this domain, with an increasing number of technical AI courses taught at several higher-education institutions around the globe including content related to ethics. By using Latent Dirichlet Allocation (LDA), a generative probabilistic topic model, this study uncovers topics in teaching ethics in AI courses and their trends related to where the courses are taught, by whom, and at what level of cognitive complexity and specificity according to Bloom’s taxonomy. In this exploratory study based on unsupervised machine learning, we analyzed a total of 166 courses: 116 from North American universities, 11 from Asia, 36 from Europe, and 10 from other regions. Based on this analysis, we were able to synthesize a model of teaching approaches, which we call BAG (Build, Assess, and Govern), that combines specific cognitive levels, course content topics, and disciplines affiliated with the department(s) in charge of the course. We critically assess the implications of this teaching paradigm and provide suggestions about how to move away from these practices. We challenge teaching practitioners and program coordinators to reflect on their usual procedures so that they may expand their methodology beyond the confines of stereotypical thought and traditional biases regarding what disciplines should teach and how. This article appears in the AI & Society track.
Efthimis Tsilionis, Alexander Artikis, Georgios Paliouras
Journal of Artificial Intelligence Research, Volume 73, pp 967-1023;

We present a system for online, incremental composite event recognition. In streaming environments, the usual case is for data to arrive with a (variable) delay from, and to be revised by, the underlying sources. We propose RTECinc, an incremental version of RTEC, a composite event recognition engine with formal, declarative semantics, that has been shown to scale to several real-world data streams. RTEC deals with delayed arrival and revision of events by computing all queries from scratch. This is often inefficient since it results in redundant computations. Instead, RTECinc deals with delays and revisions in a more efficient way, by updating only the affected queries. We examine RTECinc theoretically, presenting a complexity analysis, and show the conditions in which it outperforms RTEC. Moreover, we compare RTECinc and RTEC experimentally using real-world and synthetic datasets. The results are compatible with our theoretical analysis and show that RTECinc outperforms RTEC in many practical cases.
Francesco Belardinelli, Alessio Lomuscio, Vadim Malvone, Emily Yu
Journal of Artificial Intelligence Research, Volume 73, pp 897-932;

The model checking problem for multi-agent systems against specifications in the alternating-time temporal logic AT L, hence AT L∗ , under perfect recall and imperfect information is known to be undecidable. To tackle this problem, in this paper we investigate a notion of bounded recall under incomplete information. We present a novel three-valued semantics for AT L∗ in this setting and analyse the corresponding model checking problem. We show that the three-valued semantics here introduced is an approximation of the classic two-valued semantics, then give a sound, albeit partial, algorithm for model checking two-valued perfect recall via its approximation as three-valued bounded recall. Finally, we extend MCMAS, an open-source model checker for AT L and other agent specifications, to incorporate bounded recall; we illustrate its use and present experimental results.
Journal of Artificial Intelligence Research, Volume 73, pp 821-846;

We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an approach that allows us to trade computation for approximation quality and dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection. We evaluate our approach on the benchmark SysAdmin domain with static coordination graphs and achieve comparable performance with much lower computation cost than our MCTS baselines. We also introduce a multi-drone delivery domain with dynamic coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods. We provide an open-source implementation of our algorithm at
Yuexiang Zhai, Christina Baek, Zhengyuan Zhou, Jiantao Jiao, Yi Ma
Journal of Artificial Intelligence Research, Volume 73, pp 847-896;

Many goal-reaching reinforcement learning (RL) tasks have empirically verified that rewarding the agent on subgoals improves convergence speed and practical performance. We attempt to provide a theoretical framework to quantify the computational benefits of rewarding the completion of subgoals, in terms of the number of synchronous value iterations. In particular, we consider subgoals as one-way intermediate states, which can only be visited once per episode and propose two settings that consider these one-way intermediate states: the one-way single-path (OWSP) and the one-way multi-path (OWMP) settings. In both OWSP and OWMP settings, we demonstrate that adding intermediate rewards to subgoals is more computationally efficient than only rewarding the agent once it completes the goal of reaching a terminal state. We also reveal a trade-off between computational complexity and the pursuit of the shortest path in the OWMP setting: adding intermediate rewards significantly reduces the computational complexity of reaching the goal but the agent may not find the shortest path, whereas with sparse terminal rewards, the agent finds the shortest path at a significantly higher computational cost. We also corroborate our theoretical results with extensive experiments on the MiniGrid environments using Q-learning and some popular deep RL algorithms.
Naoto Ohsaka
Journal of Artificial Intelligence Research, Volume 73, pp 709-735;

We study the computational complexity of two hard problems on determinantal point processes (DPPs). One is maximum a posteriori (MAP) inference, i.e., to find a principal submatrix having the maximum determinant. The other is probabilistic inference on exponentiated DPPs (E-DPPs), which can sharpen or weaken the diversity preference of DPPs with an exponent parameter p. We present several complexity-theoretic hardness results that explain the difficulty in approximating MAP inference and the normalizing constant for E-DPPs. We first prove that unconstrained MAP inference for an n × n matrix is NP-hard to approximate within a factor of 2βn, where β = 10−1013 . This result improves upon the best-known inapproximability factor of (9/8 − ϵ), and rules out the existence of any polynomial-factor approximation algorithm assuming P ≠ NP. We then show that log-determinant maximization is NP-hard to approximate within a factor of 5/4 for the unconstrained case and within a factor of 1 + 10−1013 for the size-constrained monotone case. In particular, log-determinant maximization does not admit a polynomial-time approximation scheme unless P = NP. As a corollary of the first result, we demonstrate that the normalizing constant for E-DPPs of any (fixed) constant exponent p ≥ β-1 = 101013 is NP-hard to approximate within a factor of 2βpn, which is in contrast to the case of p ≤ 1 admitting a fully polynomial-time randomized approximation scheme.
Radu Ciucanu, Pascal Lafourcade, Gael Marcadet, Marta Soare
Journal of Artificial Intelligence Research, Volume 73, pp 737-765;

The multi-armed bandit is a reinforcement learning model where a learning agent repeatedly chooses an action (pull a bandit arm) and the environment responds with a stochastic outcome (reward) coming from an unknown distribution associated with the chosen arm. Bandits have a wide-range of application such as Web recommendation systems. We address the cumulative reward maximization problem in a secure federated learning setting, where multiple data owners keep their data stored locally and collaborate under the coordination of a central orchestration server. We rely on cryptographic schemes and propose Samba, a generic framework for Secure federAted Multi-armed BAndits. Each data owner has data associated to a bandit arm and the bandit algorithm has to sequentially select which data owner is solicited at each time step. We instantiate Samba for five bandit algorithms. We show that Samba returns the same cumulative reward as the nonsecure versions of bandit algorithms, while satisfying formally proven security properties. We also show that the overhead due to cryptographic primitives is linear in the size of the input, which is confirmed by our proof-of-concept implementation.
Loïs Vanhée,
Journal of Artificial Intelligence Research, Volume 73, pp 619-631;

Ethical concerns regarding Artificial Intelligence (AI) technology have fueled discussions around the ethics training received by AI designers. We claim that training designers for ethical behaviour, understood as habitual application of ethical principles in any situation, can make a significant difference in the practice of research, development, and application of AI systems. Building on interdisciplinary knowledge and practical experience from computer science, moral psychology and development, and pedagogy, we propose a functional way to provide this training. This article appears in the special track on AI & Society.
Charles K. Assaad, Emilie Devijver, Eric Gaussier
Journal of Artificial Intelligence Research, Volume 73, pp 767-819;

We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several representative methods to illustrate the behaviour of different families of approaches. This illustration is conducted on both artificial and real datasets, with different characteristics. The main conclusions one can draw from this survey is that causal discovery in times series is an active research field in which new methods (in every family of approaches) are regularly proposed, and that no family or method stands out in all situations. Indeed, they all rely on assumptions that may or may not be appropriate for a particular dataset.
Tiziano Fagni,
Journal of Artificial Intelligence Research, Volume 73, pp 633-672;

Predicting the political leaning of social media users is an increasingly popular task, given its usefulness for electoral forecasts, opinion dynamics models and for studying the political dimension of polarization and disinformation. Here, we propose a novel unsupervised technique for learning fine-grained political leaning from the textual content of social media posts. Our technique leverages a deep neural network for learning latent political ideologies in a representation learning task. Then, users are projected in a low-dimensional ideology space where they are subsequently clustered. The political leaning of a user is automatically derived from the cluster to which the user is assigned. We evaluated our technique in two challenging classification tasks and we compared it to baselines and other state-of-the-art approaches. Our technique obtains the best results among all unsupervised techniques, with micro F1 = 0.426 in the 8-class task and micro F1 = 0.772 in the 3-class task. Other than being interesting on their own, our results also pave the way for the development of new and better unsupervised approaches for the detection of fine-grained political leaning.
Grzegorz Chrupała
Journal of Artificial Intelligence Research, Volume 73, pp 673-707;

This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. Such models are inspired by the observation that when children pick up a language, they rely on a wide range of indirect and noisy clues, crucially including signals from the visual modality co-occurring with spoken utterances. Several fields have made important contributions to this approach to modeling or mimicking the process of learning language: Machine Learning, Natural Language and Speech Processing, Computer Vision and Cognitive Science. The current paper brings together these contributions in order to provide a useful introduction and overview for practitioners in all these areas. We discuss the central research questions addressed, the timeline of developments, and the datasets which enabled much of this work. We then summarize the main modeling architectures and offer an exhaustive overview of the evaluation metrics and analysis techniques.
Pavel Surynek, Roni Stern, Eli Boyarski, Ariel Felner
Journal of Artificial Intelligence Research, Volume 73, pp 553-618;

In the multi-agent path finding problem (MAPF) we are given a set of agents each with respective start and goal positions. The task is to find paths for all agents while avoiding collisions, aiming to minimize a given objective function. Many MAPF solvers were introduced in the past decade for optimizing two specific objective functions: sum-of-costs and makespan. Two prominent categories of solvers can be distinguished: search-based solvers and compilation-based solvers. Search-based solvers were developed and tested for the sum-of-costs objective, while the most prominent compilation-based solvers that are built around Boolean satisfiability (SAT) were designed for the makespan objective. Very little is known on the performance and relevance of solvers from the compilation-based approach on the sum-of-costs objective. In this paper, we start to close the gap between these cost functions in the compilation-based approach. Our main contribution is a new SAT-based MAPF solver called MDD-SAT, that is directly aimed to optimally solve the MAPF problem under the sum-of-costs objective function. Using both a lower bound on the sum-of-costs and an upper bound on the makespan, MDD-SAT is able to generate a reasonable number of Boolean variables in our SAT encoding. We then further improve the encoding by borrowing ideas from ICTS, a search-based solver. In addition, we show that concepts applicable in search-based solvers like ICTS and ICBS are applicable in the SAT-based approach as well. Specifically, we integrate independence detection, a generic technique for decomposing an MAPF instance into independent subproblems, into our SAT-based approach, and we design a relaxation of our optimal SAT-based solver that results in a bounded suboptimal SAT-based solver. Experimental evaluation on several domains shows that there are many scenarios where our SAT-based methods outperform state-of-the-art sum-of-costs search-based solvers, such as variants of the ICTS and ICBS algorithms.
Robert Ganian, Eun Jung Kim, Friedrich Slivovsky, Stefan Szeider
Journal of Artificial Intelligence Research, Volume 73, pp 535-552;

Weighted Counting for Constraint Satisfaction with Default Values (#CSPD) is a powerful special case of the sum-of-products problem that admits succinct encodings of #CSP, #SAT, and inference in probabilistic graphical models. We investigate #CSPD under the fundamental parameter of incidence treewidth (i.e., the treewidth of the incidence graph of the constraint hypergraph). We show that if the incidence treewidth is bounded, #CSPD can be solved in polynomial time. More specifically, we show that the problem is fixed-parameter tractable for the combined parameter incidence treewidth, domain size, and support size (the maximum number of non-default tuples in a constraint). This generalizes known results on the fixed-parameter tractability of #CSPD under the combined parameter primal treewidth and domain size. We further prove that the problem is not fixed-parameter tractable if any of the three components is dropped from the parameterization.
Linfeng Song, Chunlei Xin, Shaopeng Lai, Ante Wang, Jinsong Su, Kun Xu
Journal of Artificial Intelligence Research, Volume 73, pp 511-533;

Dialogue understanding has always been a bottleneck for many conversational tasks, such as dialogue response generation and conversational question answering. To expedite the progress in this area, we introduce the task of conversational aspect sentiment analysis (CASA) that can provide useful fine-grained sentiment information for dialogue understanding and planning. Overall, this task extends the standard aspect-based sentiment analysis to the conversational scenario with several major adaptations. To aid the training and evaluation of data-driven methods, we annotate 3,000 chit-chat dialogues (27,198 sentences) with fine-grained sentiment information, including all sentiment expressions, their polarities and the corresponding target mentions. We also annotate an out-of-domain test set of 200 dialogues for robustness evaluation. Besides, we develop multiple baselines based on either pretrained BERT or self-attention for preliminary study. Experimental results show that our BERT-based model has strong performances for both in-domain and out-of-domain datasets, and thorough analysis indicates several potential directions for further improvements.
Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere
Journal of Artificial Intelligence Research, Volume 73, pp 437-459;

Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated dataset like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on these datasets limited as an assistive technology in real-world settings, such as helping visually impaired people navigate and accomplish everyday tasks. This gap motivated the introduction of the novel VizWiz dataset, which consists of images taken by the visually impaired and captions that have useful, task-oriented information. In an attempt to help the machine learning computer vision field realize its promise of producing technologies that have positive social impact, the curators of the VizWiz dataset host several competitions, including one for image captioning. This work details the theory and engineering from our winning submission to the 2020 captioning competition. Our work provides a step towards improved assistive image captioning systems. This article appears in the special track on AI & Society.
Zuchao Li, Junru Zhou, Hai Zhao, Zhisong Zhang, Haonan Li, Yuqi Ju
Journal of Artificial Intelligence Research, Volume 73, pp 461-509;

In this work, we explore character-level neural syntactic parsing for Chinese with two typical syntactic formalisms: the constituent formalism and a dependency formalism based on a newly released character-level dependency treebank. Prior works in Chinese parsing have struggled with whether to de ne words when modeling character interactions. We choose to integrate full character-level syntactic dependency relationships using neural representations from character embeddings and richer linguistic syntactic information from human-annotated character-level Parts-Of-Speech and dependency labels. This has the potential to better understand the deeper structure of Chinese sentences and provides a better structural formalism for avoiding unnecessary structural ambiguities. Specifically, we first compare two different character-level syntax annotation styles: constituency and dependency. Then, we discuss two key problems for character-level parsing: (1) how to combine constituent and dependency syntactic structure in full character-level trees and (2) how to convert from character-level to word-level for both constituent and dependency trees. In addition, we also explore several other key parsing aspects, including di erent character-level dependency annotations and joint learning of Parts-Of-Speech and syntactic parsing. Finally, we evaluate our models on the Chinese Penn Treebank (CTB) and our published Shanghai Jiao Tong University Chinese Character Dependency Treebank (SCDT). The results show the e effectiveness of our model on both constituent and dependency parsing. We further provide empirical analysis and suggest several directions for future study.
Eugénio Ribeiro, Ricardo Ribeiro, David Martins de Matos
Journal of Artificial Intelligence Research, Volume 73;

From the perspective of a dialog system, it is important to identify the intention behind the segments in a dialog, since it provides an important cue regarding the information that is present in the segments and how they should be interpreted. ISO 24617-2, the standard for dialog act annotation, defines a hierarchically organized set of general-purpose communicative functions which correspond to different intentions that are relevant in the context of a dialog. We explore the automatic recognition of these communicative functions in the DialogBank, which is a reference set of dialogs annotated according to this standard. To do so, we propose adaptations of existing approaches to flat dialog act recognition that allow them to deal with the hierarchical classification problem. More specifically, we propose the use of an end-to-end hierarchical network with cascading outputs and maximum a posteriori path estimation to predict the communicative function at each level of the hierarchy, preserve the dependencies between the functions in the path, and decide at which level to stop. Furthermore, since the amount of dialogs in the DialogBank is small, we rely on transfer learning processes to reduce overfitting and improve performance. The results of our experiments show that our approach outperforms both a flat one and hierarchical approaches based on multiple classifiers and that each of its components plays an important role towards the recognition of general-purpose communicative functions.
Gabrielle Ras, Ning Xie, Marcel Van Gerven, Derek Doran
Journal of Artificial Intelligence Research, Volume 73, pp 329-397;

Deep neural networks (DNNs) are an indispensable machine learning tool despite the difficulty of diagnosing what aspects of a model’s input drive its decisions. In countless real-world domains, from legislation and law enforcement to healthcare, such diagnosis is essential to ensure that DNN decisions are driven by aspects appropriate in the context of its use. The development of methods and studies enabling the explanation of a DNN’s decisions has thus blossomed into an active and broad area of research. The field’s complexity is exacerbated by competing definitions of what it means “to explain” the actions of a DNN and to evaluate an approach’s “ability to explain”. This article offers a field guide to explore the space of explainable deep learning for those in the AI/ML field who are uninitiated. The field guide: i) Introduces three simple dimensions defining the space of foundational methods that contribute to explainable deep learning, ii) discusses the evaluations for model explanations, iii) places explainability in the context of other related deep learning research areas, and iv) discusses user-oriented explanation design and future directions. We hope the guide is seen as a starting point for those embarking on this research field.
Dominik Peters, Lan Yu, Hau Chan, Edith Elkind
Journal of Artificial Intelligence Research, Volume 73, pp 231-276;

A preference profile is single-peaked on a tree if the candidate set can be equipped with a tree structure so that the preferences of each voter are decreasing from their top candidate along all paths in the tree. This notion was introduced by Demange (1982), and subsequently Trick (1989b) described an efficient algorithm for deciding if a given profile is single-peaked on a tree. We study the complexity of multiwinner elections under several variants of the Chamberlin–Courant rule for preferences single-peaked on trees. We show that in this setting the egalitarian version of this rule admits a polynomial-time winner determination algorithm. For the utilitarian version, we prove that winner determination remains NP-hard for the Borda scoring function; indeed, this hardness results extends to a large family of scoring functions. However, a winning committee can be found in polynomial time if either the number of leaves or the number of internal vertices of the underlying tree is bounded by a constant. To benefit from these positive results, we need a procedure that can determine whether a given profile is single-peaked on a tree that has additional desirable properties (such as, e.g., a small number of leaves). To address this challenge, we develop a structural approach that enables us to compactly represent all trees with respect to which a given profile is single-peaked. We show how to use this representation to efficiently find the best tree for a given profile for use with our winner determination algorithms: Given a profile, we can efficiently find a tree with the minimum number of leaves, or a tree with the minimum number of internal vertices among trees on which the profile is single-peaked. We then explore the power and limitations of this framework: we develop polynomial-time algorithms to find trees with the smallest maximum degree, diameter, or pathwidth, but show that it is NP-hard to check whether a given profile is single-peaked on a tree that is isomorphic to a given tree, or on a regular tree.
Samer Nashed, Shlomo Zilberstein
Journal of Artificial Intelligence Research, Volume 73, pp 277-327;

Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.
Volodymyr Rospotniuk, Rupert Small
Journal of Artificial Intelligence Research, Volume 72, pp 475-505;

Pathfinding in Euclidean space is a common problem faced in robotics and computer games. For long-distance navigation on the surface of the earth or in outer space however, approximating the geometry as Euclidean can be insufficient for real-world applications such as the navigation of spacecraft, aeroplanes, drones and ships. This article describes an any-angle pathfinding algorithm for calculating the shortest path between point pairs over the surface of a sphere. Introducing several novel adaptations, it is shown that Anya as described by Harabor & Grastien for Euclidean space can be extended to Spherical geometry. There, where the shortest-distance line between coordinates is defined instead by a great-circle path, the optimal solution is typically a curved line in Euclidean space. In addition the turning points for optimal paths in Spherical geometry are not necessarily corner points as they are in Euclidean space, as will be shown, making further substantial adaptations to Anya necessary. Spherical Anya returns the optimal path on the sphere, given these different properties of world maps defined in Spherical geometry. It preserves all primary benefits of Anya in Euclidean geometry, namely the Spherical Anya algorithm always returns an optimal path on a sphere and does so entirely on-line, without any preprocessing or large memory overheads. Performance benchmarks are provided for several game maps including Starcraft and Warcraft III as well as for sea navigation on Earth using the NOAA bathymetric dataset. Always returning the shorter path compared with the Euclidean approximation yielded by Anya, Spherical Anya is shown to be faster than Anya for the majority of sea routes and slower for Game Maps and Random Maps.
, Toryn Q. Klassen, Richard Valenzano, Sheila A. McIlraith
Journal of Artificial Intelligence Research, Volume 73, pp 173-208;

Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have to program the reward function and, hence, there is the opportunity to make the reward function visible – to show the reward function’s code to the RL agent so it can exploit the function’s internal structure to learn optimal policies in a more sample efficient manner. In this paper, we show how to accomplish this idea in two steps. First, we propose reward machines, a type of finite state machine that supports the specification of reward functions while exposing reward function structure. We then describe different methodologies to exploit this structure to support learning, including automated reward shaping, task decomposition, and counterfactual reasoning with off-policy learning. Experiments on tabular and continuous domains, across different tasks and RL agents, show the benefits of exploiting reward structure with respect to sample efficiency and the quality of resultant policies. Finally, by virtue of being a form of finite state machine, reward machines have the expressive power of a regular language and as such support loops, sequences and conditionals, as well as the expression of temporally extended properties typical of linear temporal logic and non-Markovian reward specification.
Chong Liu, Yu-Xiang Wang
Journal of Artificial Intelligence Research, Volume 73, pp 209-229;

Large-scale labeled dataset is the indispensable fuel that ignites the AI revolution as we see today. Most such datasets are constructed using crowdsourcing services such as Amazon Mechanical Turk which provides noisy labels from non-experts at a fair price. The sheer size of such datasets mandates that it is only feasible to collect a few labels per data point. We formulate the problem of test-time label aggregation as a statistical estimation problem of inferring the expected voting score. By imitating workers with supervised learners and using them in a doubly robust estimation framework, we prove that the variance of estimation can be substantially reduced, even if the learner is a poor approximation. Synthetic and real-world experiments show that by combining the doubly robust approach with adaptive worker/item selection rules, we often need much lower label cost to achieve nearly the same accuracy as in the ideal world where all workers label all data points.
Adrien Bolland, Ioannis Boukas, Mathias Berger, Damien Ernst
Journal of Artificial Intelligence Research, Volume 73, pp 117-171;

We consider the joint design and control of discrete-time stochastic dynamical systems over a finite time horizon. We formulate the problem as a multi-step optimization problem under uncertainty seeking to identify a system design and a control policy that jointly maximize the expected sum of rewards collected over the time horizon considered. The transition function, the reward function and the policy are all parametrized, assumed known and differentiable with respect to their parameters. We then introduce a deep reinforcement learning algorithm combining policy gradient methods with model-based optimization techniques to solve this problem. In essence, our algorithm iteratively approximates the gradient of the expected return via Monte-Carlo sampling and automatic differentiation and takes projected gradient ascent steps in the space of environment and policy parameters. This algorithm is referred to as Direct Environment and Policy Search (DEPS). We assess the performance of our algorithm in three environments concerned with the design and control of a mass-spring-damper system, a small-scale off-grid power system and a drone, respectively. In addition, our algorithm is benchmarked against a state-of-the-art deep reinforcement learning algorithm used to tackle joint design and control problems. We show that DEPS performs at least as well or better in all three environments, consistently yielding solutions with higher returns in fewer iterations. Finally, solutions produced by our algorithm are also compared with solutions produced by an algorithm that does not jointly optimize environment and policy parameters, highlighting the fact that higher returns can be achieved when joint optimization is performed.
Maximilian Fickert, Jörg Hoffmann
Journal of Artificial Intelligence Research, Volume 73, pp 67–115-67–115;

In classical AI planning, heuristic functions typically base their estimates on a relaxation of the input task. Such relaxations can be more or less precise, and many heuristic functions have a refinement procedure that can be iteratively applied until the desired degree of precision is reached. Traditionally, such refinement is performed offline to instantiate the heuristic for the search. However, a natural idea is to perform such refinement online instead, in situations where the heuristic is not sufficiently accurate. We introduce several online-refinement search algorithms, based on hill-climbing and greedy best-first search. Our hill-climbing algorithms perform a bounded lookahead, proceeding to a state with lower heuristic value than the root state of the lookahead if such a state exists, or refining the heuristic otherwise to remove such a local minimum from the search space surface. These algorithms are complete if the refinement procedure satisfies a suitable convergence property. We transfer the idea of bounded lookaheads to greedy best-first search with a lightweight lookahead after each expansion, serving both as a method to boost search progress and to detect when the heuristic is inaccurate, identifying an opportunity for online refinement. We evaluate our algorithms with the partial delete relaxation heuristic hCFF, which can be refined by treating additional conjunctions of facts as atomic, and whose refinement operation satisfies the convergence property required for completeness. On both the IPC domains as well as on the recently published Autoscale benchmarks, our online-refinement search algorithms significantly beat state-of-the-art satisficing planners, and are competitive even with complex portfolios.
Jan Maly
Journal of Artificial Intelligence Research, Volume 73, pp 1-65;

The problem of lifting a preference order on a set of objects to a preference order on a family of subsets of this set is a fundamental problem with a wide variety of applications in AI. The process is often guided by axioms postulating properties the lifted order should have. Well-known impossibility results by Kannai and Peleg and by Barbera and Pattanaik tell us that some desirable axioms – namely dominance and (strict) independence – are not jointly satisfiable for any linear order on the objects if all non-empty sets of objects are to be ordered. On the other hand, if not all non-empty sets of objects are to be ordered, the axioms are jointly satisfiable for all linear orders on the objects for some families of sets. Such families are very important for applications as they allow for the use of lifted orders, for example, in combinatorial voting. In this paper, we determine the computational complexity of recognizing such families. We show that it is \Pi_2^p-complete to decide for a given family of subsets whether dominance and independence or dominance and strict independence are jointly satisfiable for all linear orders on the objects if the lifted order needs to be total. Furthermore, we show that the problem remains coNP-complete if the lifted order can be incomplete. Additionally, we show that the complexity of these problems can increase exponentially if the family of sets is not given explicitly but via a succinct domain restriction. Finally, we show that it is NP-complete to decide for a family of subsets whether dominance and independence or dominance and strict independence are jointly satisfiable for at least one linear order on the objects.
Rodothea Myrsini Tsoupidi, Roberto Castañeda Lozano, Benoit Baudry
Journal of Artificial Intelligence Research, Volume 72, pp 1471-1505;

Modern software deployment process produces software that is uniform, and hence vulnerable to large-scale code-reuse attacks, such as Jump-Oriented Programming (JOP) attacks. Compiler-based diversification improves the resilience and security of software systems by automatically generating different assembly code versions of a given program. Existing techniques are efficient but do not have a precise control over the quality, such as the code size or speed, of the generated code variants. This paper introduces Diversity by Construction (DivCon), a constraint-based compiler approach to software diversification. Unlike previous approaches, DivCon allows users to control and adjust the conflicting goals of diversity and code quality. A key enabler is the use of Large Neighborhood Search (LNS) to generate highly diverse assembly code efficiently. For larger problems, we propose a combination of LNS with a structural decomposition of the problem. To further improve the diversification efficiency of DivCon against JOP attacks, we propose an application-specific distance measure tailored to the characteristics of JOP attacks. We evaluate DivCon with 20 functions from a popular benchmark suite for embedded systems. These experiments show that DivCon's combination of LNS and our application-specific distance measure generates binary programs that are highly resilient against JOP attacks (they share between 0.15% to 8% of JOP gadgets) with an optimality gap of 10%. Our results confirm that there is a trade-off between the quality of each assembly code version and the diversity of the entire pool of versions. In particular, the experiments show that DivCon is able to generate binary programs that share a very small number of gadgets, while delivering near-optimal code. For constraint programming researchers and practitioners, this paper demonstrates that LNS is a valuable technique for finding diverse solutions. For security researchers and software engineers, DivCon extends the scope of compiler-based diversification to performance-critical and resource-constrained applications.
Alexandra N. Uma, Tommaso Fornaciari, Dirk Hovy, Silviu Paun, Barbara Plank, Massimo Poesio
Journal of Artificial Intelligence Research, Volume 72, pp 1385-1470;

Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer evidence that humans disagree, from objective tasks such as part-of-speech tagging to more subjective tasks such as classifying an image or deciding whether a proposition follows from certain premises. While most learning in artificial intelligence (AI) still relies on the assumption that a single (gold) interpretation exists for each item, a growing body of research aims to develop learning methods that do not rely on this assumption. In this survey, we review the evidence for disagreements on NLP and CV tasks, focusing on tasks for which substantial datasets containing this information have been created. We discuss the most popular approaches to training models from datasets containing multiple judgments potentially in disagreement. We systematically compare these different approaches by training them with each of the available datasets, considering several ways to evaluate the resulting models. Finally, we discuss the results in depth, focusing on four key research questions, and assess how the type of evaluation and the characteristics of a dataset determine the answers to these questions. Our results suggest, first of all, that even if we abandon the assumption of a gold standard, it is still essential to reach a consensus on how to evaluate models. This is because the relative performance of the various training methods is critically affected by the chosen form of evaluation. Secondly, we observed a strong dataset effect. With substantial datasets, providing many judgments by high-quality coders for each item, training directly with soft labels achieved better results than training from aggregated or even gold labels. This result holds for both hard and soft evaluation. But when the above conditions do not hold, leveraging both gold and soft labels generally achieved the best results in the hard evaluation. All datasets and models employed in this paper are freely available as supplementary materials.
Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Zhenisbek Assylbekov
Journal of Artificial Intelligence Research, Volume 72, pp 1343-1384;

There is an ongoing debate in the NLP community whether modern language models contain linguistic knowledge, recovered through so-called probes. In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the rediscovery hypothesis. In the first place, we show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures. This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates language modeling objectives with linguistic information. This framework also provides a metric to measure the impact of linguistic information on the word prediction task. We reinforce our analytical results with various experiments, both on synthetic and on real NLP tasks in English.
Page of 28
Articles per Page
Show export options
  Select all
Back to Top Top