HQ-Learning

1 September 1997

Journal article (research article), published by SAGE Publications in Adaptive Behavior

Vol. 6 (2), 219-246
https://doi.org/10.1177/105971239700600202

Abstract

HQ-learning is a hierarchical extension of Q(λ)-learning designed to solve certain types of partially observable Markov decision problems (POMDPs). HQ automatically decomposes POMDPs into sequences of simpler subtasks that can be solved by memoryless policies learnable by reactive subagents. HQ can solve partially observable mazes with more states than those used in most previous POMDP work.
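The decomposition idea in the abstract can be illustrated with a small sketch. This is not the HQ-learning algorithm itself: the corridor task, the `run` helper, and the hand-fixed policies and subgoal are all hypothetical, chosen only for illustration, whereas HQ learns both the policies and the subgoals. The sketch shows a task that no single memoryless (observation-to-action) policy can solve, but that a sequence of two memoryless subagents solves once control is handed over at a subgoal observation.

```python
from itertools import product

N_CELLS = 5  # corridor cells 0..4 (hypothetical toy task, not from the paper)


def observe(pos):
    """Only the end walls are visible; every interior cell looks the same."""
    if pos == 0:
        return "L"
    if pos == N_CELLS - 1:
        return "R"
    return "C"


def run(agents, max_steps=50):
    """Run a sequence of reactive subagents on the corridor task.

    agents: list of (policy, subgoal) pairs. A policy maps an observation
    to a move (-1 or +1). A subgoal is the observation at which control
    passes to the next agent, or None for the last agent.
    The task: start in cell 0, touch the right end, then return to cell 0.
    Whether the right end has been touched is hidden from the agents.
    """
    pos, visited_right, active = 0, False, 0
    for _ in range(max_steps):
        policy, subgoal = agents[active]
        obs = observe(pos)
        # Hand control to the next subagent once the subgoal is observed.
        if subgoal is not None and obs == subgoal and active + 1 < len(agents):
            active += 1
            policy, subgoal = agents[active]
        pos = min(max(pos + policy[obs], 0), N_CELLS - 1)
        if pos == N_CELLS - 1:
            visited_right = True
        if pos == 0 and visited_right:
            return True
    return False


# Every deterministic memoryless policy (2^3 of them) fails on its own.
single = [
    run([(dict(zip("LCR", moves)), None)])
    for moves in product((-1, 1), repeat=3)
]

# Two memoryless subagents chained through the subgoal observation "R".
go_right = {"L": 1, "C": 1, "R": 1}
go_left = {"L": -1, "C": -1, "R": -1}
chained = run([(go_right, "R"), (go_left, None)])

print(any(single), chained)  # False True
```

The interior observation "C" requires opposite actions before and after the right wall is reached, which is why a single reactive policy oscillates or stalls, while each subtask taken separately needs no memory.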

