Multi-Agent Reinforcement Learning Based on K-Means Clustering in Multi-Robot Cooperative Systems

1 March 2011

journal article
Published by Trans Tech Publications, Ltd. in Advanced Materials Research

Vol. 216, pp. 75-80
https://doi.org/10.4028/www.scientific.net/amr.216.75

Abstract

This paper presents a learning method based on k-means clustering to address the curse of dimensionality in multi-agent reinforcement learning. In this method, the environmental state is represented by a set of key state factors. State space explosion is avoided by grouping states into clusters with k-means. Learning is accelerated by assigning each new state to an existing cluster and reusing the strategy associated with that cluster. In multi-robot cooperation experiments, the scheme improves the team's learning ability and enhances cooperation efficiency compared with traditional Q-learning.
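The idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy two-factor state vectors, the choice of three actions, and the values of alpha and gamma are all assumptions made for the example. It clusters state-factor vectors with a plain k-means loop, maps each state to its nearest cluster, and applies the standard Q-learning update to a table indexed by cluster rather than by raw state.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two state-factor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_cluster(state, centroids):
    """Index of the centroid closest to the given state."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(state, centroids[i]))


def kmeans(states, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over state-factor vectors."""
    rng = random.Random(seed)
    centroids = [list(s) for s in rng.sample(states, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for s in states:
            groups[nearest_cluster(s, centroids)].append(s)
        for i, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members)
                                for dim in zip(*members)]
    return centroids


def q_update(q_table, cluster, action, reward, next_cluster,
             alpha=0.1, gamma=0.9):
    """Standard Q-learning update, indexed by cluster instead of raw state."""
    best_next = max(q_table[next_cluster])
    td_error = reward + gamma * best_next - q_table[cluster][action]
    q_table[cluster][action] += alpha * td_error


# Toy data (assumed): two key state factors per observed state.
states = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids = kmeans(states, k=2)

# One row of Q-values per cluster, three hypothetical actions.
q_table = [[0.0, 0.0, 0.0] for _ in centroids]

# Previously unseen states are assigned to existing clusters.
current = nearest_cluster((0.1, 0.1), centroids)
following = nearest_cluster((5.0, 5.0), centroids)
q_update(q_table, current, action=1, reward=1.0, next_cluster=following)
```

Because the Q-table has one row per cluster, its size depends on k and not on the number of distinct raw states, which is the sense in which clustering limits state space growth in this sketch.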

Cited by 3 articles