Using the online cross-entropy method to learn relational policies for playing different games

Sarjant, Samuel; Pfahringer, Bernhard; Driessens, Kurt; Smith, Tony C.

dc.contributor.author	Sarjant, Samuel
dc.contributor.author	Pfahringer, Bernhard
dc.contributor.author	Driessens, Kurt
dc.contributor.author	Smith, Tony C.
dc.coverage.spatial	Conference held at Seoul, South Korea	en_NZ
dc.date.accessioned	2011-10-21T03:17:41Z
dc.date.available	2011-10-21T03:17:41Z
dc.date.issued	2011
dc.description.abstract	By defining a video-game environment as a collection of objects, relations, actions and rewards, the relational reinforcement learning algorithm presented in this paper generates and optimises a set of concise, human-readable relational rules for achieving maximal reward. Rule learning is achieved using a combination of incremental specialisation of rules and a modified online cross-entropy method, which dynamically adjusts the rate of learning as the agent progresses. The algorithm is tested on the Ms. Pac-Man and Mario environments, with results indicating the agent learns an effective policy for acting within each environment.	en_NZ
dc.format.mimetype	application/pdf
dc.identifier.citation	Sarjant, S., Pfahringer, B., Driessens, K. & Smith, T. (2011). Using the online cross-entropy method to learn relational policies for playing different games. In Proceedings of the 2011 IEEE Conference on Computational Intelligence and Games, Seoul, South Korea, 31 August - 3 September (pp. 182-189).	en_NZ
dc.identifier.doi	10.1109/CIG.2011.6032005	en_NZ
dc.identifier.uri	https://hdl.handle.net/10289/5837
dc.language.iso	en
dc.publisher	IEEE	en_NZ
dc.relation.isPartOf	Proc 2011 IEEE Conference on Computational Intelligence and Games	en_NZ
dc.relation.uri	http://cilab.sejong.ac.kr/cig2011/?page_id=792	en_NZ
dc.rights	© 2011 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.	en_NZ
dc.source	CIG 2011	en_NZ
dc.subject	computer science	en_NZ
dc.subject	video-game	en_NZ
dc.subject	Machine learning
dc.title	Using the online cross-entropy method to learn relational policies for playing different games	en_NZ
dc.type	Conference Contribution	en_NZ
pubs.begin-page	182	en_NZ
pubs.elements-id	21084
pubs.end-page	189	en_NZ
pubs.finish-date	2011-09-03	en_NZ
pubs.start-date	2011-08-31	en_NZ