Changes between Version 2 and Version 3 of GettingStartedReinforcementLearning

v2	v3
32	32	Before we start training the agent, we will explain the scenario that the agent will be playing. The scenario is quite simple and is designed for the agent to learn to kite. The player controls a small group of cavalry javelinists and is trying to defeat a larger group of infantry. A screenshot of the scenario is shown below:
33	33
34		[[Image(CavalryVsSpearmen.~~jpg, 2~~5%)]]
	34	[[Image(CavalryVsSpearmen.png, 75%)]]
35	35
36	36	As mentioned earlier, the state and action spaces need to be defined. We will be using very simple representations; as a result, the agent should learn quickly, but the learned policy will not generalize to other scenarios. Specifically, our state space will be a single number that specifies the distance between the center of the player’s units and the center of the opposing units. The action space will consist of two discrete actions: attack or retreat. These are perhaps the simplest state and action space representations sufficient for a policy to learn to kite: the policy simply needs to learn that if the given input is below some (learned) value, it should retreat. Otherwise, it should attack!
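
The state and action representations described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used by the tutorial: the names `Action`, `center`, `distance_state`, and `threshold_policy` are hypothetical, unit positions are assumed to be plain `(x, y)` tuples, and the threshold is passed in by hand here, whereas in training it would be learned.

```python
import math
from enum import IntEnum


class Action(IntEnum):
    """The two discrete actions in the action space (hypothetical encoding)."""
    ATTACK = 0
    RETREAT = 1


def center(positions):
    """Return the mean (x, y) position of a non-empty list of unit positions."""
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def distance_state(player_positions, enemy_positions):
    """State: distance between the center of the player's units and the enemy's."""
    px, py = center(player_positions)
    ex, ey = center(enemy_positions)
    return math.hypot(px - ex, py - ey)


def threshold_policy(distance, threshold):
    """Retreat when the enemy is closer than the threshold, otherwise attack.

    The threshold is an assumption supplied by the caller; a trained policy
    would learn an equivalent decision boundary on its own.
    """
    return Action.RETREAT if distance < threshold else Action.ATTACK


# Example with made-up positions and a made-up threshold.
state = distance_state([(0.0, 0.0), (2.0, 0.0)], [(4.0, 4.0), (4.0, 4.0)])
action = threshold_policy(state, threshold=10.0)
```

Because the state is a single number and there are only two actions, a policy of this kind reduces to one decision boundary, which is why the agent can learn it quickly and also why it does not transfer to other scenarios.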