36 | 36 | As mentioned earlier, the state and action spaces need to be defined. We will be using very simple representations; as a result it should learn pretty quickly but will not generalize to other scenarios. Specifically, our state space will be single number which specifies the distance between the center of the player’s units and the center of the opposing units. The action space will consist of 2 (discrete) actions: attack or retreat. These are perhaps the simplest state and action space representations sufficient for a policy to learn to kite as the policy simply needs to learn that if the given input is below some (learned) value, it should retreat. Otherwise, it should attack! |