Simulation System Architecture
Part of the simulation documentation. (See the pages linked from there for details on the implementation of the concepts described here.)
The entity system is the infrastructure that runs the game's simulation/gameplay code. (The gameplay itself is not part of this system - the system is not even specific to RTS games - but it is all built on top of this system, and influences the system's design.)
The design is based most closely on the one presented in GPG5 (Bjarne Rene: Component Based Object Management; Game Programming Gems 5, 2005, page 25). Some other useful references: Evolve Your Hierarchy, Game Object Structure: Inheritance vs. Aggregation, Scott Bilas's GDC 2002 presentation: A Data-Driven Game Object System, and a few of the things linked from those.
An important concept in the entity system is the entity. This represents any kind of 'thing' in the simulation world - a person, a tree, a rock, an arrow, and more abstract things like event triggers, players, and player input controllers.
Entities consist of a set of components. A component is a largely self-contained piece of data and code, responsible for one part of the behaviour of an entity. One component might be responsible for rendering the entity; another for keeping track of its location in the world; another for tracking its health and reducing it when damaged and killing the entity when it reaches zero.
Each component is an object instance in the C++ code. However, there is no C++ object representing an entity - each component is tied to an entity ID (an arbitrary integer), and an entity exists only as a concept defined by the set of components with the same entity ID.
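The idea that an entity is only a shared ID can be sketched as follows. This is a minimal illustration, not the engine's actual storage: `ComponentList`, `Component`, `TypesOf`, `EntityExists` and `DestroyEntity` are hypothetical names, and a real implementation would use an indexed container rather than a linear scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// All names here are hypothetical, chosen only for illustration.
using EntityId = std::uint32_t;

struct Component {
    EntityId entity;   // the only link between this component and "its" entity
    std::string type;  // e.g. "Position" or "Health"
};

class ComponentList {
public:
    void Add(EntityId ent, const std::string& type) {
        m_Components.push_back(Component{ent, type});
    }

    // The entity is recovered by filtering on the shared ID.
    std::vector<std::string> TypesOf(EntityId ent) const {
        std::vector<std::string> types;
        for (const Component& c : m_Components)
            if (c.entity == ent)
                types.push_back(c.type);
        return types;
    }

    // An entity exists exactly when some component carries its ID.
    bool EntityExists(EntityId ent) const { return !TypesOf(ent).empty(); }

    // Destroying an entity means destroying every component with its ID.
    void DestroyEntity(EntityId ent) {
        m_Components.erase(
            std::remove_if(m_Components.begin(), m_Components.end(),
                           [ent](const Component& c) { return c.entity == ent; }),
            m_Components.end());
        assert(!EntityExists(ent));
    }

private:
    std::vector<Component> m_Components;
};
```

Note that there is no `Entity` class anywhere in the sketch: creating, inspecting and destroying an entity are all operations on the set of components.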
Components have two communication mechanisms: one-to-one communication with another component, by retrieving the component with a QueryInterface call (described in more detail later) and then calling methods on it; and one-to-many communication by posting or broadcasting messages, which will be received by any component that has chosen to subscribe to that message type.
The goal of the system is to ease development of moderately complex gameplay code, and to easily adapt to changing gameplay requirements. There is a focus on modularity and flexibility - it should be easy to read, understand, modify and replace a component. Therefore a component should be small, with few dependencies on other components. (These are often conflicting requirements - lots of small components require more dependencies than a few large components. Thoughtful design is needed.)
Components should be written in C++ only when necessary for run-time performance, memory usage, or to interact with C++ parts of the game engine (e.g. the renderer). Run-time performance should only be a concern for code that is executed every frame (e.g. position interpolation functions), or executed every simulation turn for a large number of entities (e.g. checking for any unit coming into range). Components should initially be written in JS. If profiling indicates they are slow, first try to optimise the JS or call it less often (e.g. run on a timer rather than on every simulation turn); only if the code is still unfixably slow should it be rewritten in C++.
Component scripts support hotloading: while the game is running, you can edit and save a script file, and it will be immediately reloaded and used in the game with no need to stop or restart. (The data associated with each component will not be changed at all, only the code.)
Most entities will behave similarly to each other, but there are a few cases where we would like to have differing implementations of one type of component.
For example, most entities should have some kind of Position component, which responds to MoveTo calls during a simulation update and can be queried by the renderer for the location in the current frame. (Frames are much more frequent than simulation updates.) For entities that move smoothly, the component should typically respond to the renderer by interpolating from its position in the previous simulation turn to its position in the current turn, and therefore it needs to remember its previous position at the start of each turn. For entities that are not expected to move in straight lines (e.g. ballistic units like arrows), linear interpolation will be inaccurate. For entities that are not expected to move at all (e.g. trees), remembering the previous position every turn is wasteful.
It would be possible for a position component implementation to have a flag that switches between linear and parabolic interpolation, but the code would become increasingly complex as more special cases were added, and it would not be able to optimise the storage and computation of positions for non-moving entities (of which there might be tens of thousands).
Instead, we define Position as an interface. An interface does not define any code or data, but does define a list of methods (e.g. MoveTo and GetInterpolatedPosition). We then define a number of component types that implement the interface, providing the code and data - e.g. the PositionInterpolated and PositionStatic component types both implement the Position interface. A component is an instance of a component type.
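A minimal sketch of one interface with two component types is shown here. The names MoveTo, GetInterpolatedPosition, PositionInterpolated and PositionStatic come from the text above; the `IPosition` class name, the `TurnStart` hook, the `RenderPos` struct, the integer coordinates and the exact signatures are assumptions made for illustration and are not the engine's real declarations.

```cpp
#include <cassert>

// Render-side position. In this sketch floating point is used only for the
// value handed to the renderer, not for the stored simulation state.
struct RenderPos { double x; double z; };

// The interface: a list of methods, no data. TurnStart is a hypothetical hook.
class IPosition {
public:
    virtual ~IPosition() = default;
    virtual void TurnStart() = 0;
    virtual void MoveTo(int x, int z) = 0;
    // frameOffset 0.0 gives the previous turn's position, 1.0 the current one.
    virtual RenderPos GetInterpolatedPosition(double frameOffset) const = 0;
};

// Component type for smoothly moving entities: remembers the previous position.
class PositionInterpolated : public IPosition {
public:
    void TurnStart() override { m_PrevX = m_X; m_PrevZ = m_Z; }
    void MoveTo(int x, int z) override { m_X = x; m_Z = z; }
    RenderPos GetInterpolatedPosition(double frameOffset) const override {
        assert(frameOffset >= 0.0 && frameOffset <= 1.0);
        return RenderPos{m_PrevX + (m_X - m_PrevX) * frameOffset,
                         m_PrevZ + (m_Z - m_PrevZ) * frameOffset};
    }
private:
    int m_X = 0, m_Z = 0, m_PrevX = 0, m_PrevZ = 0;
};

// Component type for entities that never move smoothly: half the storage,
// no per-turn bookkeeping.
class PositionStatic : public IPosition {
public:
    void TurnStart() override {}
    void MoveTo(int x, int z) override { m_X = x; m_Z = z; }
    RenderPos GetInterpolatedPosition(double) const override {
        return RenderPos{static_cast<double>(m_X), static_cast<double>(m_Z)};
    }
private:
    int m_X = 0, m_Z = 0;
};
```

Callers hold an `IPosition&` and cannot tell which of the two types is behind it; a parabolic variant for arrows could be added as a third type without touching either existing class.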
An entity can only have (at most) one component that implements a particular interface (that is, it cannot have both PositionInterpolated and PositionStatic). Any code that interacts with the entity cannot tell what component types it uses - the code can only use QueryInterface to get a pointer to whatever component implements the Position interface, and call the common methods declared by that interface.
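The lookup described above might be sketched like this. QueryInterface is the name used in the text, but its signature here, the `ComponentManager` class, the `IID_*` constants, `AddComponent`, `GetX` and `ReadX` are all assumptions for illustration. Keying the map on the pair (entity ID, interface ID) is one simple way to enforce the at-most-one rule.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

using EntityId = std::uint32_t;
enum InterfaceId { IID_Position, IID_Health };  // hypothetical interface IDs

struct IComponent { virtual ~IComponent() = default; };

struct IPosition : IComponent { virtual int GetX() const = 0; };

struct PositionStatic : IPosition {
    explicit PositionStatic(int x) : m_X(x) {}
    int GetX() const override { return m_X; }
    int m_X;
};

struct PositionInterpolated : IPosition {
    explicit PositionInterpolated(int x) : m_X(x) {}
    int GetX() const override { return m_X; }
    int m_X;
};

class ComponentManager {
public:
    // Returns false if the entity already has a component for this interface.
    bool AddComponent(EntityId ent, InterfaceId iid, std::unique_ptr<IComponent> cmp) {
        assert(cmp != nullptr);
        const auto key = std::make_pair(ent, iid);
        if (m_Components.count(key) != 0)
            return false;
        m_Components[key] = std::move(cmp);
        return true;
    }

    // Returns the component implementing iid for ent, or nullptr if none.
    IComponent* QueryInterface(EntityId ent, InterfaceId iid) const {
        const auto it = m_Components.find(std::make_pair(ent, iid));
        return it == m_Components.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::pair<EntityId, InterfaceId>, std::unique_ptr<IComponent>> m_Components;
};

// Caller code sees only the interface, never the concrete component type.
// Returns fallback when the entity has no Position component.
int ReadX(const ComponentManager& mgr, EntityId ent, int fallback) {
    IPosition* pos = static_cast<IPosition*>(mgr.QueryInterface(ent, IID_Position));
    return pos ? pos->GetX() : fallback;
}
```

Because `ReadX` only casts to `IPosition`, swapping an entity from PositionStatic to PositionInterpolated requires no change in any calling code.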
Direct method calls between components are sometimes necessary, but they force component implementations to know details of other components. For example, a Position component may want to notify many other components when the entity moves (e.g. any components that want to detect when an entity is within a certain range, or set it on fire if it's walking on lava), and it would be bad (complex, inflexible) if the Position component type's code had to know about every other component type it should notify.
The message passing system helps with this case. Component types can subscribe to a particular message type, and components can post or broadcast a message with a type and associated data, which will be received by all subscribed components. (Post sends the message to the components with a specific entity ID; broadcast sends it to the components of all entities.) For example, the burn-on-lava component can subscribe to the PositionChanged message type, and the Position component can then post a PositionChanged message to its own entity ID, and the burn component's message handler function will be called. The Position component only has to know about this message type, not about any of the components that use it.
Messages are one-to-many communication (any number of components can receive a single posted message), and they are one-way (it is not possible to return a value in response to a message). Components are notified in a consistent but arbitrary order.
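The post and broadcast semantics can be sketched as below. This is a simplified model under stated assumptions: `MessageBus`, `Subscribe`, `Post`, `Broadcast`, the `MT_*` constants and the `Message` fields are hypothetical names, and subscriptions are registered per component instance here, whereas the text describes subscription per component type.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using EntityId = std::uint32_t;
enum MessageType { MT_PositionChanged, MT_TurnStart };  // hypothetical types

struct Message { MessageType type; int x; int z; };

class MessageBus {
public:
    // Handlers return void: messages are one-way, no reply is possible.
    using Handler = std::function<void(const Message&)>;

    void Subscribe(EntityId ent, MessageType type, Handler handler) {
        assert(handler != nullptr);
        m_Subscribers[type].emplace_back(ent, std::move(handler));
    }

    // Post: deliver only to subscribers belonging to the given entity.
    void Post(EntityId ent, const Message& msg) const { Deliver(msg, &ent); }

    // Broadcast: deliver to subscribers of every entity.
    void Broadcast(const Message& msg) const { Deliver(msg, nullptr); }

private:
    void Deliver(const Message& msg, const EntityId* onlyEntity) const {
        const auto it = m_Subscribers.find(msg.type);
        if (it == m_Subscribers.end())
            return;
        // Subscription order: arbitrary from the sender's view, but consistent.
        for (const auto& sub : it->second)
            if (!onlyEntity || sub.first == *onlyEntity)
                sub.second(msg);
    }

    std::map<MessageType, std::vector<std::pair<EntityId, Handler>>> m_Subscribers;
};
```

The sender names only a message type and, for Post, an entity ID. It never names a receiving component, which is what keeps the Position component independent of the burn-on-lava component.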
(TODO: the post vs broadcast semantics are not very well designed currently.)
The simulation system needs to support three related features:
- Serializing the complete simulation state to a byte stream, for saved games and potentially for joining (or rejoining after disconnection) in-progress games.
- Computing a checksum of the simulation state in multiplayer games, to detect out-of-sync errors before they lead to significant divergences in gameplay.
- Dumping the state of the simulation or of a specific entity or component into a roughly human-readable format, for debugging.
Each component therefore has to implement serialization (and deserialization) support, passing its internal data to the serialization API (which will either save it in an efficient byte format, or pass it into a checksum computation, or convert it to human-readable text).
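One way to get all three features from a single Serialize method is to have the component write to an abstract sink, with one sink per feature. The sketch below assumes a hypothetical `ISerializer` interface with a single `NumberI32` method; the class names, the FNV-1a hash (chosen only for brevity) and the text format are illustrative and not the engine's actual serialization API.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sink interface: components describe their state through it.
class ISerializer {
public:
    virtual ~ISerializer() = default;
    virtual void NumberI32(const char* name, std::int32_t value) = 0;
};

// Saved games: fixed-width little-endian bytes, identical on every platform.
class BinarySerializer : public ISerializer {
public:
    void NumberI32(const char*, std::int32_t value) override {
        const std::uint32_t u = static_cast<std::uint32_t>(value);
        for (int shift = 0; shift < 32; shift += 8)
            m_Bytes.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFFu));
    }
    const std::vector<std::uint8_t>& Bytes() const { return m_Bytes; }
private:
    std::vector<std::uint8_t> m_Bytes;
};

// Out-of-sync detection: feeds the same bytes into a 32-bit FNV-1a hash.
class HashSerializer : public ISerializer {
public:
    void NumberI32(const char*, std::int32_t value) override {
        const std::uint32_t u = static_cast<std::uint32_t>(value);
        for (int shift = 0; shift < 32; shift += 8) {
            m_Hash ^= (u >> shift) & 0xFFu;
            m_Hash *= 16777619u;
        }
    }
    std::uint32_t Hash() const { return m_Hash; }
private:
    std::uint32_t m_Hash = 2166136261u;
};

// Debugging: one "name: value" line per field.
class DebugSerializer : public ISerializer {
public:
    void NumberI32(const char* name, std::int32_t value) override {
        assert(name != nullptr);
        m_Text << name << ": " << value << "\n";
    }
    std::string Text() const { return m_Text.str(); }
private:
    std::ostringstream m_Text;
};

// The component implements Serialize once and works with all three sinks.
struct HealthComponent {
    std::int32_t hitpoints = 100;
    std::int32_t maxHitpoints = 100;
    void Serialize(ISerializer& s) const {
        s.NumberI32("hitpoints", hitpoints);
        s.NumberI32("max hitpoints", maxHitpoints);
    }
};
```

The component never needs to know which of the three purposes a given call serves, so the saved-game, checksum and debug outputs cannot drift apart.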
For scripted components this is automatic, but C++ components have to implement it manually. The general rule is that the following two sequences of method calls must result in components that have the same checksum and behave identically:
- constructor() -> Init(context, paramNode) -> ...simulation turns... -> Serialize(stream)
- constructor() -> Deserialize(context, paramNode, stream)
and the serialized data must be identical regardless of the compiler, OS, CPU, etc. That means the component's relevant internal state must not include e.g. size_t values (their size differs between 32-bit and 64-bit platforms) or floats (we don't trust compilers to treat them identically).
For components that are large or frequently used, the serialized output for saved games should be as efficient as possible. Any internal caches that can be safely reconstructed after deserialization should be omitted. Any data that was initialised from paramNode should not be serialized unless it has changed - store some kind of placeholder value instead and reconstruct the value in Deserialize.
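The placeholder technique, together with the round-trip rule for Serialize and Deserialize, can be sketched as follows. This is an assumption-laden illustration: `ParamNode` is reduced to a plain struct, the stream is a vector of 32-bit integers rather than bytes, the context argument is omitted, and the `Health` class with its `m_MaxOverridden` flag is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Stream = std::vector<std::int32_t>;         // stand-in for a byte stream
struct ParamNode { std::int32_t maxHitpoints; };  // stand-in for template data

class Health {
public:
    void Init(const ParamNode& params) {
        m_Max = params.maxHitpoints;
        m_Current = m_Max;
    }

    void Damage(std::int32_t amount) {
        assert(amount >= 0);
        m_Current -= amount;
        if (m_Current < 0)
            m_Current = 0;
    }

    // Changing the maximum makes it differ from the template value.
    void SetMax(std::int32_t max) {
        m_Max = max;
        m_MaxOverridden = true;
    }

    void Serialize(Stream& out) const {
        out.push_back(m_Current);
        // Placeholder: a flag instead of the value when it is unchanged.
        out.push_back(m_MaxOverridden ? 1 : 0);
        if (m_MaxOverridden)
            out.push_back(m_Max);
    }

    void Deserialize(const ParamNode& params, const Stream& in) {
        std::size_t i = 0;
        m_Current = in.at(i++);
        m_MaxOverridden = in.at(i++) != 0;
        // Reconstruct the unchanged value from the template data.
        m_Max = m_MaxOverridden ? in.at(i++) : params.maxHitpoints;
    }

    std::int32_t Current() const { return m_Current; }
    std::int32_t Max() const { return m_Max; }

private:
    std::int32_t m_Current = 0;
    std::int32_t m_Max = 0;
    bool m_MaxOverridden = false;
};
```

A component restored with Deserialize and then serialized again should produce exactly the same stream as the original, which is a convenient property to check in a unit test.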
Entities are constructed from an entity template, defined in XML (typically in the binaries/data/mods/public/simulation/templates/ directory). The root element of the XML file contains one element per component, giving the component type's name. Each component element contains initialisation data for that component - typically either empty, or a series of elements giving various data fields. This data is passed to the component's Init method.
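The construction step can be sketched in C++ once the XML has been parsed. Everything here is an assumption for illustration: the parsed template is modelled as a map from component type name to a flat map of string fields, and `ConstructEntity`, `Factory`, the `Health` and `Selectable` types and the `Max` field are hypothetical names, not actual template contents.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for parsed XML: one ParamNode per component element.
using ParamNode = std::map<std::string, std::string>;
using Template = std::map<std::string, ParamNode>;  // component type name -> init data

struct IComponent {
    virtual ~IComponent() = default;
    virtual void Init(const ParamNode& params) = 0;
    virtual std::string Describe() const = 0;
};

struct Health : IComponent {
    int max = 0;
    void Init(const ParamNode& params) override { max = std::stoi(params.at("Max")); }
    std::string Describe() const override { return "Health(max=" + std::to_string(max) + ")"; }
};

// A component whose element in the template is empty.
struct Selectable : IComponent {
    void Init(const ParamNode&) override {}
    std::string Describe() const override { return "Selectable"; }
};

using Factory = std::function<std::unique_ptr<IComponent>()>;

// Creates one component per element of the template and passes that
// element's data to the component's Init method.
std::vector<std::unique_ptr<IComponent>> ConstructEntity(
    const Template& tmpl, const std::map<std::string, Factory>& factories) {
    std::vector<std::unique_ptr<IComponent>> components;
    for (const auto& entry : tmpl) {
        const auto it = factories.find(entry.first);
        if (it == factories.end())
            continue;  // unknown component type: skipped in this sketch
        std::unique_ptr<IComponent> cmp = it->second();
        assert(cmp != nullptr);
        cmp->Init(entry.second);
        components.push_back(std::move(cmp));
    }
    return components;
}
```

Looking component types up by name is what lets a template author add or remove a component from an entity without any change to code.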
Some gameplay code is 'global', and it only makes sense to have one copy of it; but it can benefit from being written in the component infrastructure, e.g. for serialization and message passing and script interfacing. This code can be implemented as a system component, which is like a normal component but (as a convention, not enforced by the system) given the entity ID SYSTEM_ENTITY. It can then be accessed and used by any simulation code in the same way as a normal component.