Opened 9 years ago
Last modified 3 years ago
#3373 new enhancement
Reduce latency by using multiple ENet channels
Reported by: | elexis | Owned by: | |
---|---|---|---|
Priority: | If Time Permits | Milestone: | Backlog |
Component: | Network | Keywords: | |
Cc: | Patch: |
Description (last modified by )
0 A.D. uses the ENet protocol (http://enet.bespin.org/) in order to send reliable, sequenced messages that are automatically fragmented and reassembled without using the slow TCP protocol.
Issue
Currently 0 A.D. only uses one channel for all traffic, thus introducing unnecessary lag.
From NetSession.cpp and NetServer.cpp:
static const int CHANNEL_COUNT = 1;
and NetHost.h:
static const int DEFAULT_CHANNEL = 0;
The only call to enet_peer_send in 0 A.D. occurs in CNetHost::SendMessage and uses DEFAULT_CHANNEL.
Reason
Sequenced packets can cause lag: Because ENet sequences packets, it does not deliver a packet to 0 A.D. until all preceding packets on that channel have been received. Holding back delivery means latency / lag. See http://enet.bespin.org/Features.html (Sequencing)
ENet guarantees that no packet with a higher sequence number will be delivered before a packet with a lower sequence number, thus ensuring packets are delivered exactly in the order they are sent. For reliable packets, if a higher sequence number packet arrives, but the preceding packets in the sequence have not yet arrived, ENet will stall delivery of the higher sequence number packets until its predecessors have arrived.
Using multiple channels reduces that lag: Different channels can be used to allow concurrent traffic. See http://enet.bespin.org/Features.html (Channels)
Since ENet will stall delivery of reliable packets to ensure proper sequencing, and consequently any packets of higher sequence number whether reliable or unreliable, in the event the reliable packet's predecessors have not yet arrived, this can introduce latency into the delivery of other packets which may not need to be as strictly ordered with respect to the packet that stalled their delivery. To combat this latency and reduce the ordering restrictions on packets, ENet provides multiple channels of communication over a given connection. Each channel is independently sequenced, and so the delivery status of a packet in one channel will not stall the delivery of other packets in another channel.
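The stall described in the two quotes can be reproduced with a toy model. The sketch below is not ENet code; SequencedChannel and DemonstrateStall are hypothetical names, and the model only assumes that each channel delivers packets strictly in sequence order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model of one reliable, sequenced channel (hypothetical, not ENet's
// implementation): packets reach the application strictly in sequence order,
// so a missing packet stalls everything queued behind it.
class SequencedChannel
{
public:
	// Called when a packet arrives from the network. Returns the payloads
	// that became deliverable as a result (possibly none, possibly several).
	std::vector<std::string> Receive(std::uint32_t sequence, const std::string& payload)
	{
		std::vector<std::string> delivered;
		if (sequence < m_NextSequence)
			return delivered; // duplicate of an already delivered packet

		m_Pending[sequence] = payload;

		// Flush every packet that is now in order.
		for (auto it = m_Pending.find(m_NextSequence); it != m_Pending.end(); it = m_Pending.find(m_NextSequence))
		{
			delivered.push_back(it->second);
			m_Pending.erase(it);
			++m_NextSequence;
		}
		return delivered;
	}

	// Number of packets received but held back because of a gap.
	std::size_t StalledCount() const { return m_Pending.size(); }

private:
	std::uint32_t m_NextSequence = 0;
	std::map<std::uint32_t, std::string> m_Pending;
};

// A chat packet arrives while the simulation command sent before it is
// still in flight.
inline void DemonstrateStall()
{
	// One shared channel: the chat packet (sequence 1) waits for sequence 0.
	SequencedChannel shared;
	assert(shared.Receive(1, "chat").empty());
	assert(shared.Receive(0, "command").size() == 2);

	// Separate channels: the chat channel has its own sequence numbers, so
	// the same chat packet is sequence 0 there and is delivered at once.
	SequencedChannel chatOnly;
	assert(chatOnly.Receive(0, "chat").size() == 1);
}
```

With a single channel the chat packet is only delivered once the delayed command arrives; with its own channel it is delivered immediately.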
Implementation
How many channels to use: The following things should run on a separate channel:
- Default: Simulation relevant and miscellaneous
- Chat
- File-Transfers (rejoined clients)
- OOS-Checks
The channel numbers should be hardcoded in NetMessages.h
where the other protocol constants reside.
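A minimal sketch of what the hardcoded channel numbers could look like. All names and values here are assumptions for illustration (NetChannel, SketchMessageType, ChannelForMessage); the real NMT_* message types are not reproduced, and nothing below is existing 0 A.D. code. CNetHost::SendMessage would then pass the returned channel to enet_peer_send instead of DEFAULT_CHANNEL.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical channel layout following the list above.
enum NetChannel : std::uint8_t
{
	CHANNEL_DEFAULT = 0,       // simulation-relevant and miscellaneous messages
	CHANNEL_CHAT = 1,          // chat
	CHANNEL_FILE_TRANSFER = 2, // downloads for rejoining clients
	CHANNEL_SYNC = 3,          // OOS checks
	CHANNEL_COUNT = 4          // would replace the current CHANNEL_COUNT = 1
};

// Stand-in message categories for this sketch only; the real code would
// switch on the NMT_* message type instead.
enum class SketchMessageType
{
	SimulationCommand,
	Chat,
	FileTransferData,
	SyncCheck,
	SyncError,
	Other
};

// Picks the channel a message should be sent on. Anything not listed
// explicitly stays on the default channel, so its ordering relative to
// simulation commands is unchanged.
inline NetChannel ChannelForMessage(SketchMessageType type)
{
	NetChannel channel = CHANNEL_DEFAULT;
	switch (type)
	{
	case SketchMessageType::Chat:
		channel = CHANNEL_CHAT;
		break;
	case SketchMessageType::FileTransferData:
		channel = CHANNEL_FILE_TRANSFER;
		break;
	case SketchMessageType::SyncCheck:
	case SketchMessageType::SyncError:
		channel = CHANNEL_SYNC;
		break;
	case SketchMessageType::SimulationCommand:
	case SketchMessageType::Other:
		break;
	}
	// Every channel used must be below the channel count given to ENet.
	assert(channel < CHANNEL_COUNT);
	return channel;
}
```

Keeping the mapping in one function means only the channels listed above lose their ordering relative to simulation commands.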
Why chat messages can have their own channel: It is not a problem if the game continues before all chat messages have been received. It is also not a problem if we process chat messages while earlier simulation commands of that client haven't been received yet. Cheats are not sent as chat but parsed locally and then sent as a simulation command (see comment:10:ticket:3545). So there can't be any ordering issues that might cause OOS.
Why downloads can have their own channel: If a client rejoins, it downloads the serialized simulation state of the host in many fragments. Meanwhile the game / simulation continues (which was implemented so that players don't have to wait for the download to finish). Since sequencing is done independently for each client and since the rejoining client can't send any other messages, a separate channel shouldn't be required in theory. However, it is good practice to move concurrent downloads to another channel, and in future we might add other download types. (In that case we should probably use one channel per concurrent download.)
Why OOS-Check messages can have their own channel: The OOS-check messages NMT_SYNC_CHECK and NMT_SYNC_ERROR should also run on a different channel. The Sync-Check and Sync-Error messages carry the turn number and hash value, see NetMessages.h:
START_NMT_CLASS_(SyncCheck, NMT_SYNC_CHECK)
	NMT_FIELD_INT(m_Turn, u32, 4)
	NMT_FIELD(CStr, m_Hash)
END_NMT_CLASS()

START_NMT_CLASS_(SyncError, NMT_SYNC_ERROR)
	NMT_FIELD_INT(m_Turn, u32, 4)
	NMT_FIELD(CStr, m_HashExpected)
END_NMT_CLASS()
The hash matching is already done asynchronously in CNetServerTurnManager::NotifyFinishedClientUpdate (which is also why OOS dumps sometimes contain simulation states of different turns, see #3348):
// Find the newest turn which we know all clients have simulated
// For every set of state hashes that all clients have simulated, check for OOS
...
// Oh no, out of sync
// Tell everyone about it
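The steps named in those comments can be sketched roughly as follows. This is a simplified stand-in and not the actual CNetServerTurnManager code: SyncChecker is a hypothetical class, turn numbers are assumed to start at 1, and each client is assumed to report its turns in ascending order (which still holds on a dedicated channel, since each channel is sequenced independently).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of asynchronous hash matching on the server.
class SyncChecker
{
public:
	explicit SyncChecker(std::vector<int> clientIds)
	{
		assert(!clientIds.empty());
		for (int id : clientIds)
			m_ClientsSimulated[id] = 0; // 0 = no turn reported yet
	}

	// Records the state hash a client reported for a turn, then checks every
	// turn that all clients have finished. Returns the out-of-sync turns.
	std::vector<std::uint32_t> NotifyFinishedClientUpdate(int client, std::uint32_t turn, const std::string& hash)
	{
		assert(m_ClientsSimulated.count(client) == 1);
		m_ClientsSimulated[client] = std::max(m_ClientsSimulated[client], turn);
		m_Hashes[turn][client] = hash;

		// Find the newest turn which we know all clients have simulated.
		std::uint32_t newest = m_ClientsSimulated.begin()->second;
		for (const auto& entry : m_ClientsSimulated)
			newest = std::min(newest, entry.second);

		// For every set of state hashes that all clients have simulated,
		// check for OOS. Turns are stored in ascending order.
		std::vector<std::uint32_t> outOfSync;
		while (!m_Hashes.empty() && m_Hashes.begin()->first <= newest)
		{
			const auto& hashes = m_Hashes.begin()->second;
			const std::string& expected = hashes.begin()->second;
			for (const auto& entry : hashes)
			{
				if (entry.second != expected)
				{
					// Out of sync: the caller would now tell everyone.
					outOfSync.push_back(m_Hashes.begin()->first);
					break;
				}
			}
			m_Hashes.erase(m_Hashes.begin()); // this turn is fully checked
		}
		return outOfSync;
	}

private:
	std::map<int, std::uint32_t> m_ClientsSimulated;                // client -> newest turn
	std::map<std::uint32_t, std::map<int, std::string>> m_Hashes;  // turn -> client -> hash
};
```

Because a turn is only compared once every client has reported it, the check does not depend on how sync messages are ordered relative to simulation commands, which is the property the argument for a separate channel relies on.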
Change History (6)
comment:1 by , 9 years ago
Description: | modified (diff) |
---|
comment:2 by , 9 years ago
Description: | modified (diff) |
---|
comment:3 by , 8 years ago
Priority: | Should Have → If Time Permits |
---|
comment:4 by , 8 years ago
Owner: | removed |
---|
Won't change much since most of the lag is "performance" lag. If actual network lag appears, the user will have a notification #3264.