Opened 9 years ago

Last modified 3 months ago

#3348 reopened defect

OOS dump sometimes contains the simulation state of a later turn

Reported by: elexis Owned by: wraitii
Priority: Must Have Milestone: Backlog
Component: Network Keywords:
Cc: Patch:

Description (last modified by wraitii)

If the host detects an out of sync error, then it sends a CSyncErrorMessage. If the client receives it, it will write the current simulation state to the oos_dump.txt file.

However, it seems that the message sometimes arrives late, i.e. after a client or the host has already progressed one turn. This leads to 'incompatible' oos_dump.txt files being dumped.

Since they stem from the same game but are one turn apart, the dumps look very similar and have almost the same number of entities, but they are not comparable, because many legitimate changes happen within one turn. Most notably, the rng value at the beginning of the dump differs. It should be the same for all clients on the same turn, but it changes nearly every turn, so a differing value indicates that the oos dumps were taken on different turns.

Example: attachment:oos_rejoin_pathfinder_r16876_different_rng_entitycount_waypoints_subdiv-items.7z:ticket:3292

The same was noted by Philip on IRC on February 25th:

(11:12:00) Philip`: If you're comparing two oos_dumps, you might need to verify that they were captured on the same turn
(11:12:26) Philip`: (Sometimes they're not, in multiplayer games, because of the asynchronous nature of the networking)

Possible solutions:

  • don't dump the state if the simulation state has a different turn number (update the oos error message in that case too)
  • don't let any client progress one turn until the server has sent a new message confirming that all players are in sync (might worsen lagging)
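
The first option can be sketched as follows. This is only an illustration: it assumes the sync error notification carries the turn number on which the host detected the mismatch, and the function names `IsDumpComparable` and `DescribeOOS` are hypothetical, not part of the engine.

```cpp
#include <cstdint>
#include <string>

// Hypothetical check: a dump is only worth comparing if the local simulation
// is still on the turn for which the out-of-sync error was reported.
bool IsDumpComparable(std::uint32_t errorTurn, std::uint32_t localTurn)
{
    return errorTurn == localTurn;
}

// Builds the text of the OOS error message, mentioning the turn mismatch
// when the state is not dumped.
std::string DescribeOOS(std::uint32_t errorTurn, std::uint32_t localTurn)
{
    std::string text = "Out of sync on turn " + std::to_string(errorTurn);
    if (!IsDumpComparable(errorTurn, localTurn))
        text += " (local simulation already on turn " + std::to_string(localTurn) + ", state not dumped)";
    return text;
}
```

A caller would write oos_dump.txt only when `IsDumpComparable` returns true and show the result of `DescribeOOS` in either case.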

Change History (7)

comment:1 by historic_bruno, 9 years ago

What about just including the turn number in the dump? If it differs, that will show up clearly and indicate this problem. Of course, ideally it would never happen; I have never seen only the RNG differ. Roughly how often would you say this happens: 1/1000 OOSes, 1/100, 1/10?

comment:2 by elexis, 9 years ago

If the oos dumps were made on different turns, many things differ, which is misleading: devs might think that those diffs have something to do with the actual oos reason.

I'd say it might be one in 25 runs.

And I agree, adding the turn number would be easy and indicate the problem. We might be lazy and only change that to fix the ticket :D
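
A minimal sketch of that idea, assuming a hypothetical header line of the form `turn: <n>` at the top of the dump (the actual dump format is not shown in this ticket, and all function names here are illustrative):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical header format: the first line of the dump is "turn: <n>".
std::string WriteDumpHeader(std::uint32_t turn)
{
    return "turn: " + std::to_string(turn) + "\n";
}

// Reads the turn number from the start of a dump.
// Returns false if the dump does not begin with the assumed header.
bool ReadDumpTurn(const std::string& dump, std::uint32_t& turn)
{
    std::istringstream stream(dump);
    std::string key;
    return (stream >> key >> turn) && key == "turn:";
}

// Two dumps should only be diffed if both carry a header and the turns match.
bool SameTurn(const std::string& dumpA, const std::string& dumpB)
{
    std::uint32_t turnA = 0, turnB = 0;
    return ReadDumpTurn(dumpA, turnA) && ReadDumpTurn(dumpB, turnB) && turnA == turnB;
}
```

With such a header, anyone diffing two dumps sees the mismatch on the first line instead of inferring it from the rng value.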

comment:3 by elexis, 8 years ago

Component: Core engine → Network

(changed component to network)

comment:4 by wraitii, 3 years ago

Owner: set to wraitii
Resolution: fixed
Status: new → closed

In 25170:

Remember OOS on a per-client basis.

Change the OOS notification logic to remember the OOS-ness of each client. Reset it on client leave.
The server will thus continue checking for OOS if the OOS-client leaves.
This is convenient to ignore observer OOS, or wait for an OOS player without restarting the game.

Also add the turn number to the OOS dump, to fix #3348: particularly following rP25001 the turn is likely to not be the same between different clients.

Agree by: asterix

Differential Revision: https://code.wildfiregames.com/D3753
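
The per-client bookkeeping described in the commit message above could look roughly like this. It is a sketch under assumptions, not the code from D3753: the class name `OOSTracker`, its methods, and the use of a string identifier per client are all hypothetical.

```cpp
#include <set>
#include <string>

// Hypothetical tracker that remembers which clients are out of sync.
class OOSTracker
{
public:
    // Returns true only the first time a client is reported out of sync,
    // so the notification is sent once per client.
    bool MarkOOS(const std::string& clientId) { return m_OOSClients.insert(clientId).second; }

    // Forget the client's OOS state when it leaves, so checking can resume.
    void OnClientLeave(const std::string& clientId) { m_OOSClients.erase(clientId); }

    bool IsOOS(const std::string& clientId) const { return m_OOSClients.count(clientId) != 0; }
    bool AnyOOS() const { return !m_OOSClients.empty(); }

private:
    std::set<std::string> m_OOSClients;
};
```

Keeping the state per client rather than as a single flag is what allows the server to keep checking the remaining clients after an out-of-sync observer has left.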

comment:5 by wraitii, 3 years ago

Description: modified (diff)
Milestone: Backlog → Alpha 25
Summary: OOS dump sometimes contains the simulation state of the wrong turn → OOS dump sometimes contains the simulation state of a later turn

The 'problem' was made worse in rP25001: since the network delay was increased, players are more likely to be on different turns at any given time.

I don't think we can fix this better than by making it explicit, which the above diff does.

Closing.

comment:6 by Vladislav Belov, 3 months ago

Resolution: fixed
Status: closed → reopened

That issue is pretty important for debugging OOS. Happens every time in local tests (it's pretty hard to have dumps from the same turn).

Last edited 3 months ago by Vladislav Belov

comment:7 by Vladislav Belov, 3 months ago

Milestone: Alpha 25 → Backlog
Priority: Should Have → Must Have