Opened 9 years ago

Last modified 7 years ago

#3494 new enhancement

[PATCH] Display an OOS error message on each failed rejoin

Reported by: elexis
Owned by:
Priority: Nice to Have
Milestone: Backlog
Component: Network
Keywords: patch, beta
Cc: Imarok
Patch:

Description

The current code stops checking for OOS (out of sync) once any player is OOS. It would be more useful to check again on every rejoin, i.e. to reset that variable whenever a player rejoins.

See also #3293

Attachments (1)

t3494_oos_error_after_each_failed rejoin_v1.patch (4.6 KB) - added by elexis 9 years ago.


Change History (11)

comment:1 by elexis, 9 years ago

Keywords: patch review added
Milestone: Backlog → Alpha 20
Priority: Should Have → Nice to Have
Summary: Display an OOS error message on each failed rejoin → [PATCH] Display an OOS error message on each failed rejoin

Advantages:

  • The error message will inform secondary rejoiners too of being OOS
  • If one player rejoins and is OOS because of a messed-up repository and another player with a clean repository rejoins, then only the first one is OOS. The dialog will inform the players that only one of the two is OOS.

Disadvantages:

  • OOS dumps will be overwritten on each failed rejoin.
    • Shouldn't actually be a big problem, as we need the host's commands.txt anyway to reproduce. Replaying that should always reproduce the serialization error and also create the OOS dumps for host and rejoined client (debug.after.a and debug.after.b).
    • Thus updating the error message.

comment:2 by historic_bruno, 9 years ago

Typically only the first OOS is important; at that point the game is really no longer valid and should be ended after dumping the debug data. The problem with overwriting the OOS dumps is that the simulation state diverges further over time. Some parts may even resync if they were only OOS due to some temporary bug, which confuses the debugging process. Depending on commands.txt introduces a lot of other points of failure in replaying and reconstructing the simulation state (it should be identical, but it won't necessarily always be).

comment:3 by elexis, 9 years ago

Keywords: review removed

Right. How about dumping the OOS only the first time, but displaying the error on every rejoin?

As mentioned above, an OOS error doesn't mean everyone is OOS; only some people (mostly the rejoiners or the guys with outdated repositories) are. If we ended every game immediately upon that error, we would never get to play :p
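
The split suggested in this comment (dump once, notify on each rejoin) could be sketched roughly as follows. This is only an illustration under stated assumptions: `OosTracker`, its methods, and its member names are hypothetical and are not taken from the attached patch or the engine source. Writing the dump and opening the dialog are stood in for by log entries.

```cpp
#include <string>
#include <vector>

// Hypothetical tracker, not the engine's actual class.
class OosTracker
{
public:
    // A client joined or rejoined: re-arm the notification only,
    // the dump guard stays untouched.
    void OnClientRejoin() { m_NotifiedSinceLastJoin = false; }

    // A hash mismatch was detected on the given turn.
    void OnHashMismatch(int turn)
    {
        if (!m_HasDumpedState)
        {
            // First OOS of the match: dump once and never overwrite it.
            m_HasDumpedState = true;
            m_Log.push_back("dump@" + std::to_string(turn));
        }
        if (!m_NotifiedSinceLastJoin)
        {
            // At most one dialog per join.
            m_NotifiedSinceLastJoin = true;
            m_Log.push_back("dialog@" + std::to_string(turn));
        }
    }

    const std::vector<std::string>& Log() const { return m_Log; }

private:
    bool m_HasDumpedState = false;        // set once per match
    bool m_NotifiedSinceLastJoin = false; // reset on every (re)join
    std::vector<std::string> m_Log;       // stand-in for dump/dialog side effects
};
```

With this split, a second mismatch after a rejoin produces a new dialog entry but no second dump, so the earliest debug data is preserved.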

comment:4 by historic_bruno, 9 years ago

It makes sense if players rejoining an OOS game get notified that it's an OOS game. Is that based on hash checks after rejoin or some flag that gets set on the first instance? The distinction is in the case of sim state that goes OOS for maybe 1 turn and then is no longer OOS (tends to happen when something isn't initialized in correct order on deserialization). Not sure which I would prefer.
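
To make the distinction raised here concrete, the following sketch contrasts the two options on a per-turn record of hash checks. The function names and the `std::vector<bool>` representation are assumptions made for illustration only; the engine does not necessarily store results this way.

```cpp
#include <vector>

// Each entry is one checked turn: true if the client's hash matched the host's.

// Option 1: a sticky flag, set on the first mismatch and never cleared.
bool IsOosSticky(const std::vector<bool>& hashMatched)
{
    for (bool matched : hashMatched)
        if (!matched)
            return true;
    return false;
}

// Option 2: only the most recent hash check counts.
bool IsOosLatest(const std::vector<bool>& hashMatched)
{
    return !hashMatched.empty() && !hashMatched.back();
}
```

For a state that is OOS for a single turn and then matches again, the sticky variant keeps reporting OOS while the latest-check variant does not.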

comment:5 by elexis, 9 years ago

As seen in the patch, m_HasSyncError is a boolean flag that indicates whether or not _anyone_ had an OOS, and my suggestion is to reset it once somebody joins, allowing the dialog to open again. Also note that since #3293 the message shows exactly who has a hash different from the host's.

Not only new rejoiners should be notified that their state is OOS in case they are, but also the existing players should know if new rejoiners are _not_ oos.

Knowing who is and who isn't OOS ultimately helps to decide whether or not to cancel the game. If you play with 8 clients and only one of them is OOS, it's often worth continuing.

Imagine the following case:

  1. Start match
  2. Alice joins with a broken client
  3. Everyone gets an OOS message that Alice is OOS
  4. Bob joins the game. Since his repository is updated and he has the same simulation hash as the host, he is not OOS.
  5. Currently no one gets informed that Bob is in sync. The players expect him to be OOS although he is not.
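
The scenario above comes down to comparing each client's simulation hash against the host's after a join and reporting who differs. A minimal sketch, assuming hashes are available as strings keyed by player name (`FindOutOfSyncClients` is a hypothetical helper, not an engine function):

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the names of all clients whose hash differs from the host's.
// An empty result means every listed client is in sync.
std::vector<std::string> FindOutOfSyncClients(
    const std::string& hostHash,
    const std::map<std::string, std::string>& clientHashes)
{
    std::vector<std::string> oos;
    for (const auto& entry : clientHashes)
        if (entry.second != hostHash)
            oos.push_back(entry.first);
    return oos;
}
```

Run once after Alice joins and again after Bob joins, such a check would list only Alice both times, which tells the other players that Bob is not OOS.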

Apart from broken repositories, some OOS errors also disappear after some turns. For example, on a18, #3107 only made the game OOS while a foundation was placed in the fog of war. Once it was built, the game had the same hash again.

I suggest starting the OOS checks only when people rejoin, so that we have at most one error message per join.

comment:6 by elexis, 8 years ago

Milestone: Alpha 20 → Backlog

Backlogging due to lack of progress.

comment:7 by elexis, 8 years ago

Component: Core engine → Network

(set component to network)

comment:8 by scythetwirler, 7 years ago

Keywords: beta added

comment:9 by Imarok, 7 years ago

Cc: Imarok added

comment:10 by Imarok, 7 years ago

We should probably show some GUI sign (e.g. a red exclamation mark) when the game is OOS.
