Opened 13 months ago

Closed 11 months ago

Last modified 11 months ago

#4242 closed enhancement (fixed)

[PATCH] Rejoin-testing tool

Reported by: Itms Owned by: Itms
Priority: Should Have Milestone: Alpha 22
Component: Core engine Keywords: patch
Cc: Patch:

Description (last modified by elexis)

Even though -serializationtest allows us to spot any problem that could happen on rejoin, it's really slow and does not allow us to quickly reproduce the issue.

Attached patch, based on a hack by wraitii and elexis, adds a -rejointest=N cl option that is similar to -serializationtest but exactly simulates a rejoin. On turn N, a secondary simulation is set up with the serialized data from the primary one, and then both run but without deserializing everything on each turn: only states are compared.

This solution is really fast and you only need the host's commands.txt and the turn number where the guest joined (easy to find in their commands.txt).

Using this along with -ooslog makes debugging an OOS far less tedious.

Attachments (5)

rejointest.patch (12.3 KB) - added by Itms 13 months ago.
reproduce_r18756.7z (395.8 KB) - added by elexis 13 months ago.
rejointest_v2.patch (15.2 KB) - added by elexis 13 months ago.
Rebased
rejointest_v2.1.patch (15.2 KB) - added by elexis 13 months ago.
Fixes compiler warning about signed / unsigned comparison.
rejointest_alpha20.patch (13.4 KB) - added by elexis 13 months ago.
Same patch for alpha 20 (for debugging purposes only)

Download all attachments as: .zip

Change History (12)

Changed 13 months ago by Itms

Attachment: rejointest.patch added

comment:1 Changed 13 months ago by elexis

Description: modified (diff)
Keywords: review added; rfc removed
Milestone: Alpha 22Alpha 21

The patch is based on wraitiis patch in attachment:serializationChange.patch:ticket:3292, the ticket is a duplicate of #3460.

  • The readme entry and duplicate comment in the code should be rephrased to mention that a frequent use case of the tool is reproducing an OOS on rejoin where the given N should be the first turn number of the commands.txt file of the rejoined client, that went OOS after the rejoin.
  • Compile warnings:
    ../../../source/simulation2/Simulation2.cpp: In member function ‘void CSimulation2Impl::Update(int, const std::vector<SimulationCommand>&)’:
    ../../../source/simulation2/Simulation2.cpp:364:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      if (m_RejoinTestTurn == m_TurnNumber)
                           ^
    ../../../source/simulation2/Simulation2.cpp:378:52: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      if (m_EnableSerializationTest || m_RejoinTestTurn == m_TurnNumber)
    
  • Sure that SAFE_DELETE wouldn't be preferable (convention perhaps?), even if it's never null?
  • Thoroughly tested: The patch is in essence still the same as in alpha 19 when #3292 solved with it. I tested it again today, did some rejoins until one finally triggered an OOS. I used -rejointest with the turnnumber of the client's commands.txt file and could reproduce the OOS. After applying the fix to the OOS I was testing (r18752, #4239), the rejointest passed. The rejoinmode produces the binary and textual simstate and hashes before and after the turn (exactly like the serializationtestmode). As I could exactly reproduce the problem based on the generated data, I have to assume this code to be correct from a blackbox test. Reading the code also shows me that it's very user-friendly implemented (in contrast to the prior WIPs).

Can we have alpha 21 OOS-debuggable out of the box?

Last edited 13 months ago by elexis (previous) (diff)

Changed 13 months ago by elexis

Attachment: reproduce_r18756.7z added

comment:2 Changed 13 months ago by Itms

Milestone: Alpha 21Alpha 22

I don't want 1) to waste time (mine or anyone else's) on completing this patch before FF 2) to risk breaking the serializationtest mode just before a release (this patch still touches quite a lot of things)

As it's a thing used by devs, it's not a big deal to just apply the patch.

Changed 13 months ago by elexis

Attachment: rejointest_v2.patch added

Rebased

Changed 13 months ago by elexis

Attachment: rejointest_v2.1.patch added

Fixes compiler warning about signed / unsigned comparison.

Changed 13 months ago by elexis

Attachment: rejointest_alpha20.patch added

Same patch for alpha 20 (for debugging purposes only)

comment:3 Changed 12 months ago by elexis

Keywords: rfc added; review removed
  • It were great if we could have the Serialization test mode does not support random maps error on turn 1, instead of simulating the first N turns and then failing.
  • Also thought about giving two arguments (host- and rejoined client replay) but that's not worth the effort to implement

comment:4 Changed 12 months ago by elexis

  • A logmessage / debug_printf should be added to indicate the start of the rejointest

comment:5 Changed 12 months ago by elexis

  • Serialization test failure should say Rejointest failure

comment:6 Changed 11 months ago by elexis

Resolution: fixed
Status: newclosed

In 18940:

An awesome Rejoin-test tool by wraitii and Itms, fixes #4242, refs #3460.

Contrary to the serializationtest, initializes the secondary simstate only once
before progressively applying updates.
Thus reproducing actual multiplayer rejoining, enabling developers to
analyze OOS reports solely from the replay file of the rejoined and a non-rejoined client.

comment:7 Changed 11 months ago by elexis

Keywords: rfc removed

Had been looking forward for 13 months to see this feature being committed. :)

  • The int64_t discussion ended with a "hmm" as I recall correctly after laying out that we need to consider -1 in that part if we don't want to add a boolean member.
  • Adding the debug_printf to the start of the rejointest
  • Not adding a debug_printf for "Rejointest failure", as both tests should be considered a serializationtest
Note: See TracTickets for help on using tickets.