Opened 8 years ago

Last modified 5 years ago

#4242 closed enhancement

[PATCH] Rejoin-testing tool — at Version 1

Reported by: Itms
Owned by: Itms
Priority: Should Have
Milestone: Alpha 22
Component: Core engine
Keywords: patch
Cc:
Patch:

Description (last modified by elexis)

Even though -serializationtest allows us to spot any problem that could happen on rejoin, it's really slow and does not allow us to quickly reproduce the issue.

The attached patch, based on a hack by wraitii and elexis, adds a -rejointest=N command-line option that is similar to -serializationtest but simulates a rejoin exactly. On turn N, a secondary simulation is set up from the serialized data of the primary one. Both simulations then run side by side, without deserializing everything on each turn: only their states are compared.
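
The mechanism described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: ToySim, its fields, and RunRejoinTest are hypothetical names invented for illustration, and the "simulation" is a two-integer toy. The assumption is only that a rejoin test copies the serialized primary state into a secondary simulation on turn N and then compares the serialized states after every later turn.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy simulation standing in for the real engine state.
struct ToySim
{
    std::uint32_t gold = 0;   // state that is always serialized
    std::uint32_t cache = 0;  // state that a buggy component forgets to serialize
    bool buggy = false;       // when true, `cache` is left out of Serialize()

    // Advance one turn; the result depends on both fields.
    void Update(std::uint32_t command)
    {
        cache += command;
        gold += command + (cache % 3);
    }

    std::vector<std::uint32_t> Serialize() const
    {
        if (buggy)
            return {gold};
        return {gold, cache};
    }

    void Deserialize(const std::vector<std::uint32_t>& data)
    {
        gold = data.at(0);
        cache = data.size() > 1 ? data[1] : 0;
    }
};

// Runs the commands on a primary simulation. On turn `rejoinTurn`, a secondary
// simulation is created from the primary's serialized state, as a rejoining
// client would be. Afterwards both are updated and only their serialized
// states are compared. Returns the first turn after which the states differ,
// or -1 if they never differ.
int RunRejoinTest(ToySim primary, const std::vector<std::uint32_t>& commands, std::size_t rejoinTurn)
{
    assert(rejoinTurn < commands.size());

    ToySim secondary;
    bool rejoined = false;

    for (std::size_t turn = 0; turn < commands.size(); ++turn)
    {
        if (turn == rejoinTurn)
        {
            secondary.buggy = primary.buggy;
            secondary.Deserialize(primary.Serialize());
            rejoined = true;
        }

        primary.Update(commands[turn]);
        if (!rejoined)
            continue;

        secondary.Update(commands[turn]);
        if (primary.Serialize() != secondary.Serialize())
            return static_cast<int>(turn);
    }
    return -1;
}
```

In this toy, a simulation that serializes all of its state stays in sync after the simulated rejoin, while one that omits `cache` diverges as soon as the missing value influences `gold`. That is the class of bug the rejoin test is meant to expose.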

This solution is really fast and you only need the host's commands.txt and the turn number where the guest joined (easy to find in their commands.txt).

Using this along with -ooslog makes debugging an OOS far less tedious.

Change History (3)

by Itms, 8 years ago

Attachment: rejointest.patch added

comment:1 by elexis, 8 years ago

Description: modified (diff)
Keywords: review added; rfc removed
Milestone: Alpha 22 → Alpha 21

The patch is based on wraitii's patch in attachment:serializationChange.patch:ticket:3292. The ticket is a duplicate of #3460.

  • The readme entry and the duplicated comment in the code should be rephrased to mention a frequent use case of the tool: reproducing an OOS on rejoin. In that case, the given N should be the first turn number in the commands.txt file of the rejoined client that went OOS after the rejoin.
  • Compile warnings:
    ../../../source/simulation2/Simulation2.cpp: In member function ‘void CSimulation2Impl::Update(int, const std::vector<SimulationCommand>&)’:
    ../../../source/simulation2/Simulation2.cpp:364:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      if (m_RejoinTestTurn == m_TurnNumber)
                           ^
    ../../../source/simulation2/Simulation2.cpp:378:52: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      if (m_EnableSerializationTest || m_RejoinTestTurn == m_TurnNumber)
    
  • Are you sure that SAFE_DELETE wouldn't be preferable (by convention, perhaps), even if it's never null?
  • Thoroughly tested: The patch is in essence still the same as in alpha 19, when #3292 was solved with it. I tested it again today and did some rejoins until one finally triggered an OOS. I used -rejointest with the turn number from the client's commands.txt file and could reproduce the OOS. After applying the fix for the OOS I was testing (r18752, #4239), the rejoin test passed. The rejoin mode produces the binary and textual simstate and hashes before and after the turn (exactly like the serialization test mode). As I could exactly reproduce the problem based on the generated data, I have to assume from this black-box test that the code is correct. Reading the code also shows me that it is implemented in a very user-friendly way (in contrast to the prior WIPs).
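
One possible way to silence the -Wsign-compare warnings quoted above is sketched here. The types are an assumption, since the ticket does not show the declarations: the sketch supposes that the rejoin turn is a signed int in which a negative value means "disabled" and that the turn counter is an unsigned 32-bit integer. IsRejoinTestTurn is a hypothetical helper, not a function from the patch.

```cpp
#include <cstdint>

// Hypothetical helper. Assumes a negative rejoinTestTurn means the rejoin
// test is disabled, and that the turn counter is unsigned.
bool IsRejoinTestTurn(int rejoinTestTurn, std::uint32_t turnNumber)
{
    // Handle the "disabled" case first. Without this check, a mixed
    // signed/unsigned comparison would convert -1 to 4294967295 and could
    // match a real turn number.
    if (rejoinTestTurn < 0)
        return false;

    // The value is now known to be non-negative, so the cast preserves it
    // and both operands of the comparison are unsigned.
    return static_cast<std::uint32_t>(rejoinTestTurn) == turnNumber;
}
```

An alternative under the same assumptions would be to store both values with the same signedness, which removes the need for a cast at each comparison.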

Can we have alpha 21 OOS-debuggable out of the box?

Last edited 8 years ago by elexis

by elexis, 8 years ago

Attachment: reproduce_r18756.7z added