Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#4316 closed defect (fixed)

[PATCH] OOS on rejoin - recursive template loading (temple of vesta)

Reported by: elexis Owned by: leper
Priority: Release Blocker Milestone: Alpha 22
Component: UI & Simulation Keywords: patch
Cc: Patch:

Description

In the following game with r18902 a client rejoined and triggered an OOS many turns later. It is reproducible with the rejointest.

Attachments (4)

uran_oos.7z (1.9 MB ) - added by elexis 4 years ago.
Thanks a lot to Uran finding the replay!!
danger89_logs.tar.gz (1.8 MB ) - added by Melroy van den Berg 4 years ago.
some extra crashes and oos (version 20), could be helpful?
commands_mini.txt (5.0 KB ) - added by Itms 4 years ago.
0001-Reset-ParamNode-s-script-cache-in-a-few-edge-cases.-.patch (6.2 KB ) - added by leper 4 years ago.

Change History (19)

by elexis, 4 years ago

Attachment: uran_oos.7z added

Thanks a lot to Uran finding the replay!!

by Melroy van den Berg, 4 years ago

Attachment: danger89_logs.tar.gz added

some extra crashes and oos (version 20), could be helpful?

comment:1 by elexis, 4 years ago

Summary: OOS on rejoin - Ishtar GateOOS on rejoin - Missing loyalty regen aura

Entity 9956, the temple of maxticatrix got the loyalty aura at the host but not for the rejoined client. Rejoin started turn 2287, the foundation was placed on turn 2680, construction finished on roughly turn 2786, OOS on turn 2786.

Last edited 4 years ago by elexis (previous) (diff)

comment:2 by Itms, 4 years ago

Here is a minimal reproducing test case. Doing a rejointest from turn 2 reproduces the problem.

by Itms, 4 years ago

Attachment: commands_mini.txt added

comment:3 by Itms, 4 years ago

More information: the problem happens if the rejoining happens before the foundation is placed. If you rejoin on turn 4 of the minimal test, the problem won't happen.

comment:4 by elexis, 4 years ago

Reverting r18753 seems to solve it

comment:5 by elexis, 4 years ago

In 18911:

Revert r18753 as it revealed an OOS on rejoin in the engine code, refs #4232 #4316.

comment:6 by elexis, 4 years ago

When Init of the Auras component of the temple of vesta is called for the rejoined client, this.GetAuraNames() which parses this.template already returns the wrong list of auras. I suspect a bug in ParamNode.cpp.

Last edited 4 years ago by elexis (previous) (diff)

comment:7 by elexis, 4 years ago

Milestone: Alpha 21Alpha 22
Priority: Must HaveRelease Blocker
Summary: OOS on rejoin - Missing loyalty regen auraOOS on rejoin - recursive template loading (temple of vesta)

comment:8 by leper, 4 years ago

I can't reproduce this when reverting r18911 locally and applying v2.1 from #4242 and running with /pyrogenesis -mod=public -replay=/path/to/commands_mini.txt -rejointest=2 (or 3, or 4). Am I missing something? (Clean build, cleaned cache recently, nothing in the user mod, no other mods mounted).

comment:9 by elexis, 4 years ago

Nope, that's the right way to reproduce it. Update to r18910, executing pyrogenesis with the arguments you posted and waiting until turn 46. It should end with the serializationtestfailure message and write those dumps.

comment:10 by leper, 4 years ago

Got it to work, dunno why it wasn't working earlier.

There seems to be an issue with not resetting the cache in CParamNode, as the call that gets the wrong contents (even though the paramNode itself has the correct ones) returns the cached value which is outdated. So the correct fix would be finding out where we do not reset the cache. (Disabling the caching, fixes the problem, but that might make repeated conversions of the same value (that we are likely to do in a few places, unless someone proves me wrong) slower and shouldn't be done).

comment:11 by elexis, 4 years ago

Thanks for finding that out, I got lost in that file. Disabling cache sounds like a workaround to me too. The only other place of this kind of recursive template loading I could find was the promotion system, but I assume we would have found the bug much earlier if it were directly affected.

(Perhaps you had reverted r18753 manually without reverting the production queue change added in r18911)

Last edited 4 years ago by elexis (previous) (diff)

comment:12 by leper, 4 years ago

Keywords: patch review added
Owner: set to leper
Status: newassigned

This might reset the cache a bit more than needed, but as that is only setting something to NULL it shouldn't be that bad.

Also note that text only elements with attributes combined with inheritance means that the text itself is lost. (The test can actually be committed before the fix, or in case someone wants to test this.)

comment:13 by elexis, 4 years ago

In 18941:

Add a template manager test that reveals an inheritance error in ParamNode.cpp which can trigger an OOS on rejoin. Patch by leper, refs #4316.

comment:14 by elexis, 4 years ago

Resolution: fixed
Status: assignedclosed

In 18942:

Fix an OOS on rejoin caused by ParamNode.cpp using an invalid node cache. Patch by leper, fixes #4316.

Reuse the temple of vesta inheritance from r18753 which was affected by this bug.

comment:15 by elexis, 4 years ago

Keywords: review removed
Summary: OOS on rejoin - recursive template loading (temple of vesta)[PATCH] OOS on rejoin - recursive template loading (temple of vesta)

Thanks for your patch and the proper approach to write a test that fails before adding the fix! (Don't want to imagine how long it would have taken without your contribution!)

  • Fixed mixed leading whitespace in the three `</Test2A> occurances indicated (by this script that is triggered before browsing a diff here)
  • Added svn native lineending property for the new files and the other ones in the same directory.

Edit: Should have added your git commit message to the svn commit.

Last edited 4 years ago by elexis (previous) (diff)
Note: See TracTickets for help on using tickets.