Opened 9 years ago

Closed 8 years ago

Last modified 8 years ago

#3450 closed defect (fixed)

[PATCH] OOS on rejoin - UnitMotion - passability map changed recently

Reported by: elexis Owned by: wraitii
Priority: Release Blocker Milestone: Alpha 19
Component: Core engine Keywords:
Cc: Patch:

Description (last modified by elexis)

Got this oos on rejoin on r17067.

The only difference is passability map changed recently in UnitMotion.

I rejoined twice, no commands were issued before the first rejoin and it didn't give an OOS. But then I moved the unit three times, rejoined and got this error.

The serializationtest fails on turn 37.

Attachments (6)

commands.txt_oos_dump_diff_r17067.7z (217.4 KB ) - added by elexis 9 years ago.
commands.txt (1.8 KB ) - added by elexis 9 years ago.
Minimal commands.txt for r17078.
OOS_fix.patch (957 bytes ) - added by Itms 9 years ago.
commands.2.txt (28.9 KB ) - added by elexis 9 years ago.
Reproduces the error on r17100.
oosdump_micro_r17129.7z (468.7 KB ) - added by elexis 9 years ago.
Building house on turn 1. OOS on turn 2. Not OOS afterwards!
t3450_do_not_serialize_m_PassabilityMapChangedRecently_v1.patch (894 bytes ) - added by elexis 9 years ago.
Resolves the discrepancy by just not serializing that value. Works all commands.txt files linked but needs further testing. Could there be a case where serializing that state is required?

Download all attachments as: .zip

Change History (27)

comment:1 by Itms, 9 years ago

Owner: set to Itms

comment:2 by elexis, 9 years ago

Description: modified (diff)
Summary: OOS on rejoin - UnitAI - passability map changed recentlyOOS on rejoin - UnitMotion - passability map changed recently

The passability map changed recently part in UnitMotion of entity 2393 is wrong. It is the id of the entity that was moved. It is true for the rejoined client, but false for the host.

comment:3 by elexis, 9 years ago

Refs r16998

comment:4 by elexis, 9 years ago

Reproduce: Move a unit and rejoin or do a serializationtest on that commands.txt.

Last edited 9 years ago by elexis (previous) (diff)

by elexis, 9 years ago

Attachment: commands.txt added

Minimal commands.txt for r17078.

comment:5 by elexis, 9 years ago

Contrary to the host, the rejoined client enters GetPassabilityGrid with m_Grid not being initialized, thus calling UpdateGrid(), which in the end emits the CMessagePassabilityMapChanged message which sets that passability map changed recently variable to true.

comment:6 by Itms, 9 years ago

Keywords: patch review added
Summary: OOS on rejoin - UnitMotion - passability map changed recently[PATCH] OOS on rejoin - UnitMotion - passability map changed recently

Having fix ideas in the shower is a nice way to start the day.

by Itms, 9 years ago

Attachment: OOS_fix.patch added

comment:7 by elexis, 9 years ago

Works for me. Did my comment help? x)

comment:8 by FeXoR, 9 years ago

Does terrain flattening influence recalculated passability maps? (AFAIK is is not supposed to so that's why this question on recalculated passability maps in general)

comment:9 by Itms, 9 years ago

elexis: not really because I got the idea before reading your comment, but it's nice if you get to know the pathfinding code a tad more :)

FeXoR: we don't have terrain flattening, and now that you mention it it will become even harder to have it with the new pathfinder.

comment:10 by Stan, 9 years ago

Noooooooooooooooooooooo...

If we can't have #2264 that means we cannot have bridges either, doesn't it ? :(

comment:11 by Itms, 9 years ago

Resolution: fixed
Status: newclosed

In 17097:

Fix a cause of serialization problems. Fixes #3450.

comment:12 by Itms, 9 years ago

Keywords: review removed

by elexis, 9 years ago

Attachment: commands.2.txt added

Reproduces the error on r17100.

comment:13 by elexis, 9 years ago

Resolution: fixed
Status: closedreopened

comment:14 by elexis, 9 years ago

Use the patch in #3460 to do the serializationtest only on the affected turn 337. If I see this correctly, the bug occured everytime one moved a unit before r17097, now it occurs less frequently.

comment:15 by Itms, 9 years ago

Owner: Itms removed
Status: reopenednew

I won't have time for programming anymore during the upcoming months.

comment:16 by elexis, 9 years ago

Reproduce: Build something.

by elexis, 9 years ago

Attachment: oosdump_micro_r17129.7z added

Building house on turn 1. OOS on turn 2. Not OOS afterwards!

comment:17 by elexis, 9 years ago

That cheat is not even necessary. Placing a foundation and waiting one turn is sufficient to trigger the OOS.

Use #3460 to only check after the second turn (i.e. m_TurnNumber > 1). The serializationtest will not fail if you skip checking the first two turns.

This means the OOS endures only one turn. Made some brief multiplayer tests and didn't get the OOS. This is consistent with the theory of having that OOS only for one turn.

This probably means that the message in r17097 (= commit above) arrives one turn late.

comment:18 by elexis, 9 years ago

By adding debug output, you can see that in the primary simulation (== host), the CMessagePassabilityMapChanged is both sent and received on turn 1 (the turn where the foundation was placed). But in the secondary simulation (== rejoined client), the message is not being sent at all!

Turn 1 (200)...
PRIMARY SIMULATION
	SENDING MESSAGE
	UnitMotion MessageReceived: m_PassabilityMapChangedRecently = true
SECONDARY SIMULATION
Simulation2.cpp(323): Assertion failed: "0 && (L"Serialization test failure")"

Thus m_PassabilityMapChangedRecently will never be set true for the rejoined client, while it is true for the host.

As the comment added in r17097 / r16998 mentions, the message is only being sent if !m_ObstructionsDirty.globallyDirty:

// Notify the units that their current paths can be invalid now.
// The passability map will be globally dirty after init, or rejoining, loading a saved game, etc.
// In those situations the paths can't be invalid.

It might be true that there is no need to inform the units that their path can be invalid. Yet we need to have the same value for PassabilityMapChangedRecently (otherwise OOS, despite having the same waypoints).

by elexis, 9 years ago

Resolves the discrepancy by just not serializing that value. Works all commands.txt files linked but needs further testing. Could there be a case where serializing that state is required?

comment:19 by elexis, 9 years ago

Keywords: review added

comment:20 by wraitii, 8 years ago

Owner: set to wraitii
Resolution: fixed
Status: newclosed

In 17152:

Revert the logic change in r16998. This commit removed the checks in UnitMotion against structures, which should have been fine except the short-range pathfinder and the long-range pathfinder are not entirely compatible (check out #3532 for details). This behavior was probably slightly optimized, but it was too clever for its own good in the current state of the pathfinder, might be reintroduced later.

This resulted in ALL "units inside obstructions" issues.

Thanks to elexis for the testing.

Fixes #3532, #3450.
Refs #3538 (still OOSes), #3410 (unitmotion remains buggy for formations, but this is only aesthethic.)
Probably affects #3471 and #3505, but those are not fixed.

comment:21 by wraitii, 8 years ago

Keywords: patch review removed

FYI this has been fixed by removing the message entirely, the core origin of the OOS has NOT been fixed. I haven't looked enough into the code to guarantee that a similar function won't cause a similar OOS in the future, but for now we are good.

Note: See TracTickets for help on using tickets.