#3271 closed defect (fixed)
[PATCH] OOS on rejoin - different mirage order
Reported by: | elexis | Owned by: | Itms |
---|---|---|---|
Priority: | Must Have | Milestone: | Alpha 19 |
Component: | Core engine | Keywords: | patch |
Cc: | Patch: |
Description (last modified by )
This out of sync error happened when I rejoined an a18 match. The only real difference is that two entities have swapped IDs. Their properties are identical as far as I have checked (the tree even has the same amount of wood in both versions).
--- oos_dump_ffm.txt 2015-05-27 01:12:35.000000000 +0200 +++ oos_dump_elexis.txt 2015-05-27 01:12:35.000000000 +0200 @@ -9318,15 +9318,15 @@ id: 6743 template: "mirage|gaia/flora_tree_oak_aut" id: 6744 template: "foundation|structures/rome_storehouse" id: 6745 - template: "mirage|gaia/flora_tree_oak_aut" + template: "mirage|foundation|structures/rome_storehouse" id: 6746 template: "mirage|gaia/flora_tree_oak_aut" id: 6747 - template: "mirage|foundation|structures/rome_storehouse" + template: "mirage|gaia/flora_tree_oak_aut" id: 6748 template: "mirage|gaia/flora_tree_oak_aut" id: 6749 template: "mirage|gaia/flora_tree_oak_aut" id: 6750
The other differences are follow ups (properties for the wrong IDs of those two entities) and irrelevant local entities (waypoint flags & dead units) at the end of the log.
Attachments (10)
Change History (27)
by , 9 years ago
Attachment: | t3271_oos_dump_and_commands.txt.7z added |
---|
comment:1 by , 9 years ago
Description: | modified (diff) |
---|---|
Summary: | OOS on rejoin - mirages → OOS on rejoin - swapped IDs |
comment:2 by , 9 years ago
by , 9 years ago
Attachment: | t3271_oos_dump_and_complete_commands_and_map.7z added |
---|
Luckily ffm found the commands.txt file from that time, so I can now provide the full commands.txt file of the host, the skirmish map that is equivalent to that random map and the modified commands.txt file, so that you can actually replay that match in serializationtestmode! Remember it was an a18 match.
by , 9 years ago
Attachment: | t3271_serializationtest_dumps.7z added |
---|
When you apply this patch http://trac.wildfiregames.com/attachment/ticket/3460/t3460_preview_wip_v0.1.patch to alpha 18 and do a serializationtest on the modified commands.txt
file using that custom map, then you will obstain those dumps. Notice they also contain the visibilities error of #3107, which is to be ignored. If you diff debug.after.a
and debug.after.b
, you will see that the exact same two entities mentioned above are different here too. So this counts as a full reproduce with a18!
by , 9 years ago
Attachment: | t3271_serializationtest_dumps_cleaned.7z added |
---|
If you also apply this patch http://trac.wildfiregames.com/attachment/ticket/3107/t3107_bugfix_a18.patch to alpha 18 on top of the other, then the visibilities bug #3107 will not appear in the oosdump. Thus you get a more minimal diff with only that bug.
by , 9 years ago
Attachment: | t3271_oosdump_turn_2906.txt.7z added |
---|
Dump of the simulation state one turn prior to the rejoin.
comment:3 by , 9 years ago
Summary: | OOS on rejoin - swapped IDs → OOS on rejoin - different mirage order |
---|
Using attachment:visual_replay_a18_megapatch.patch:ticket:9 you can visually replay the game.
Notice you can open the javascript console by hitting F9 and type selectAndMoveTo(6745);
or selectAndMoveTo(6747);
to see those two mirages. In visual replay, 6745 is the tree and 6747 the dock. The mirage of the storehouse is available only for a very short time (after the foundation is placed and before the building was finished). This occurs minute 23/24.
The tree mirage is located near the dock of player 1 (blue) and exists after min 23/24 when the red player walks past that dock with roughly 20 units.
Observation 1: entity 6745 and 6747 are swapped for rejoined client and host Observation 2: If you compare the dump of turn 2906 and 2907, you will see that exactly those four entities were added in that turn:
> id: 6744 > template: "foundation|structures/rome_storehouse" > id: 6745 > template: "mirage|foundation|structures/rome_storehouse" > id: 6746 > template: "mirage|gaia/flora_tree_oak_aut" > id: 6747 > template: "mirage|gaia/flora_tree_oak_aut"
Observation 3: It was player 2 (enemy of player 5) who brought the tree mirages into existance, and it is player 5 who placed the storehouse foundation.
For the rejoined client (observer), the mirage of the foundation is added immediately, while for the host and other clients it is added last.
comment:4 by , 9 years ago
You can print a stacktrace in Fogging.prototype.Init
and Fogging.prototype.LoadMirage
of Fogging.js
using error("\nINIT " + this.entity + "\n" + new Error().stack);
. It shows which part of the code initiated the mirage:
Turn 2907 (500)... ERROR: INIT 6744 ERROR: Fogging.prototype.Init@simulation/components/Fogging.js:14:2 TryConstructBuilding@simulation/helpers/Commands.js:830:6 commands.construct@simulation/helpers/Commands.js:342:3 ProcessCommand@simulation/helpers/Commands.js:37:3 ERROR: INIT 1354 ERROR: Fogging.prototype.LoadMirage@simulation/components/Fogging.js:59:2 Fogging.prototype.OnVisibilityChanged@simulation/components/Fogging.js:219:3 ERROR: INIT 1354 ERROR: Fogging.prototype.LoadMirage@simulation/components/Fogging.js:59:2 Fogging.prototype.OnVisibilityChanged@simulation/components/Fogging.js:219:3 ERROR: INIT 6744 ERROR: Fogging.prototype.LoadMirage@simulation/components/Fogging.js:59:2 Fogging.prototype.OnVisibilityChanged@simulation/components/Fogging.js:219:3
So clearly the OnVisibilityChanged
message is processed earlier for the host.
comment:5 by , 9 years ago
As mentioned above, the game went oos on turn 2907 and only those 4 entities were added in that turn (storehouse, its foundation and 2 tree mirages).
As can be seen in the original oos-dump-diff, the host (ffm) creates the storehouse-foundation mirage after the tree mirages are created, while the rejoined client (me) creates the mirage for the storehouse first.
So the easiest fix for this particular scenario would be to create the mirage of the foundation immediately after placing the foundation. There might be other scenarios where mirages are created in different orders though.
comment:6 by , 9 years ago
Priority: | Release Blocker → Must Have |
---|
In my opinion not a release blocker in comparison to the other ones, as this one occured rarely on alpha 18, while the other a19 release blockers occur all the time.
But since the solution seems to lie at hand, lets try to solve this in a19.
comment:7 by , 8 years ago
How is the list of mirages created being stored? It sounds like the serializer somehow loses the order of a list.
by , 8 years ago
Attachment: | reproduce_r17211.7z added |
---|
Here a commands.txt with a custom scenario map where the serializationtest fails on turn 135, r17211.
comment:9 by , 8 years ago
The oos-dump diff of the new sample:
id: 2617 - template: "mirage|foundation|structures/cart_civil_centre" - id: 2618 template: "mirage|foundation|structures/gaul_civil_centre" + id: 2618 + template: "mirage|foundation|structures/cart_civil_centre"
by , 8 years ago
Attachment: | commands_minimized.txt added |
---|
Minimized version of the replay above (only 9 commands)
comment:10 by , 8 years ago
What happens in that replay:
- turn 0: everyone scouts
- turn 50: red and blue place CC foundation
- turn 51+: teal cav scouts the foundation of red's cc
- turn 114: teal cav scouts the foundation of blue's cc
- turn 135:
- teal cav moved away from blue CC. The foundation enters it's FOW.
- black sets CC foundation.
- serializationtesterror.
Entities in the visual replay of the minimized commands.txt: 2599: foundation of blue's cc 2600: foundation of red's cc 2601: construction site of blue's cc 2602: construction site of red's cc 2603: teal's mirage of red's cc 2604: foundation of black's cc 2605: teal's mirage of blue's cc 2606: black's mirage of black's cc
Those should be equivalent to the host's version.
Acccording to the oosdumps of the serializationtesterror: host on turn 135:
id: 2605 template: "mirage|foundation|structures/cart_civil_centre" (blue) <- teal's mirage created when the blue's CC enters it's FOW id: 2606 template: "mirage|foundation|structures/gaul_civil_centre" (black) <- black's foundation placed this turn
rejoined client on turn 135:
id: 2605 template: "mirage|foundation|structures/gaul_civil_centre" (black) <- black's foundation placed this turn id: 2606 template: "mirage|foundation|structures/cart_civil_centre" (blue) <- teal's mirage created when the blue's CC enters it's FOW
Again the conclusion is, that the host first processes the FOW-mirage creation, while the rejoined client first processes the newly placed foundation.
comment:11 by , 8 years ago
The only way I see that happening is if commands are processed in a different order, or if messages are processed in a different order.
by , 8 years ago
comment:12 by , 8 years ago
This fixes the issue, caused by the global flag being set only for the rejoining player.
I don't think there's a much simpler fix given the number of functions those can call, I guess we could "skip" the modified entities in the global reprocessing? Is it worth it?
comment:13 by , 8 years ago
Okay scrap the above patch, it's completely buggy.
The only ways I see to fix this are to either:
-serialize everything needed so that a call to UpdateVisibilityData is exactly similar (m_dirtyVisibility...). This implies some changes to ResetDerivedData. I'm also not entirely sure it'd work. But it might still be a good, sane idea.
-make the code less reliant on orders by not sending messages immediately but storing them in a vector of some kind, then sort and make unique that vector and send messages. This should fix the OOS at a rather low cost if the number of messages at any time remains low.
comment:14 by , 8 years ago
Keywords: | patch review added |
---|---|
Summary: | OOS on rejoin - different mirage order → [PATCH] OOS on rejoin - different mirage order |
Here is a patch fixing the issue.
As understood by wraitii, assuming everything is dirty on deserialization is not always the good thing to do and it's best to make sure everything is the same when calling UpdateVisibilityData.
The attached patch replaces m_GlobalVisibilityUpdate = true
by a real serialization of the dirtiness values. To achieve that, I had to add a "deserialization in progress" flag to the range manager and I also used it to remove an old function parameter.
This patch should be checked for bugs (elexis?) and for style, especially that new flag (leper?) ;)
by , 8 years ago
Attachment: | visibility_serialization.patch added |
---|
comment:16 by , 8 years ago
Keywords: | review removed |
---|
Maybe related to #3375 ?