Opened 8 years ago

Last modified 6 years ago

#4239 closed defect

[PATCH] Serialize the data in DataTemplateManager to avoid OOS on rejoin — at Version 6

Reported by: Itms Owned by: Itms
Priority: Release Blocker Milestone: Alpha 21
Component: UI & Simulation Keywords: patch
Cc: Patch:

Description (last modified by elexis)

The data in DataTemplateManager (r18100) is only read from disk, so we don't serialize it. However, after a rejoin, if the host had already loaded a template, the rejoined client creates a new JS object the next time that template is loaded (it has the same content as the host's object, but it is not the same object).

Upon serialization, the objects will have different backrefs on the two instances, leading to a difference in the binary simstates.
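
To illustrate the backref effect, here is a minimal sketch of a toy serializer. It is not the engine's real binary format: the function name, the tag values and the output layout are assumptions made for illustration only. It shows two states with identical content producing different serialized output because one of them shares an object and the other holds a copy.

```javascript
// Toy serializer, NOT the engine's real format. Tag values are illustrative.
const TAG_OBJECT = 3;
const TAG_BACKREF = 8;

// Writes objects in full the first time they are seen and as a
// backreference (tag + index) every time they are seen again.
function serialize(value, seen = new Map(), out = []) {
  if (typeof value !== "object" || value === null) {
    out.push(JSON.stringify(value));
    return out;
  }
  if (seen.has(value)) {
    out.push(TAG_BACKREF, seen.get(value));
    return out;
  }
  seen.set(value, seen.size);
  const keys = Object.keys(value);
  out.push(TAG_OBJECT, keys.length);
  for (const key of keys) {
    out.push(key);
    serialize(value[key], seen, out);
  }
  return out;
}

const template = { type: "range", radius: 40 };

// Host: both entries reference the single cached template object.
const hostState = { a: template, b: template };

// Rejoined client: "b" was read from disk again, so it is a separate object.
const clientState = { a: template, b: JSON.parse(JSON.stringify(template)) };

// A textual dump cannot tell the two states apart...
const sameContent = JSON.stringify(hostState) === JSON.stringify(clientState);

// ...but the backref-aware output can.
const hostBytes = JSON.stringify(serialize(hostState));
const clientBytes = JSON.stringify(serialize(clientState));
```

Here sameContent is true while hostBytes and clientBytes differ: the host writes a backref for "b", the client writes a full object.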

Change History (7)


comment:1 by Itms, 8 years ago

Resolution: fixed
Status: new → closed

In 18752:

Fix a frequent rejoining OOS. We actually need to serialize the data loaded from the disk, because JS objects in the memory or newly loaded from the disk will not behave the same way.

Fixes #4239, refs #3834.

comment:2 by Itms, 8 years ago

Keywords: review removed

comment:3 by elexis, 8 years ago

How he found out:

Itms joined my hosted game with -ooslog enabled and rejoined until the game went OOS. He then produced the binary simstate of the host using -ooslog and compared the binary simstates (since the textual ones were identical).

In my reproduction, the first differing byte was near a "female_inspiration" string, thus occurring in the Aura component. This byte is 0x03 in one simstate and 0x08 in the other. The first byte of a serialized value is its data type as defined in SerializedScriptTypes.h, i.e. the two bytes stand for an object type and a backref respectively.
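
The comparison step can be sketched as follows. This is only an illustration: the function name and the sample bytes are made up, and the lookup table contains only the two type values mentioned in this comment.

```javascript
// Returns the offset of the first differing byte, or -1 if both inputs are equal.
// If one input is a prefix of the other, returns the length of the shorter one.
function firstDifference(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; ++i)
    if (a[i] !== b[i])
      return i;
  return a.length === b.length ? -1 : n;
}

// Only the two type bytes named in this ticket; other values are left out on purpose.
const TYPE_NAMES = { 0x03: "object", 0x08: "backref" };

// Made-up sample data standing in for two binary simstates.
const host = Uint8Array.from([0x04, 0x61, 0x03, 0x01]);
const client = Uint8Array.from([0x04, 0x61, 0x08, 0x01]);

const offset = firstDifference(host, client);
const hostType = TYPE_NAMES[host[offset]];
const clientType = TYPE_NAMES[client[offset]];
```

With real dumps, the two inputs could be read with Node's fs.readFileSync, which returns a Buffer that can be indexed in the same way.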

Since the textual simstates were identical, the objects have the same content. Since the binary simstates differ, the objects must nevertheless differ in their serialized representation.

The next step was therefore to look at how the Auras component serializes. Variables assigned in Init are serialized, and this.auras is an object provided by cmpDataTemplateManager.GetAuraTemplate. The latter returns a new object from Engine.ReadJSONFile, which has the same content but a different serialized representation!

The fix was to make the client use the this.allTechs object from the host, which is accomplished by letting the host serialize it. With the code removed by that commit, the DataTemplateManager component uses the default serialization mechanism and serializes everything in Init.
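
Why serializing the cache helps can be sketched with a toy model. Everything here is hypothetical: TemplateCache, readTemplateFromDisk and copyGraph are stand-ins invented for illustration, and copyGraph merely imitates a serialize/deserialize round trip that preserves shared references.

```javascript
// Hypothetical stand-in for reading a template file: returns a fresh object on every call.
function readTemplateFromDisk(name) {
  return { name, type: "range", radius: 40 };
}

// Hypothetical cache, loosely modelled on the description of DataTemplateManager.
class TemplateCache {
  constructor(entries = {}) {
    this.entries = entries;
  }
  get(name) {
    if (!(name in this.entries))
      this.entries[name] = readTemplateFromDisk(name);
    return this.entries[name];
  }
}

// Deep copy that preserves shared references; stands in for serialize + deserialize.
function copyGraph(value, seen = new Map()) {
  if (typeof value !== "object" || value === null)
    return value;
  if (seen.has(value))
    return seen.get(value);
  const copy = Array.isArray(value) ? [] : {};
  seen.set(value, copy);
  for (const key of Object.keys(value))
    copy[key] = copyGraph(value[key], seen);
  return copy;
}

// Host: one entity already uses the cached template.
const hostCache = new TemplateCache();
const hostEntity = { aura: hostCache.get("temple_heal") };
const hostShares = hostCache.get("temple_heal") === hostEntity.aura;

// Unpatched: only the entity is transferred, the client's cache starts empty,
// so the next lookup reads from disk and yields a different object.
const unpatched = copyGraph({ entity: hostEntity });
const unpatchedCache = new TemplateCache();
const unpatchedShares = unpatchedCache.get("temple_heal") === unpatched.entity.aura;

// Patched: the cache entries travel with the state, so the next lookup
// returns the same object the entity already references.
const patched = copyGraph({ entries: hostCache.entries, entity: hostEntity });
const patchedCache = new TemplateCache(patched.entries);
const patchedShares = patchedCache.get("temple_heal") === patched.entity.aura;
```

In this model hostShares and patchedShares are true while unpatchedShares is false, so only the patched client ends up with the same reference structure as the host.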


Using #4242 it is possible to simulate the rejoin determined by the prior experiment and prove that the unpatched code fails the serializationtest, while the patched code passes.

Thanks for fixing this hidden release blocker OOS Itms!

comment:4 by elexis, 8 years ago

Resolution: fixed
Status: closed → reopened

The commit (r18752) causes (or otherwise triggers) multiplayer replays to become OOS on the first non-quick hash comparison. The revision before that doesn't show this behavior. Singleplayer replays are not hashed, so they don't trigger this bug.

comment:5 by elexis, 8 years ago

I started a multiplayer game with -ooslog, then replayed it visually with -ooslog. The simstates differ on turn 1 (which is the first turn to be executed).

1c1
< State hash: 8e0949cf8cb2fcdf22cb68301cd057ef
---
> State hash: 6f879f720f797e88ea3b08fb6e7daf3f
91432,91519d91431
<     },
<     "wall_garrisoned": {
<       "type": "garrisonedUnits",
<       "affects": [
<         "Soldier"
<       ],
<       "modifications": [
<         {
<           "value": "Armour/Pierce",
<           "add": 3
<         },
<         {
<           "value": "Armour/Hack",
<           "add": 3
<         },
<         {
<           "value": "Armour/Crush",
<           "add": 3
<         },
<         {
<           "value": "Vision/Range",
<           "add": 20
<         }
<       ],
<       "auraName": "Wall Protection",
<       "auraDescription": "Units on walls have 3 extra Armor levels and higher vision."
<     },
<     "temple_heal": {
<       "type": "range",
<       "radius": 40,
<       "affects": [
<         "Human"
<       ],
<       "modifications": [
<         {
<           "value": "Health/RegenRate",
<           "add": 1
<         }
<       ],
<       "auraName": "Healing Aura",
<       "auraDescription": "Heals nearby units at 1 HP per second.",
<       "overlayIcon": "art/textures/ui/session/auras/heal.png"
<     },
<     "wonder_pop_1": {
<       "type": "global",
<       "affects": [
<         "Player"
<       ],
<       "modifications": [
<         {
<           "value": "Player/MaxPopulation",
<           "add": 10
<         }
<       ],
<       "auraName": "Wonder Aura",
<       "auraDescription": "Increase the population limit by 10 per Wonder owned.",
<       "stackable": true
<     },
<     "wonder_pop_2": {
<       "type": "global",
<       "affects": [
<         "Player"
<       ],
<       "modifications": [
<         {
<           "value": "Player/MaxPopulation",
<           "add": 40
<         }
<       ],
<       "auraName": "Glorious Expansion Aura",
<       "auraDescription": "Further increase the population limit by 40 per Wonder owned (requires \"Glorious Expansion\" tech).",
<       "requiredTechnology": "pop_wonder",
<       "stackable": true
<     },
<     "maur_pillar": {
<       "type": "range",
<       "radius": 75,
<       "affects": [
<         "Trader"
<       ],
<       "modifications": [
<         {
<           "value": "UnitMotion/WalkSpeed",
<           "multiply": 1.2
<         }
<       ],
<       "auraDescription": "All traders in range +20% walk speed.",
<       "overlayIcon": "art/textures/ui/session/auras/build_bonus.png"

Furthermore, nearly every multiplayer game that was started went OOS instantly between the host and all clients, regardless of platform. I compared three textual OOS dumps, and each time there were techs missing and sometimes reordered, as in the diff above.

comment:6 by elexis, 8 years ago

Description: modified (diff)