Opened 9 years ago

Closed 7 years ago

#3133 closed defect (needsinfo)

[OOS on rejoin] different order in changedTemplateInfo

Reported by: elexis Owned by:
Priority: Should Have Milestone:
Component: UI & Simulation Keywords:
Cc: Patch:

Description (last modified by elexis)

The usual setup. Someone disconnects, rejoins and the oos error appears.

1. The first difference in the two oos dumps can be seen in the changedTemplateInfo section:

    "320": [
      {
        "variable": "Health/Max",
        "value": 144
      }
    ],

This one appears in both oos_dumps, but in different positions!?

2. (local entities) A bunch of units are missing in the log of the rejoined player (hapoo).

3. (local entities) As a consequence of 2., the entity ids are different in the entities after that.

Attachments (2)

oos_dump_commands_txt.7z (414.4 KB ) - added by elexis 9 years ago.
oos_dump_r16863.7z (261.6 KB ) - added by elexis 9 years ago.
Here another oos_dump for the svn version.

Download all attachments as: .zip

Change History (18)

by elexis, 9 years ago

Attachment: oos_dump_commands_txt.7z added

comment:1 by elexis, 9 years ago

Description: modified (diff)

comment:2 by elexis, 9 years ago

Extracted from commands.txt all techs researched:

armor_infantry_01
armor_infantry_02
attack_infantry_melee_01
attack_infantry_ranged_01
attack_infantry_ranged_02
attack_tower_range
attack_tower_watch
gather_animals_stockbreeding
gather_capacity_basket
gather_capacity_wheelbarrow
gather_farming_plows
gather_farming_training
gather_lumbering_ironaxes
gather_lumbering_sharpaxes
gather_lumbering_strongeraxes
gather_mining_serfs
gather_mining_servants
gather_mining_shaftmining
gather_mining_wedgemallet
gather_wicker_baskets
phase_city_athen
phase_city_generic
phase_town_athen
phase_town_generic
pop_house_01
speed_cavalry_01
training_conscription
unlock_champion_units
vision_outpost

comment:3 by elexis, 9 years ago

binaries/data/mods/public/simulation/data/technologies/phase_city.town and phase_city.json reveals a health multiplier of 1.2. Both multiplied give 1.44.

	"modifications": [{"value": "TerritoryInfluence/Radius", "multiply": 1.50},
		{"value": "Health/Max", "multiply": 1.2, "affects": "CitizenSoldier"}],

comment:4 by elexis, 9 years ago

As presented in #3181#comment:5 the different IDs don't cause oos (3). I assume that the different place for the health property (1) is also irrelevant wrt to the oos.

Then only (2) should cause the oos bug. Investigating those missing entities shows that the first entities are waypoint flags, i.e. irrelevant. This might indicate that the other entities are also not relevant to the actual simulation state.

If you execute

diff -U 20 oos_dump_elexis.txt oos_dump_hapoo.txt | grep anim | grep name | sort | uniq -c

You will find out that they are all dead:

     29      anim name: "death"
    280 -    anim name: "death"
      6 -    anim name: "idle"

As shown by

diff -U 20 oos_dump_elexis.txt oos_dump_hapoo.txt | grep anim -A 10 -B 10 | grep idle -A 10 -B 10

the other 6 idle entities are 2x stone and 4x waypoint flags:

(local entities didn't cause the oos)

Last edited 9 years ago by elexis (previous) (diff)

comment:5 by elexis, 9 years ago

Replying to [ #3181/comment:7 historic_bruno ]:

Just FYI, those entities are "local" and not part of the simulation hash / OOS check. Building previews, corpses, rally point entities, etc. that are local to each player are local entities. They are included in the OOS dump in case they influence the synchronized state, which shouldn't happen. Typically you can ignore differences in local entities.

In this case the missing entities are (2) waypoint flags, corpses and stone. If none of them did cause the oos, then it must be (1) the health property which is in both logs, but just in a different place (in the same list).

So it basically looks like none of the differences should cause an oos.

Maybe it was the stone (mirage?), which doesn't exist in the game of the rejoined player...

Last edited 9 years ago by elexis (previous) (diff)

comment:6 by elexis, 9 years ago

The rejoined client was a spectator, not a player, maybe thats important... The second log has no spectator, so thats not important.

Last edited 9 years ago by elexis (previous) (diff)

comment:7 by elexis, 9 years ago

It must have something to do with the AIInterface.prototype.OnGlobalValueModification function. In alpha 18, it was called even when there was no AI player.

It also seems to be the only function that changes this.changedEntityTemplateInfo, which is the array in the oos-log that contains an element in different positions for the two clients.

Notice that this element is added only after this (item != newValue) check:

if(item != newValue)
{
	if (!this.changedEntityTemplateInfo[ent])
		this.changedEntityTemplateInfo[ent] = [{ "variable" : value, "value" : newValue }];
	else
		this.changedEntityTemplateInfo[ent].push({ "variable" : value, "value" : newValue });
}

In the most recent version of AIInterface.js there is also a TODO:

/ TODO in some cases, we can have two opposite changes which bring us to the old value,
// and we should keep it. But how to distinguish it ?

About entity 320: It doesn't exist anymore at the point where the oosdump was produced. But if you replay the commands.txt file visually (see #9), you can see that it was a ptolemian cavalry archer.

The two phases (which increase the health points up to 144) were researched minute 11:59 and 20:37. I can't find the exact time when it died, (since my visual replay is oos too starting turn 55), but minute 14:44 the last command that was issued to that unit:

turn 1438 500
cmd 3 {"type":"research","entity":311,"template":"phase_town_generic"}

turn 1770 500
cmd 3 {"type":"walk","entities":[320,3776,3780,3966,3967,3969,4088,4089,4090,4123,4238],"x":415.5475769042969,"z":70.91799926757812,"queued":false}

turn 2474 500
cmd 3 {"type":"research","entity":311,"template":"phase_city_generic"}

by elexis, 9 years ago

Attachment: oos_dump_r16863.7z added

Here another oos_dump for the svn version.

comment:8 by elexis, 9 years ago

Description: modified (diff)

comment:9 by elexis, 9 years ago

Summary: [OOS on rejoin] maxHealth and missing units[OOS on rejoin] different order in changedTemplateInfo

In the second oos_dump, the only difference is again in the changedEntityTemplateInfo section:

>     "288": [
>       {
>         "variable": "Builder/Rate",
>         "value": 1.1
>       },
>       {
>         "variable": "ResourceGatherer/BaseSpeed",
>         "value": 1.1
>       }
>     ],
>     "289": [
>       {
>         "variable": "Builder/Rate",
>         "value": 1.1
>       },
>       {
>         "variable": "ResourceGatherer/BaseSpeed",
>         "value": 1.1
>       }
>     ],
>     "290": [
>       {
>         "variable": "Builder/Rate",
>         "value": 1.1
>       },
>       {
>         "variable": "ResourceGatherer/BaseSpeed",
>         "value": 1.1
>       }
>     ],
>     "291": [
>       {
>         "variable": "Builder/Rate",
>         "value": 1.1
>       },
>       {
>         "variable": "ResourceGatherer/BaseSpeed",
>         "value": 1.1
>       }
>     ],
>     "292": [
>       {
>         "variable": "ResourceGatherer/BaseSpeed",
>         "value": 1.1
>       },
>       {
>         "variable": "ResourceGatherer/BaseSpeed",
>         "value": 1.1
>       }
>     ],
172490,172499d172539
<     "292": [
<       {
<         "variable": "ResourceGatherer/BaseSpeed",
<         "value": 1.1
<       },
<       {
<         "variable": "ResourceGatherer/BaseSpeed",
<         "value": 1.1
<       }
<     ],
172514,172523d172553
<     "288": [
<       {
<         "variable": "Builder/Rate",
<         "value": 1.1
<       },
<       {
<         "variable": "ResourceGatherer/BaseSpeed",
<         "value": 1.1
<       }
<     ],
172586,172615d172615
<     "289": [
<       {
<         "variable": "Builder/Rate",
<         "value": 1.1
<       },
<       {
<         "variable": "ResourceGatherer/BaseSpeed",
<         "value": 1.1
<       }
<     ],
<     "290": [
<       {
<         "variable": "Builder/Rate",
<         "value": 1.1
<       },
<       {
<         "variable": "ResourceGatherer/BaseSpeed",
<         "value": 1.1
<       }
<     ],
<     "291": [
<       {
<         "variable": "Builder/Rate",
<         "value": 1.1
<       },
<       {
<         "variable": "ResourceGatherer/BaseSpeed",
<         "value": 1.1
<       }
<     ],

comment:10 by elexis, 9 years ago

In the log above, the entities 288-292 are the initial 4 infantry units of the green player. Every now and then they were near women, so they got the 10% female inspiration aura bonus (value 1.1).

The MT_ValueModification message is sent by the AuraManager each time the bonus is added or removed.

For the first log, the MT_ValueModification was sent by the TechnologyManager, as the value was changed by the city phase.

The order of these messages must be different for the rejoined player, so AIInterface.prototype.OnGlobalValueModification is probably not to blame.

comment:11 by mimo, 9 years ago

In 16865:

do not fill changedTemplateInfo when no AIs to prevent OOS, refs #3133

comment:12 by mimo, 9 years ago

Component: Core engineUI & Simulation
Milestone: Alpha 19Alpha 20
Priority: Release BlockerShould Have

That's not a release blocker as should now (after r16865) only happen with AIs and rejoin with AIs is not (yet) supported.

It would nonetheless be worth to understand why the order of the messages received by AIInterface can be different. I push it to A20.

comment:13 by Itms, 9 years ago

#3375 might be the cause.

comment:14 by elexis, 9 years ago

Duplicate of #2863?

comment:15 by Itms, 8 years ago

Milestone: Alpha 20Backlog

comment:16 by elexis, 7 years ago

Description: modified (diff)
Milestone: Backlog
Resolution: needsinfo
Status: newclosed

Thanks Hapoofesgeli for uploading the OOS dump. We'll likely never figure out what happened, but can assume that it's fixed or an extremely rare edge case.

Note: See TracTickets for help on using tickets.