Opened 7 years ago

Closed 5 years ago

Last modified 5 years ago

#4616 closed defect (fixed)

Assertion failed: m_DestructionQueue.empty()

Reported by: elexis
Owned by: wraitii
Priority: Must Have
Milestone: Alpha 24
Component: Network
Keywords:
Cc:
Patch: Phab:D1738

Description

When hosting an svn game at r19751 that was paused, the following assertion failure occurred each time a client requested a serialized simstate (3 times):

ComponentManagerSerialization.cpp(186): Assertion failed: "m_DestructionQueue.empty()"
Assertion failed: "m_DestructionQueue.empty()"
Location: ComponentManagerSerialization.cpp:186 (SerializeState)

Call stack:

(0x92443e) ./pyrogenesis() [0x92443e]
(0x8d6479) ./pyrogenesis() [0x8d6479]
(0x8d7802) ./pyrogenesis() [0x8d7802]
(0x4a2ab8) ./pyrogenesis() [0x4a2ab8]
(0x437413) ./pyrogenesis() [0x437413]
(0x45c293) ./pyrogenesis() [0x45c293]
(0x43cde9) ./pyrogenesis() [0x43cde9]
(0x4317df) ./pyrogenesis() [0x4317df]
(0x422777) ./pyrogenesis() [0x422777]
(0x7fdc1c780830) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fdc1c780830]
(0x42f6b9) ./pyrogenesis() [0x42f6b9]

errno = 0 (Try again later)
OS error = ?

It was reported multiple times already, by mapkoc privately somewhere on 2016-11-20, by me on #0ad-dev on 2017-03-02 and by bb on 2017-04-17.

Since the game was paused today while the assertion failure occurred on each rejoin attempt, the turn number could be determined (by copying the replay at exactly that time, see attachment). However, a rejointest (with the equivalent scenario map) doesn't reproduce the issue.

So I think it might be a local entity in the destruction queue at that time.

(Couldn't decode the stack anymore, but that should be irrelevant anyway)

Attachments (2)

replay_scenario_map.7z (196.4 KB ) - added by elexis 7 years ago.
mainlog.html.7z (1.1 MB ) - added by elexis 6 years ago.
My log, copied as quickly as possible after the error occurred while hosting.

Change History (11)

by elexis, 7 years ago

Attachment: replay_scenario_map.7z added

comment:1 by elexis, 6 years ago

(Still in alpha 23)

comment:2 by elexis, 6 years ago

Priority: Should Have → Must Have

by elexis, 6 years ago

Attachment: mainlog.html.7z added

My log, copied as quickly as possible after the error occurred while hosting.

comment:3 by elexis, 5 years ago

Component: Core engine → Network

I didn't know INVALID_ENTITY (Phab:D1736) could be queued here.

But the bug sounds more like the host paused before the current turn was finished and before the destruction queue was cleaned, and then a rejoin started during that pause.

comment:4 by wraitii, 5 years ago

Based on the way the code is written:

Pausing is handled in the core CGame::Update, so pausing on any given turn finishes the sim update computation and cleans up the component queue, then calls the GUI update. After that we loop outside the sim, so a priori nothing can get deleted then.

I think the only way this can trigger is if the host clicks on a GUI item that creates a local entity on the same turn that someone pauses (or something like that). Basically, this must be because the GUI update can queue entities for destruction, but nothing actually clears the queue afterwards.

So flushing after the g_GUI->SendEventToAll("SimulationUpdate"); call ought to fix this.

(in-writing edit): This can be reproduced very easily by selecting a unit that can build something, pausing the game, then clicking around to try to place building previews. The preview remains even after you've tried placing other buildings, because the entities are destroyed but not flushed. If you then try to save the game, it crashes.
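
For illustration only, here is a minimal C++ sketch of the "destroyed but not flushed" state described above. It is not the engine's code: ToyComponentManager, AddEntity, Exists and CanSerialize are hypothetical names made up for this sketch, and only the idea of a deferred destruction queue is taken from the ticket.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for the simulation's component manager; not engine code.
using EntityId = std::uint32_t;

class ToyComponentManager {
public:
    void AddEntity(EntityId id) { m_Entities.insert(id); }

    // Destroying only queues the entity; it stays alive until the next flush.
    void DestroyEntity(EntityId id) { m_DestructionQueue.push_back(id); }

    // Actually removes every entity that was queued for destruction.
    void FlushDestroyedEntities() {
        for (EntityId id : m_DestructionQueue)
            m_Entities.erase(id);
        m_DestructionQueue.clear();
    }

    bool Exists(EntityId id) const { return m_Entities.count(id) != 0; }

    // Mirrors the precondition of the failing assertion:
    // serialization expects the destruction queue to be empty.
    bool CanSerialize() const { return m_DestructionQueue.empty(); }

private:
    std::set<EntityId> m_Entities;
    std::vector<EntityId> m_DestructionQueue;
};
```

In this toy model, calling DestroyEntity on a preview entity while no flush runs leaves Exists returning true (the preview stays visible) and CanSerialize returning false (saving or serving a rejoin would hit the assertion). Both conditions clear once FlushDestroyedEntities is called.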

comment:5 by wraitii, 5 years ago

https://code.wildfiregames.com/D1738 is one way to fix this but the best solution should be discussed.

comment:6 by wraitii, 5 years ago

Milestone: Backlog → Alpha 24

comment:7 by Stan, 5 years ago

Patch: Phab:D1738

comment:8 by wraitii, 5 years ago

Owner: set to wraitii
Resolution: fixed
Status: new → closed

In 22865:

Check only that the destruction queue contains no non-local entity when serializing the game state.

Local entities being in the destruction queue when serialising is not an issue since those should not affect the simulation anyways. This stops the game from crashing in some rare situations.

Fixes #4616

Differential Revision: https://code.wildfiregames.com/D1738
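
As a rough sketch of what the relaxed check amounts to, assuming (purely for illustration) that local entities are distinguished by having an id at or above some threshold: kFirstLocalEntity, its value, IsLocalEntity and both predicate names are hypothetical and not taken from the commit itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using EntityId = std::uint32_t;

// Assumption for illustration: ids at or above this threshold are "local" entities.
constexpr EntityId kFirstLocalEntity = 0x40000000;

constexpr bool IsLocalEntity(EntityId id) { return id >= kFirstLocalEntity; }

// Old precondition: nothing at all may be queued for destruction.
bool QueueIsEmpty(const std::vector<EntityId>& queue) {
    return queue.empty();
}

// Relaxed precondition: only non-local entities in the queue block serialization.
bool QueueHasOnlyLocalEntities(const std::vector<EntityId>& queue) {
    return std::all_of(queue.begin(), queue.end(), IsLocalEntity);
}
```

With a queue holding a single local id, the old predicate fails while the relaxed one passes. Adding any non-local id makes the relaxed predicate fail as well, so the check still catches the case that could actually affect the simulation.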

comment:9 by elexis, 5 years ago

See #5583 for the building preview issue.