Opened 4 years ago

Closed 4 years ago

Last modified 11 days ago

#3386 closed defect (fixed)

[PATCH] Lobby - freeze and duplicate users when returning from a match

Reported by: elexis Owned by: JoshuaJB
Priority: Must Have Milestone: Alpha 19
Component: Multiplayer lobby Keywords: patch
Cc: josh Patch:

Description

Bug Symptoms

(1) 0 A.D. freezes for a couple of seconds when returning to the lobby after playing a long game. (2) The playerlist sometimes contains online users twice and sometimes offline users once (while it should only display online users and no dupes).

Why the bug occurs

(1) Accumulation of thousands of presence updates The XmppClient caches all presence updates in its m_GuiMessageQueue variable. Usually the lobby code fetches all updates onTick, clearing that array frequently. But while playing a game, the lobby JS code is inactive. Therefore m_GuiMessageQueue accumulates over the course of a game, resulting in the application of thousands of those updates when returning to the lobby. Thus causing the freeze mentioned in (1).

(2) Applying those updates to a different playerlist The second symptom is caused by the accumulation of these updates and applying them after fetching a clean playerlist. The presence updates in m_GuiMessageQueue are only valid for the playerlist of the time when they were generated. But since we replace the playerlist with the most current one when returning from the lobby, these updates don't apply anymore!

This for example hits all these cases:

if (nickIndex == -1) // Left, but not present (TODO: warn about this?)
if (nickIndex == -1) // Changed nick, but not present (shouldn't ever happen)
if (nickIndex == -1) // Changed presence, but not online (shouldn't ever happen)

This is essentially the bug discovered in #3171 (ticket description and comments 1 to 9). All the patches there attempt to fix this problem by clearing the messages. But the patches after version 2 didn't correctly handle the g_joined bug described in #3287 and fixed in r16961.

Attachments (5)

t3386_clear_muc_messages_v1.patch (10.1 KB) - added by elexis 4 years ago.
Needs testing, but should be equivalent to the muc-message-clearing patches in #3171 which were tested extensively in real situations with a18.
t3386_clear_muc_messages_v1.1.patch (10.1 KB) - added by elexis 4 years ago.
Fixed include order.
t3386_clear_muc_messages_v2.patch (12.4 KB) - added by elexis 4 years ago.
Adds error messages for invalid updates. As mentioned in the comment above, I'll take a look at simplifying the relevant code.
t3386_fix_lobby_rejoin_v3.patch (13.1 KB) - added by elexis 4 years ago.
Implements Joshuas proposal and always calls updatePlayerList instead of doing this useless / duplicate / broken code in session.js. In contrast to prior patches, only presence updates will be cleared when returning from the lobby. This way you will still get notifications in the chat area who has joined and left while you were playing. The new function LobbyGetMucMessageCount was implemented, so that we call updatePlayerList only when receiving the last update in the queue. Notice that we don't need the error messages anymore, since we know the C++ queue always has it right and since we removed the duplicate playerlist logic in session.js.
t3386_fix_lobby_rejoin_v3.1.patch (13.0 KB) - added by elexis 4 years ago.
Removed wrong comment.

Download all attachments as: .zip

Change History (18)

Changed 4 years ago by elexis

Needs testing, but should be equivalent to the muc-message-clearing patches in #3171 which were tested extensively in real situations with a18.

comment:1 Changed 4 years ago by elexis

Keywords: review added

comment:2 Changed 4 years ago by Itms

I didn't test it either, just looked at the code. It is nice except for your change in the file includes: IXmppClient.h is the associated header so it should stay first, then an empty line, then the STL includes (deque), then an empty line, then our includes in alphabetical order. So just switch deque and IXmppClient and add a new line after them.

Changed 4 years ago by elexis

Fixed include order.

comment:3 Changed 4 years ago by elexis

Reproduce with two lobby accounts:

(a) "Joined but already present" account 1: join lobby account 1: start a game account 2: join lobby account 1: stop the game

(b) "Left but not present" account 1: join lobby account 2: join lobby account 1: start a game account 2: leave lobby account 1: stop the game

Since those are errors indeed and cause the dupes, I have added error() messages in case that still happens in the following patch.


Actually when I look at the code, I think Joshua is right and we should just replace the playerlist with a clean one everytime something changes in it. Having updates for individual users doesn't give us any performance advantage as we can copy the whole list from C++ to JS without any real performance cost. When we update the playerlist GUI we also have to push the complete list everytime. Since we want to see "player X has joined" messages in the lobby, we still need these events and thus also have the error messages in case.

Changed 4 years ago by elexis

Adds error messages for invalid updates. As mentioned in the comment above, I'll take a look at simplifying the relevant code.

Changed 4 years ago by elexis

Implements Joshuas proposal and always calls updatePlayerList instead of doing this useless / duplicate / broken code in session.js. In contrast to prior patches, only presence updates will be cleared when returning from the lobby. This way you will still get notifications in the chat area who has joined and left while you were playing. The new function LobbyGetMucMessageCount was implemented, so that we call updatePlayerList only when receiving the last update in the queue. Notice that we don't need the error messages anymore, since we know the C++ queue always has it right and since we removed the duplicate playerlist logic in session.js.

Changed 4 years ago by elexis

Removed wrong comment.

comment:4 Changed 4 years ago by elexis

I tested version 2 and that does not work, because the messages are cleared each time we update the list, which is too often. If we have 2 updates in the queue, process one, then the other one is removed. Version 3 has addressed this problem by clearing the messages only when actually returning to the lobby (i.e. on init()).


Notice that the removal of the redundant JS logic also addresses the issue that the playerlist wasn't sorted after joins/leaves.

comment:5 Changed 4 years ago by elexis

Tested v3 in a18.

The playerlist renders instantaneously, the entries are always sorted, no dupes and the code looks very clean.

comment:6 Changed 4 years ago by Josh

Cc: josh added

comment:7 Changed 4 years ago by Josh

Keywords: reviewed added; review removed

The patch looks great! I'm glad to see better performance + less code. I'd recommend getting it committed.

comment:8 Changed 4 years ago by JoshuaJB

Owner: set to JoshuaJB
Resolution: fixed
Status: newclosed

In 16997:

Fix #3386, patch by elexis.

comment:9 Changed 3 years ago by sanderd17

Keywords: patch reviewedpatch, reviewed

comment:10 Changed 3 years ago by elexis

Keywords: reviewed removed

comment:11 Changed 2 years ago by elexis

In 20040:

Remove unneeded GetMucMessageCount? from the XmppClient?.

The affected function was imprecise, because it called a playerlist update whenever a "chat" level message was received instead
of only updating the playerlist if it's actually influencing the displayed playerlist.
When there is a chat message, lobby subject change or user-role change, there is no need to update the list.

Differential Revision: https://code.wildfiregames.com/D671
Refs #3386, rP16997
Reviewed By: fpre / ffffffff

comment:12 Changed 12 days ago by elexis

In 22855:

Drop lobby presence GUI messages altogether for better performance and less code complexity.

They were only used to determine whether the playerlist should be rebuilt, which can be achieved by a boolean member.
So this patch is removing unnecessary indirection from the original solution in rP16997 and the reduction in rP20040, refs #3386.
In particular the LobbyClearPresenceUpdates? call was ugly and one doesnt need hundreds of messages with data on it each if one only wants to know if there was one.
Makes past (rP20070) and future (D2264) patches less complex, refs #4877.

Differential Revision: https://code.wildfiregames.com/D2265
Tested on: gcc 9.1.0, clang 8.0.1, Jenkins

comment:13 Changed 11 days ago by elexis

In 22859:

Fix wrong "player is not a moderator anymore" lobby chat message following initial implementation in rP19514/D339, fixes #4877.

It needs to test against the historic role and not the current role, similar to rP16997, refs #3386.
Implements the third GUI message property using the arbitrary lobby GUI message patch in rP22856/D2264, refs #4482, rP22855/D2265.

Differential Revision: https://code.wildfiregames.com/D2266
Tested on: clang 8.0.1

Note: See TracTickets for help on using tickets.