Opened 4 years ago

Closed 4 years ago

#5660 closed defect (fixed)

Unknown player has rejoined - part 2 - undefined property g_PlayerAssignments[msg.guid]

Reported by: elexis Owned by: elexis
Priority: Should Have Milestone: Alpha 24
Component: Network Keywords:
Cc: Patch: Phab:D2559

Description

In an svn match with r23332 earlier today, Freagarach encountered the following error message after a successful rejoin:

WARNING: JavaScript warning: gui/session/chat/ChatMessageFormatNetwork.js line 62
reference to undefined property g_PlayerAssignments[msg.guid]

ERROR: JavaScript error: gui/session/chat/ChatMessageFormatNetwork.js line 62
TypeError: g_PlayerAssignments[msg.guid] is undefined
  parse@gui/session/chat/ChatMessageFormatNetwork.js:62:1
  parseMessage@gui/session/chat/ChatMessageHandler.js:71:14
  handleMessage@gui/session/chat/ChatMessageHandler.js:53:19
  addChatMessage@gui/session/messages.js:465:2
  g_NetMessageTypes.rejoined@gui/session/messages.js:64:1
  handleNetMessages@gui/session/messages.js:387:4
  onTick@gui/session/session.js:586:2
  __eventhandler35 (tick)@session tick:0:1

I could not reproduce this error by rejoining with 2 clients simultaneously.

It sounds a lot like #4036, but that was fixed by changing the NetServer and we haven't touched that after the last release.

The mainlog.html file logs the entire network traffic and states, so it can show the circumstances revealing how to reproduce the error.

Here some excerpts:

<p>Serializing game at turn 623 for rejoining player</p>
<p>Net message: ({guid:"19E4E25AA0042CD9", type:"rejoined"})</p>	Zwuckel
<p>Net client: Received message CPlayerAssignmentMessage { m_Hosts: {  { m_GUID: 06B22C8BC5285665, m_Name: user1, m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 19E4E25AA0042CD9, m_Name: Zwuckel (1352), m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 3903DED5BBC83598, m_Name: Angen, m_PlayerID: 5, m_Status: 2 },  { m_GUID: 563FED5596CC00D9, m_Name: nani (1671), m_PlayerID: 2, m_Status: 2 },  { m_GUID: 86EBCE577765DA1F, m_Name: vladislavbelov, m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 8EB4B2CCE06D9B24, m_Name: go2die (1505), m_PlayerID: 6, m_Status: 2 },  { m_GUID: 9F3043A29B6888C2, m_Name: Issh (1377), m_PlayerID: 7, m_Status: 2 },  { m_GUID: A54A39E3E6203EBD, m_Name: Freagarach, m_PlayerID: 3, m_Status: 0 },  { m_GUID: C4B173A791D0E8A2, m_Name: elexis3 (1223), m_PlayerID: 1, m_Status: 2 },  { m_GUID: F010027B7849988F, m_Name: bb (1506), m_PlayerID: 4, m_Status: 1 },  { m_GUID: FA7F35C5F69CE978, m_Name: ValihrAnt (2153), m_PlayerID: 4294967295, m_Status: 0 } } } of size 511 from server</p>
<p>Serializing game at turn 1010 for rejoining player</p>
<p>Net client: Received message CPlayerAssignmentMessage { m_Hosts: {  { m_GUID: 06B22C8BC5285665, m_Name: user1, m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 19E4E25AA0042CD9, m_Name: Zwuckel (1352), m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 3903DED5BBC83598, m_Name: Angen, m_PlayerID: 5, m_Status: 2 },  { m_GUID: 563FED5596CC00D9, m_Name: nani (1671), m_PlayerID: 2, m_Status: 2 },  { m_GUID: 5D5733F1032B9B7E, m_Name: Emperior (1465), m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 86EBCE577765DA1F, m_Name: vladislavbelov, m_PlayerID: 4294967295, m_Status: 0 },  { m_GUID: 8EB4B2CCE06D9B24, m_Name: go2die (1505), m_PlayerID: 6, m_Status: 2 },  { m_GUID: 9F3043A29B6888C2, m_Name: Issh (1377), m_PlayerID: 7, m_Status: 2 },  { m_GUID: A54A39E3E6203EBD, m_Name: Freagarach, m_PlayerID: 3, m_Status: 0 },  { m_GUID: C4B173A791D0E8A2, m_Name: elexis3 (1223), m_PlayerID: 1, m_Status: 2 },  { m_GUID: F010027B7849988F, m_Name: bb (1506), m_PlayerID: 4, m_Status: 1 },  { m_GUID: FA7F35C5F69CE978, m_Name: ValihrAnt (2153), m_PlayerID: 4294967295, m_Status: 0 } } } of size 565 from server</p>
<p>Serializing game at turn 1010 for rejoining player</p>
<p>Net message: ({guid:"A54A39E3E6203EBD", type:"rejoined"})</p>	Freagarach
<p>Net message: ({guid:"5D5733F1032B9B7E", type:"rejoined"})</p>	Emperior
<p>Serializing game at turn 1219 for rejoining player</p>
<p>Net message: ({guid:"6B5C89BB5508E2F5", type:"rejoined"})</p>	Zwuckel
<p>Serializing game at turn 2414 for rejoining player</p>
<p>Net message: ({guid:"B3E1927B9FE1FBF3", type:"rejoined"})</p>	nani
<p>Serializing game at turn 2679 for rejoining player</p>
<p>Net message: ({guid:"B1963EB370356C7D", type:"rejoined"})</p>	vladislavbelov
<p>Serializing game at turn 2800 for rejoining player</p>
<p>Net message: ({guid:"411522958EA6885F", type:"rejoined"})</p>	vladislavbelov

We observe that Freagarach and Emperior started a rejoin on the same turn, most likely while the game has been paused.

This makes it seem plausible that Freagarach didnt have Emperior in the playerassignments when finishing the rejoin, or perhaps Emperior left before, some combination of states and events that isn't supported for some reason.

(If we can't reproduce it, it might or might not be a regression, it could also be a case in the NetServer we haven't discovered yet.)

mainlog.html presumably contains private data and can be sent privately on demand.

Change History (5)

comment:1 by elexis, 4 years ago

From Freagarachs mainlog.html:

Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, player:-1, name:"user1"}, '19E4E25AA0042CD9':{status:0, player:-1, name:"Zwuckel (1352)"}, '3903DED5BBC83598':{status:2, player:5, name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani (1671)"}, '86EBCE577765DA1F':{status:0, player:-1, name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 (1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}, FA7F35C5F69CE978:{status:0, player:-1, name:"ValihrAnt (2153)"}}, type:"players"})

Net client: Received message CJoinSyncStartMessage { } of size 3 from server

Net: Sending message CFileTransferRequestMessage { m_RequestID: 1 } of size 7 to server
....

Net: Sending message CFileTransferAckMessage { m_RequestID: 1, m_NumPackets: 1 } of size 11 to server

Download completed

Net client: Received message CPlayerAssignmentMessage { m_Hosts: { { m_GUID: 06B22C8BC5285665, m_Name: user1, m_PlayerID: 4294967295, m_Status: 0 }, { m_GUID: 19E4E25AA0042CD9, m_Name: Zwuckel (1352), m_PlayerID: 4294967295, m_Status: 0 }, { m_GUID: 3903DED5BBC83598, m_Name: Angen, m_PlayerID: 5, m_Status: 2 }, { m_GUID: 563FED5596CC00D9, m_Name: nani (1671), m_PlayerID: 2, m_Status: 2 }, { m_GUID: 5D5733F1032B9B7E, m_Name: Emperior (1465), m_PlayerID: 4294967295, m_Status: 0 }, { m_GUID: 86EBCE577765DA1F, m_Name: vladislavbelov, m_PlayerID: 4294967295, m_Status: 0 }, { m_GUID: 8EB4B2CCE06D9B24, m_Name: go2die (1505), m_PlayerID: 6, m_Status: 2 }, { m_GUID: 9F3043A29B6888C2, m_Name: Issh (1377), m_PlayerID: 7, m_Status: 2 }, { m_GUID: A54A39E3E6203EBD, m_Name: Freagarach, m_PlayerID: 3, m_Status: 0 }, { m_GUID: C4B173A791D0E8A2, m_Name: elexis3 (1223), m_PlayerID: 1, m_Status: 2 }, { m_GUID: F010027B7849988F, m_Name: bb (1506), m_PlayerID: 4, m_Status: 1 }, { m_GUID: FA7F35C5F69CE978, m_Name: ValihrAnt (2153), m_PlayerID: 4294967295, m_Status: 0 } } } of size 565 from server

Net message: ({type:"start"})

Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, player:-1, name:"user1"}, '19E4E25AA0042CD9':{status:0, player:-1, name:"Zwuckel (1352)"}, '3903DED5BBC83598':{status:2, player:5, name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani (1671)"}, '5D5733F1032B9B7E':{status:0, player:-1, name:"Emperior (1465)"}, '86EBCE577765DA1F':{status:0, player:-1, name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 (1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}, FA7F35C5F69CE978:{status:0, player:-1, name:"ValihrAnt (2153)"}}, type:"players"})

...

Net message: ({guid:"5D5733F1032B9B7E", type:"rejoined"})

WARNING: JavaScript warning: gui/session/chat/ChatMessageFormatNetwork.js line 62 reference to undefined property g_PlayerAssignments[msg.guid]

So it was indeed Emperiors GUID that triggered the error. If I'm not mistaken, there were no other player assignments sent between the error and the cited player assignments message.

comment:2 by elexis, 4 years ago

grep "\"start\"\|newAssignments\|WARNING" /tmp/log.html

<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
<p>Net message: ({type:"start"})</p>
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
<p class="warning">WARNING: JavaScript warning: gui/session/chat/ChatMessageFormatNetwork.js line 62
<p class="warning">WARNING: JavaScript warning: gui/session/chat/ChatMessageFormatPlayer.js line 49
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 

grep "newAssignments" /tmp/log.html | wc -l

5

grep "CPlayerAssignmentMessage" /tmp/log.html | wc -l

5

grep "newAssignments" /tmp/log.html -A12

without unrelated lines:

<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
player:-1, name:"user1"}, '19E4E25AA0042CD9':{status:0, player:-1, 
name:"Zwuckel (1352)"}, '3903DED5BBC83598':{status:2, player:5, 
name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani 
(1671)"}, '86EBCE577765DA1F':{status:0, player:-1, 
name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, 
name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, 
name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, 
name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 
(1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}, 
FA7F35C5F69CE978:{status:0, player:-1, name:"ValihrAnt (2153)"}}, 
type:"players"})</p>
--
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
player:-1, name:"user1"}, '19E4E25AA0042CD9':{status:0, player:-1, 
name:"Zwuckel (1352)"}, '3903DED5BBC83598':{status:2, player:5, 
name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani 
(1671)"}, '5D5733F1032B9B7E':{status:0, player:-1, name:"Emperior 
(1465)"}, '86EBCE577765DA1F':{status:0, player:-1, 
name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, 
name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, 
name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, 
name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 
(1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}, 
FA7F35C5F69CE978:{status:0, player:-1, name:"ValihrAnt (2153)"}}, 
type:"players"})</p>
--
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
player:-1, name:"user1"}, '3903DED5BBC83598':{status:2, player:5, 
name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani 
(1671)"}, '5D5733F1032B9B7E':{status:0, player:-1, name:"Emperior 
(1465)"}, '86EBCE577765DA1F':{status:0, player:-1, 
name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, 
name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, 
name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, 
name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 
(1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}, 
FA7F35C5F69CE978:{status:0, player:-1, name:"ValihrAnt (2153)"}}, 
type:"players"})</p>
--
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
player:-1, name:"user1"}, '3903DED5BBC83598':{status:2, player:5, 
name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani 
(1671)"}, '5D5733F1032B9B7E':{status:0, player:-1, name:"Emperior 
(1465)"}, '6B5C89BB5508E2F5':{status:0, player:-1, name:"Zwuckel 
(1352)"}, '86EBCE577765DA1F':{status:0, player:-1, 
name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, 
name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, 
name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, 
name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 
(1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}, 
FA7F35C5F69CE978:{status:0, player:-1, name:"ValihrAnt (2153)"}}, 
type:"players"})</p>
--
<p>Net message: ({newAssignments:{'06B22C8BC5285665':{status:0, 
player:-1, name:"user1"}, '3903DED5BBC83598':{status:2, player:5, 
name:"Angen"}, '563FED5596CC00D9':{status:2, player:2, name:"nani 
(1671)"}, '5D5733F1032B9B7E':{status:0, player:-1, name:"Emperior 
(1465)"}, '6B5C89BB5508E2F5':{status:0, player:-1, name:"Zwuckel 
(1352)"}, '86EBCE577765DA1F':{status:0, player:-1, 
name:"vladislavbelov"}, '8EB4B2CCE06D9B24':{status:2, player:6, 
name:"go2die (1505)"}, '9F3043A29B6888C2':{status:2, player:7, 
name:"Issh (1377)"}, A54A39E3E6203EBD:{status:0, player:3, 
name:"Freagarach"}, C4B173A791D0E8A2:{status:2, player:1, name:"elexis3 
(1223)"}, F010027B7849988F:{status:1, player:4, name:"bb (1506)"}}, 
type:"players"})</p>
<p>Net client: Received message CChatMessage { m_GUID: 563FED5596CC00D9, m_Message: 898 } of size 31 from server</p>

So Emperiors GUID 5D5733F1032B9B7E was presented in all PlayerAssignments except for the first one, indicating that the "rejoined" chat message was produced with the PlayerAssignments of the first PlayerAssignments message, and that was the one that happened before the start message, but the server did send the Emperior-containing message after the start package and before the error occurred, indicating a clientside error instead of a serverside error.

comment:3 by elexis, 4 years ago

Fak I know what happened:

The pollAndHandleNetworkClient in gamesetup_mp.js handled the "start" message, then performed the Engine.SwitchGuiPage to page_loading.xml with the g_PlayerAssignments from the at then first message.

Since there is only a break and not a return after the SwitchPage, and since the loop in pollAndHandleNetworkClient is only left when PollNetworkClient returns a falsy value, and since the PlayerAssignments message was received just before the start message, the second PlayerAssignments message containing the just rejoined Emperior was still processed in the gamesetup_mp pollAndHandleNetworkClient function after the SwitchPage call, rather than in the session page, thus leading to the g_PlayerAssignments variable in the gamesetup_mp file being replaced with the new assignments and then that JS context dying as that was the last GUI message processed in the stack and the pollAndHandleNetworkClient function call ending.

So one obvious fix would be changing the break to a return for the start message processor in gamesetup_mp, and the commit introducing that break would be the one that introduced the error.

That would be r10452.

The error is expected to occur this way:

  1. Client1 starts to rejoin
  2. While Client1 is still downloading the savegame state, Client2 starts to download

...

Wondering however why this never happened before!? It sounds like it would be a race condition, where the second client starts the rejoin just before (nearly simultaneously) the first client finishes the download.

I absolutely failed to reproduce this error by simulating lag and hoping to start a 2nd rejoin when the 1st rejoin is at 99% download status.

The only thing I could show is that the reported error occurs if one changes the code to artifically dont switch the page until a 2nd player starts the rejoin this way:

Index: /home/elexis/code/0ad-svn/trunk/binaries/data/mods/public/gui/gamesetup_mp/gamesetup_mp.js
===================================================================
--- /home/elexis/code/0ad-svn/trunk/binaries/data/mods/public/gui/gamesetup_mp/gamesetup_mp.js	(revision 23360)
+++ /home/elexis/code/0ad-svn/trunk/binaries/data/mods/public/gui/gamesetup_mp/gamesetup_mp.js	(working copy)
@@ -1,3 +1,5 @@
+var g_SwitchPage;
+
 /**
  * Whether we are attempting to join or host a game.
  */
@@ -181,7 +183,11 @@
 				break;
 
 			case "players":
+				if (g_SwitchPage)
+					Engine.SwitchGuiPage("page_loading.xml", g_SwitchPage);
+
 				g_PlayerAssignments = message.newAssignments;
+
 				break;
 
 			case "start":
@@ -192,12 +198,11 @@
 					if (player.player > 0)	// not observer or GAIA
 						g_GameAttributes.settings.PlayerData[player.player - 1].Name = player.name;
 				}
-
-				Engine.SwitchGuiPage("page_loading.xml", {
+				g_SwitchPage = {
 					"attribs": g_GameAttributes,
 					"isRejoining": g_IsRejoining,
 					"playerAssignments": g_PlayerAssignments
-				});
+				};
 				break;
 
 			case "chat":

So while I don't have the entire experimental reproduction, we know that the break must be a return and that that error has the potential to cause exactly that stacktrace.

comment:4 by elexis, 4 years ago

Milestone: BacklogAlpha 24
Owner: set to elexis
Patch: Phab:D2559

comment:5 by elexis, 4 years ago

Resolution: fixed
Status: newclosed

In 23401:

Don't continue to process pending netmessages in the gamesetup_mp page after having switched to the session page already.

If there was a player rejoining, a "start" and a "playerassignments" message pending for that player, the playerassignments message would not update the assignments of the session page, but the one of the obsolete gamesetup_mp page, thus triggering "unknown player" messages for that player during the session.
This seems to only happen when two players rejoin simultaneously and with very rare timing / race condition, and thus was not reproducible without modifying the code.
Issue since rP10452.

Differential Revision: https://code.wildfiregames.com/D2559
Should fix #5660.

Note: See TracTickets for help on using tickets.