Opened 7 years ago

Last modified 6 years ago

#4641 new enhancement

Lagwarnings for lost packets

Reported by: elexis Owned by: elexis
Priority: Must Have Milestone: Backlog
Component: Network Keywords:
Cc: Patch:

Description

Abstract: Testing shows that two dropped packets, as measured by enet, most often induce a lagspike (for about the duration the drop is seen in the profiler), so it should be shown to the players (just as in comment:5:ticket:3264).


Existing ping-warnings:

r17730 implemented a warning text, shown to all clients in the top right-hand corner of the screen, if some client's ping exceeded the limit required for the game not to stutter.

In these 17 months of testing it has detected most laggers reliably.

Counterexample:

Vercingetorix somehow introduced lag to dozens of games without it showing up in the lagwarnings.

The characteristics of this specific lag are 500ms turns being simulated in 500ms, then 500-1500ms of simulation pause. The pause comes from not receiving the ack of that player to process the next turn.

So in order to play the first 20 minutes of a game, you often had to sit in front of the screen for 45 minutes, annoyed at having to wait about 50% of the time in a frustrating repetition of 1-2 second pauses.
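The numbers above can be checked with a little arithmetic. The following is only a rough sketch: the function names are hypothetical, the mean pause of 1000ms is simply the midpoint of the 500-1500ms range, and the share of stalled turns (62.5%) is an assumption chosen so that the result matches the 45 minutes reported here. It was not measured.

```python
TURN_MS = 500  # turn length stated in this ticket


def wall_clock_minutes(game_minutes, mean_pause_ms, stalled_fraction=1.0):
    """Real time needed to play `game_minutes` of simulated time.

    `stalled_fraction` is the (assumed) share of turns followed by a pause.
    """
    turns = game_minutes * 60_000 / TURN_MS
    wall_ms = turns * (TURN_MS + stalled_fraction * mean_pause_ms)
    return wall_ms / 60_000


def waiting_share(game_minutes, wall_minutes):
    """Fraction of the real time that was spent waiting."""
    return (wall_minutes - game_minutes) / wall_minutes


# Assumption: pauses average 1000ms and follow 62.5% of the turns.
wall = wall_clock_minutes(20, mean_pause_ms=1000, stalled_fraction=0.625)
print(wall, waiting_share(20, wall))
```

Under these assumptions 20 minutes of game take 45 minutes of real time, of which roughly 56% is spent waiting. If every single turn were followed by a pause of 1000ms, the same 20 minutes would take 60 minutes.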

The cpu/simulation not being able to compute turns fast enough ("simulation lag", see also #3890) can be excluded, because it often occurs right after the gamestart when players have 5 population and the map is empty.

Experiment: These games with unreasonable lag occurred again today. Vercingetorix was very cooperative in helping me debug the issue and joined my testhost. No other client connected, so the results are unambiguous.

Results: With F11 we can open the ingame profiler. It also shows network statistics after pressing the button 4 times. No lag was perceived in the gamesetup (150-200ms ping, mostly 0 dropped packets). The map was set to a tiny anatolian plateau map. It has almost no entities, so it's the most performant map and both players could observe about 200 FPS. The initial cavalry was set to patrolling, so it was easy to notice the slightest occurrence of lag (because the unit would not move while playing the move animation if there was a lagspike).

Unexpectedly but luckily, the lag-for-no-reason was reproduced for about 5-10 minutes, until we had seen enough and the game was closed. 80% of the time there were 0 dropped packets, but 15% of the time there were 1-2 dropped packets and sometimes 3 (on my end). Vercingetorix noticed roughly the same numbers, but sometimes even greater ones than I had seen (for example, I never saw 5 dropped packets in my network statistics).

Comparison (1): The test with Vercingetorix by itself showed a strong correlation between dropped packets and lagspikes. Another 5min testgame was hosted with borg- and mvdiogo separately. More than 95% of the time there were 0 dropped packets, but on 3 occasions there were 2 dropped packets for all clients. The lag correlated again with the dropped packets (once, 2 packets were dropped without noticeable lag, but false positives occasionally occur for the ping lagwarnings too, if the lag occurred for only one turn). Another of these 3 occasions showed a ping lagwarning as well.

Comparison (2): In a game with two unknown lobby players there were 0 dropped packets constantly and no lag was observed.

Some more tests were done, but they were inconclusive (because the simulation was slow, or because many players were connected and I was not the host). A good comparison would be hosting a game with 16 clients in which all players but one lose only 0 to 1 packets while one exceeds that rate, and in which disconnecting that player fixes the lag.
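The check proposed in this ticket could look roughly like the following sketch. It is not the game's code: the function name, the client names and the data layout are hypothetical, and the threshold of 2 is taken from the observation in the abstract that two dropped packets most often coincide with a lagspike.

```python
DROPPED_PACKET_THRESHOLD = 2  # from the abstract; an assumption, not a tuned value


def clients_to_warn(dropped_by_client, threshold=DROPPED_PACKET_THRESHOLD):
    """Return the names of the clients whose dropped-packet count reaches the threshold."""
    return sorted(
        name for name, dropped in dropped_by_client.items() if dropped >= threshold
    )


# Hypothetical 16-client game: 15 clients lose 0 to 1 packets, one loses 3.
stats = {f"client{i:02d}": i % 2 for i in range(1, 16)}
stats["client16"] = 3
print(clients_to_warn(stats))
```

In the 16-client comparison described above, only the one client exceeding the rate would be listed, which is the player the lagwarning should name.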

Change History (1)

comment:1 by elexis, 6 years ago

Owner: set to elexis
Summary: Lagwarnings for dropped packets → Lagwarnings for lost packets

Authored but not (yet) published, refs #3787.

The problem with this packet loss measure is that enet updates it only every 10 seconds and when it does, it still includes the packet loss ratio from times before then.

So when the application presents a new number to the user, he is misled into believing that this number represents the packet loss since the last update.

Quick changes in the packet loss don't manifest quickly enough. If there is 0% packet loss but then some seconds with 50%, it will display something like 2%.

If the user somehow changed his network settings and went from 20% loss to 0% loss, it will still report a bad connection for a minute or more.

Edit: and because these are constants, we can't change them in enet without patching upstream.
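The effect described in this comment can be illustrated with a small simulation. This is a sketch under assumptions, not enet's code: it models the reported value as a moving average that is updated once per 10 second interval (as stated above) and moves 1/8 of the way towards the loss measured in that interval. The weight of 1/8 and all names are assumptions made for illustration.

```python
UPDATE_INTERVAL_S = 10    # update period stated in this comment
SMOOTHING_WEIGHT = 1 / 8  # assumed weight of the newest interval


def smoothed_loss(reported, interval_loss, weight=SMOOTHING_WEIGHT):
    """Move the reported loss (in percent) a fraction of the way to the new measurement."""
    return reported + (interval_loss - reported) * weight


def simulate(interval_losses, start=0.0):
    """Return the reported loss after each update, given the loss of each interval."""
    reported = start
    history = []
    for loss in interval_losses:
        reported = smoothed_loss(reported, loss)
        history.append(reported)
    return history


# 0% loss, then 4 seconds of 50% loss inside one 10 second interval (20% for that interval).
print(simulate([20.0]))
# 20% loss, then a perfect connection for 6 updates (60 seconds).
print(simulate([0.0] * 6, start=20.0))
```

With these assumed constants the spike is reported as 2.5%, and a connection that went from 20% to 0% loss is still reported at about 9% after a full minute, which matches the behaviour described above.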

Last edited 6 years ago by elexis