Opened 10 years ago

Closed 10 years ago

#2808 closed defect (fixed)

Out of memory / possible leaks

Reported by: historic_bruno
Owned by:
Priority: Release Blocker
Milestone: Alpha 17
Component: Core engine
Keywords:
Cc:
Patch:

Description

Basically every Windows player crashed in our multiplayer games today after running out of memory. The likely cause is the 32-bit address space limit rather than the OS itself. Memory usage creeps up in the lobby by 1 MB/s at times.

Attachments (7)

crashlog.rar (21.8 KB) - added by Itms 10 years ago.
crashlog.txt (22.2 KB) - added by Pureon 10 years ago.
My crashlog from the same game
memory_leaks.log (186.8 KB) - added by Yves 10 years ago.
valgrind_mempool.diff (1.4 KB) - added by leper 10 years ago.
wip, still some remaining issues, but a lot cleaner output
massif_output_1v1_singleplayer.txt (54.1 KB) - added by Yves 10 years ago.
ShrinkingGCHack.diff (2.5 KB) - added by Yves 10 years ago.
A hack that calls very frequent shrinking GCs for testing
GC_Scheduling_PATCH2_WIP_v1.diff (4.5 KB) - added by Yves 10 years ago.
An attempt to prevent situations where SpiderMonkey needs to trigger GC because it's low on memory. (for testing)


Change History (14)

comment:1 by historic_bruno, 10 years ago

I have a full memory dump available if it helps; it is about 300 MB compressed.

by Itms, 10 years ago

Attachment: crashlog.rar added

by Pureon, 10 years ago

Attachment: crashlog.txt added

My crashlog from the same game

comment:2 by Yves, 10 years ago

In 15787:

Tunes GC scheduling a bit to reduce memory usage.

The main problem was that GC was only called from the simulation before this patch. This means when you were waiting in the multiplayer lobby or just had the GUI open, it only called GC when getting close to the JS runtime size limit (I assume). Another problem was the Net Server runtime which didn't GC either. Here the runtime size limit is 16 MB though, so it's not too terrible. These issues have both been addressed and GC has been given a bit more time per incremental slice to make sure it gets done in time. It's still far from perfect, but there are too many changes in SpiderMonkey related to GC, so I don't want to spend too much time on this yet.

Refs #2808

comment:3 by Yves, 10 years ago

The main problem doesn't seem to be related to GC. Today after playing a long multiplayer game, memory usage of pyrogenesis was at 1.2 GB on the main menu. It looks like there are really some memory leaks.

I've used valgrind with the following command to get information about memory leaks:

valgrind --smc-check=all-non-file --track-origins=yes --show-leak-kinds=definite --leak-check=full ./pyrogenesis 2> memory_leaks.log

I've started a short singleplayer game on the Acropolis default map against 1 AI. It was quite short because the performance was terrible with the memchecking enabled (<1 FPS). The log is attached. I'm not used to analyzing this output, so I don't know yet whether these are real memory leaks or false positives.

by Yves, 10 years ago

Attachment: memory_leaks.log added

comment:4 by historic_bruno, 10 years ago

It may not truly be a "leak" if it gets freed before the game shuts down. You could try using a memory analysis tool to see what is allocating memory, how much and when. I'm doing something like that on Windows for #2735.

by leper, 10 years ago

Attachment: valgrind_mempool.diff added

wip, still some remaining issues, but a lot cleaner output

comment:5 by Yves, 10 years ago

  • The VFS uses a cache size of 500 MiB on my system, which is the maximum value (check ChooseCacheSize in GameSetup.cpp). For the replay mode, the size is hard-coded to 20 MiB. Apparently this cache is also used for textures, so I expect the whole cache to be used up quite easily.
  • SpiderMonkey does not de-commit memory (give it back to the OS) in the current setup; it just marks it as "free" internally during GC. This generally means that after a long game, the runtime size will be close to the maximum of 384 MB and will not go down anymore. I've added a patch that enables very aggressive (frequent) shrinking GCs for testing purposes.

So if we add only these two causes, we get around 900 MB, which is already quite close to the 1.2 GB I observed in the lobby after the first match.
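As a rough check of that figure, the two contributions can simply be added up. This is only a sketch: the constant and function names are made up for illustration, treating the 384 MB runtime limit as MiB is an assumption, and reading 1.2 GB as roughly 1200 MiB is an approximation.

```cpp
#include <cstdint>

// Figures quoted in this comment. Treating both as MiB is an assumption.
constexpr std::uint64_t kVfsCacheMiB = 500;   // maximum VFS cache size
constexpr std::uint64_t kJsRuntimeMiB = 384;  // SpiderMonkey runtime size limit

// Memory explained by the two causes listed above.
constexpr std::uint64_t CombinedMiB()
{
    return kVfsCacheMiB + kJsRuntimeMiB;
}

// Memory not explained by those two causes, given an observed process size.
// Returns 0 if the observed size is below the combined figure.
constexpr std::uint64_t UnaccountedMiB(std::uint64_t observedMiB)
{
    return observedMiB > CombinedMiB() ? observedMiB - CombinedMiB() : 0;
}
```

CombinedMiB() gives 884, which matches the "around 900 MB" estimate, and UnaccountedMiB(1200) leaves about 316 for everything else in the process.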

by Yves, 10 years ago

Attachment: ShrinkingGCHack.diff added

A hack that calls very frequent shrinking GCs for testing

by Yves, 10 years ago

Attachment: GC_Scheduling_PATCH2_WIP_v1.diff added

An attempt to prevent situations where SpiderMonkey needs to trigger GC because it's low on memory. (for testing)

comment:6 by Yves, 10 years ago

In 15831:

Modify GC scheduling and reduce VFS cache size.

It seems like there is a memory leak if we haven't finished with the marking phase of an incremental GC and SpiderMonkey has to trigger a full GC because it runs out of memory. With this patch we stop trying to make incremental GCs if we are above 1/2 of the runtime size and do full GCs instead. This should make such low memory conditions even less likely than they were already after the previous patch. Also reduce the maximum VFS cache size to 400 MB.

Refs #2808
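
The scheduling rule described in this commit message can be sketched as a small decision function. This is a minimal illustration, not the code from the changeset: GCMode and ChooseGCMode are hypothetical names, and the sketch calls no SpiderMonkey API. It only shows the "above 1/2 of the runtime size" threshold.

```cpp
#include <cstddef>

// Hypothetical names for illustration; not taken from the changeset.
enum class GCMode { Incremental, Full };

// Choose the GC mode from current heap usage and the runtime size limit.
// Above half of the limit, skip incremental slices and do a full GC, so
// that a half-finished marking phase is less likely to be interrupted by
// a full GC that the engine triggers itself when it runs out of memory.
inline GCMode ChooseGCMode(std::size_t usedBytes, std::size_t runtimeLimitBytes)
{
    return usedBytes > runtimeLimitBytes / 2 ? GCMode::Full : GCMode::Incremental;
}
```

With a limit of 384 units, ChooseGCMode(100, 384) selects an incremental GC and ChooseGCMode(193, 384) selects a full GC. A usage of exactly half the limit still counts as incremental in this sketch, because the message says "above".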

comment:7 by historic_bruno, 10 years ago

Resolution: fixed
Status: new → closed

This is greatly improved; people aren't reporting OOM errors like they were. In general, the game's memory use needs to be carefully examined (for fragmentation and unnecessary allocs), but this one issue was more about SpiderMonkey's GC and the cache size, which Yves addressed. We didn't find any memory leaks of note.
