Opened 18 months ago

Last modified 15 months ago

#6699 new defect

[macOS] Memory issue (sandboxed, spindump, ReportCrash) Spidermonkey 91

Reported by: Langbart Owned by:
Priority: Must Have Milestone: Backlog
Component: Core engine Keywords:
Cc: Patch: Phab:D4911

Description (last modified by Langbart)

to consistently reproduce (1)

  • add this debug message in chatHelper.js
  • binaries/data/mods/public/simulation/ai/petra/chatHelper.js

    a    b    PETRA.chatNewDiplomacy = function(gameState, player, newDiplomaticStance)
    220  220
    221  221  PETRA.chatAnswerRequestDiplomacy = function(gameState, player, requestType, response, requiredTribute)
    222  222  {
         223      warn(uneval(['debug', gameState, player, requestType, response, requiredTribute]))
    223  224      Engine.PostCommand(PlayerID, {
    224  225          "type": "aichat",
    225  226          "message": "/msg " + gameState.sharedScript.playersData[player].name + " " +
  • start the attached replay from ticket #6654 (~2mins long)
  • the game freezes at the 97th turn
  • forcefully quit the pyrogenesis app
  • check activity monitor for spindump
    • In the case described above, the process that allocates large amounts of memory is always spindump; in other situations it can also be sandboxed or ReportCrash.
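The Activity Monitor check can also be approximated from a terminal. The following is only a minimal sketch, assuming a POSIX `ps` and `awk`; `filter_diag` and `list_diag` are hypothetical helper names, and the pattern accepts both `sandboxed` and `sandboxd`, since both spellings appear in this ticket:

```shell
# Filter "RSS(KiB) command" lines for the three processes named in this ticket
# and print their resident memory in MiB.
filter_diag() {
    awk '$2 ~ /(spindump|sandboxe?d|ReportCrash)$/ { printf "%s %.1f MiB\n", $2, $1 / 1024 }'
}

# -A: all processes; "rss=" and "comm=" print resident set size and command without headers.
list_diag() {
    ps -A -o rss= -o comm= | filter_diag
}

# Only call ps where it is available.
if command -v ps >/dev/null 2>&1; then
    list_diag
fi
```

Calling `list_diag` repeatedly after triggering the freeze would show whether one of these processes keeps growing.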

Note: To be absolutely sure, restart the PC and test it again.

to consistently reproduce (2)

  • pass this to the terminal and run
    pyrogenesis -replay-visual="/Users/paria\ crash"
    
  • the game crashes with the following message

  • check activity monitor for ReportCrash
    • same as above with spindump: if I don't stop the process manually, the PC becomes unusable over time
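The path in (2) is wrapped in double quotes and also contains a backslash. In a POSIX shell, a backslash in front of a space inside double quotes is kept literally, so the program receives the backslash as part of the path, which matches the `Path /Users/paria\ crash` line in the output posted in comment:23. A small sketch, with the hypothetical `show_arg` standing in for pyrogenesis:

```shell
# Stand-in for pyrogenesis: print the first argument exactly as the program would receive it.
show_arg() { printf '[%s]\n' "$1"; }

show_arg "/Users/paria\ crash"   # double quotes keep the backslash
show_arg /Users/paria\ crash     # unquoted: the backslash only escapes the space
show_arg "/Users/paria crash"    # quotes alone are enough to protect the space
```

Only the first form passes a backslash through, which is what makes it a reliable crash trigger here.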

to consistently reproduce (3)

  • the sandboxed process appears when I try to start a replay from the built-in command line within the VSCodium (based on VSCode) app.
    pyrogenesis -replay-visual=<here is your path to a legit replay, does not matter which one>
    

bisect

[27409]

workaround

  • Go into Recovery mode cmd+R, select the Terminal
    • it needs to happen in Recovery mode because you can't edit those files below in a normal session
  • navigate to /Volumes/{The name of your drive}/System/Library/
# disable the ReportCrash issue
mv LaunchAgents/com.apple.ReportCrash.plist LaunchAgents/com.apple.ReportCrash.bak
mv LaunchDaemons/com.apple.ReportCrash.Root.plist LaunchDaemons/com.apple.ReportCrash.Root.plist.bak

# disable the sandboxed issue
mv LaunchDaemons/com.apple.sandboxd.plist LaunchDaemons/com.apple.sandboxd.plist.bak 

# disable the spindump issue
mv LaunchDaemons/com.apple.spindump.plist LaunchDaemons/com.apple.spindump.plist.bak
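
To undo the workaround, the renames have to be reversed from the same directory. The following is a dry-run sketch only, assuming the backups were created with the `.bak` names used above; `original_name` and `restore_dir` are hypothetical helpers, and any unrelated `*.bak` files in these folders would be listed as well:

```shell
# Map a backup name to the original plist name (handles "X.plist.bak" and "X.bak").
original_name() {
    name="${1%.bak}"
    case "$name" in
        *.plist) printf '%s\n' "$name" ;;
        *)       printf '%s\n' "$name.plist" ;;
    esac
}

# Print the mv command for every *.bak file in the given directory.
# Remove "echo" to actually rename the files.
restore_dir() {
    for f in "$1"/*.bak; do
        [ -e "$f" ] || continue
        echo mv "$f" "$1/$(original_name "${f##*/}")"
    done
}

restore_dir LaunchAgents
restore_dir LaunchDaemons
```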

related ticket

Attachments (9)

terminal.png (54.1 KB ) - added by Langbart 18 months ago.
term_report.png (79.8 KB ) - added by Langbart 18 months ago.
ReportCrash.png (31.7 KB ) - added by Langbart 18 months ago.
out.png (267.7 KB ) - added by Langbart 18 months ago.
spindump.txt (942.5 KB ) - added by Langbart 18 months ago.
sandboxed.txt (163.9 KB ) - added by Langbart 18 months ago.
ReportCrash.txt (1.9 MB ) - added by Langbart 18 months ago.
mozalloc_abort.png (42.1 KB ) - added by Langbart 18 months ago.
mac_no_door.png (191.5 KB ) - added by Langbart 18 months ago.

Change History (52)

by Langbart, 18 months ago

Attachment: terminal.png added

comment:1 by Stan, 18 months ago

In your case it's printing the game state thousands of times that crashes the game. Is it just a way to make it faster? Can't be Vulkan if it happened before. So it's most likely SpiderMonkey, but it's probably because we're doing something it does not like.

in reply to:  1 comment:2 by Langbart, 18 months ago

Replying to Stan:

In your case it's printing the game state thousands of times that crashes the game. Is it just a way to make it faster? Can't be Vulkan if it happened before. So it's most likely SpiderMonkey, but it's probably because we're doing something it does not like.

The issue is not about lldb (ignore that); it is about an external process called spindump spinning up.

comment:3 by wraitii, 18 months ago

@langbart -> Did this happen to you in regular play? Or just when adding some debugs?

comment:4 by Stan, 18 months ago

spindump and sandboxd are macOS diagnostic applications apparently.

Last edited 18 months ago by Stan (previous) (diff)

in reply to:  3 comment:5 by Langbart, 18 months ago

Replying to wraitii:

@langbart -> Did this happen to you in regular play? Or just when adding some debugs?

  • I have also noticed it in a regular game but have not found consistently reproducible steps.

comment:6 by Langbart, 18 months ago

Description: modified (diff)

The description was extended by another possibility to reproduce the problem consistently.

by Langbart, 18 months ago

Attachment: term_report.png added

by Langbart, 18 months ago

Attachment: ReportCrash.png added

comment:7 by Langbart, 18 months ago

Description: modified (diff)
  • Remove bisect number
    • conflicting test results, could still have been due to spidermonkey
    • reproducing the steps with the released A26 version did not show any problems, so the problem must be with macOS or somewhere in the development cycle of A27
  • lldb has been removed as it is not needed for the problem.
  • Reproduction steps remain valid

comment:8 by Langbart, 18 months ago

Installed macOS Dependencies

❯ uname -a
Darwin Paria 19.6.0 Darwin Kernel Version 19.6.0: Tue Jun 21 21:18:39 PDT 2022; root:xnu-6153.141.66~1/RELEASE_X86_64 x86_64

❯ xcode-select --version
xcode-select version 2373.

❯ rustc --version
rustc 1.66.1 (90743e729 2023-01-10) (built from a source tarball)

❯ cmake --version
cmake version 3.25.2

in reply to:  7 comment:9 by Vladislav Belov, 18 months ago

Replying to Langbart:

  • Remove bisect number

I suppose the name is outdated then.

comment:10 by Stan, 18 months ago

Have you updated macOS recently? Both examples you mentioned produce actual crashes; maybe that's a requirement. E.g. your second example just makes it crash with an unsupported path.

in reply to:  10 comment:11 by Langbart, 18 months ago

  • terminal output: fresh install with --force-rebuild on library folder (latest GIT [27471])
  • issue remains as described in the description
  • Rename the title without including the alleged changeset, since it is no longer certain that this is the reason for the problem

Replying to Stan:

Have you updated macOS recently?

  • always macOS 10.15.7
  • only rust/cmake are updated via homebrew when new updates are released

Both examples you mentioned produce actual crashes; maybe that's a requirement. E.g. your second example just makes it crash with an unsupported path.

The crash doesn't matter, it's just the trigger. It's about the external process eating up my memory. Crashes happened with A26 in testing as well, but never caused this external process to allocate all my memory.

comment:12 by Langbart, 18 months ago

Summary: [macOS] Memory issue with 27412 → [macOS] Memory issue (sandboxed, spindump, ReportCrash)

comment:13 by Langbart, 18 months ago

two videos

A26 release version

  • all good, no over allocation of memory after the crash

https://ttm.sh/S0T.5x.mp4

current pyrogenesis build

  • the problem occurs as described in the description

https://ttm.sh/S0p.5x.mp4

comment:14 by Stan, 18 months ago

Does it do something different in release or debug modes? I know it's tedious, but since it seems reproducible, can you try to bisect again?

I'll see if I can try on my macOS (I have Monterey).

We need to figure out why macos does weird stuff with the game. When you say A26 you mean the bundle or a version you compiled yourself?

Last edited 18 months ago by Stan (previous) (diff)

in reply to:  14 comment:15 by Langbart, 18 months ago

Replying to Stan:

When you say A26 you mean the bundle or a version you compiled yourself?

Yes, the official version downloaded with Homebrew.

spindump points at [27412], ReportCrash points at [27409]

I don't know how to bisect that.
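
For reference, this is roughly what an automated bisect looks like in Git. It is a sketch on a throwaway repository with made-up commits, not on the real tree. For this ticket the test step would have to be replaced by building, running one of the reproduction steps and checking for the runaway process, which probably means marking each revision by hand with `git bisect good` or `git bisect bad`:

```shell
# Throwaway repository standing in for the real one; commit names are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.invalid
git config user.name demo

# Eight commits; the "regression" is introduced in r5.
for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -lt 5 ]; then echo good > state; else echo bad > state; fi
    echo "$i" > rev
    git add . && git commit -q -m "r$i"
done

# Mark the newest commit as bad and the oldest as good, then let Git drive the search.
git bisect start HEAD HEAD~7 >/dev/null
# The test command must exit 0 for a good revision and 1-124 for a bad one.
git bisect run sh -c 'grep -q good state' >/dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad"
git bisect reset >/dev/null 2>&1
```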


the sandboxed issue occurs when I try to start a replay from the built-in command line within the VSCodium (based on VSCode) app.

Last edited 18 months ago by Langbart (previous) (diff)

comment:16 by Stan, 18 months ago

Could you try a self compiled A26? What if you run outside of vscodium?

Last edited 18 months ago by Stan (previous) (diff)

comment:17 by Stan, 18 months ago

Just tested on macos, I cannot reproduce.

in reply to:  16 comment:18 by Langbart, 18 months ago

Replying to Stan:

Could you try a self compiled A26?

That is not on my wish list today, and I don't think I need to. I compiled with [27408] and none of the problems listed here occur. All problems came with [27409] and above.

What if you run outside of vscodium?

then the sandboxed issue does not occur


As for the ReportCrash process, it fills my Console (an application installed by default on macOS) with "out of bounds abstract origin" messages and keeps going until I kill the process.

Last edited 18 months ago by Langbart (previous) (diff)

by Langbart, 18 months ago

Attachment: out.png added

comment:19 by Stan, 18 months ago

Maybe VSCode is sandboxed. Is there any more info about those warnings?

comment:20 by Langbart, 18 months ago

Yes, there is the possibility to create a sample from the Activity Monitor for any process. Here are the samples that were created once I triggered the issue.

by Langbart, 18 months ago

Attachment: spindump.txt added

by Langbart, 18 months ago

Attachment: sandboxed.txt added

by Langbart, 18 months ago

Attachment: ReportCrash.txt added

comment:21 by Langbart, 18 months ago

Description: modified (diff)

comment:22 by Vladislav Belov, 18 months ago

Could you update to r27412 and revert only the vulkan folder from it?

in reply to:  14 comment:23 by Langbart, 18 months ago

Replying to Stan:

Does it do something different in release or debug modes?

ReportCrash issue

  • the line below that starts with Hit MOZ_CRASH()... was revealed
    ❯ pyrogenesis_dbg -replay-visual="/Users/paria\ crash"
    Path /Users/paria\ crash, separator /
    Function call failed: return value was -100303 (path contains both slash and backslash separators)
    Location: path.h:292 (DetectSeparator)
    
    Call stack:
    
    (error while dumping stack: Function not supported)
    errno = 0 (No error reported here)
    OS error = ?
    
    
    (C)ontinue, (S)uppress, (B)reak, Launch (D)ebugger, or (E)xit?
    e
    Redirecting call to abort() to mozalloc_abort
    
    Hit MOZ_CRASH() at /Users/paria/Developer/0ad/libraries/source/spidermonkey/mozjs-91.13.1/memory/mozalloc/mozalloc_abort.cpp:33
    zsh: segmentation fault  pyrogenesis_dbg -replay-visual="/Users/paria\ crash"
    

sandboxed and spindump issue

  • pyrogenesis_dbg freezes or is just too slow to reproduce the problem in this mode (see #6702)
Last edited 18 months ago by Langbart (previous) (diff)

by Langbart, 18 months ago

Attachment: mozalloc_abort.png added

comment:24 by Stan, 18 months ago

Too bad. What about Vlad's suggestion?

in reply to:  17 ; comment:25 by Langbart, 18 months ago

Replying to Stan:

Just tested on macos, I cannot reproduce.

You have tried all three ways to trigger the problem and no process has appeared that has accumulated large amounts of memory?

Replying to Vladislav Belov:

Could you update to r27412 and revert only the vulkan folder from it?

Did so, but it ends with a fatal error.

# hash equivalent to [27412]
git checkout bec344d807e83ce3d05ffd5713c5a53dcaea7ad2

# confirm we are on the right changeset
git log -1 | grep -Eo "git-svn-id: [^ ]+"
# git-svn-id: https://svn.wildfiregames.com/public/ps/trunk@27412

# revert only the vulkan folder change
git checkout HEAD~1 source/renderer/backend/vulkan/

# start building
build 

...

fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
3 errors generated.
make[1]: *** [obj/graphics_Release/DeviceCommandContext2.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [obj/graphics_Release/Buffer2.o] Error 1
10 errors generated.
make[1]: *** [obj/graphics_Release/DescriptorManager.o] Error 1
make: *** [graphics] Error 2
Last edited 18 months ago by Langbart (previous) (diff)

comment:26 by Stan, 18 months ago

Only 2) For now (I'm mostly AFK during the day). Will try 1) later today. 3) Sounds like something that is because of vscode but maybe I'm wrong.

in reply to:  25 comment:27 by Vladislav Belov, 18 months ago

Replying to Langbart:

Did so, but it ends with a fatal error.

It seems the added files weren't removed completely. The source/renderer/backend/vulkan folder should contain only 3 files: Device.h, Device.cpp and DeviceForward.h.
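
One possible explanation, assuming default Git behavior: `git checkout <commit> -- <path>` works in overlay mode, so it restores files that exist in the older commit but never deletes files that were added afterwards. The sketch below reproduces this on a throwaway repository with made-up file contents; `--no-overlay` needs Git 2.22 or newer:

```shell
# Throwaway repository to demonstrate the difference; contents are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.invalid
git config user.name demo

mkdir vulkan
echo old > vulkan/Device.cpp
git add . && git commit -q -m "before"

echo new > vulkan/Device.cpp
echo extra > vulkan/Buffer.cpp
git add . && git commit -q -m "after"

# Default (overlay) mode: Device.cpp is restored, but the newly added Buffer.cpp stays.
git checkout HEAD~1 -- vulkan/
ls vulkan

# No-overlay mode: files that do not exist in HEAD~1 are removed as well.
git checkout --no-overlay HEAD~1 -- vulkan/
ls vulkan
```

After the second checkout only `Device.cpp` is left in the folder, with its old contents.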

comment:28 by Langbart, 18 months ago

Description: modified (diff)
Summary: [macOS] Memory issue (sandboxed, spindump, ReportCrash) → [macOS] Memory issue (sandboxed, spindump, ReportCrash) Spidermonkey 91
  • in short, [27412] did not cause any of the problems described here; to conclusively test spindump you also have to restart the PC, and this led me to falsely blame [27412]
  • everything described here (the spindump, sandboxed and ReportCrash issues) came with [27409]
  • The only thing left to do is to try a macOS update and test again
Last edited 18 months ago by Langbart (previous) (diff)

comment:29 by Stan, 18 months ago

Method 1

Applied

  • binaries/data/mods/public/simulation/ai/petra/chatHelper.js

     
    220  220
    221  221  PETRA.chatAnswerRequestDiplomacy = function(gameState, player, requestType, response, requiredTribute)
    222  222  {
         223      warn(uneval(['debug', gameState, player, requestType, response,
         224  requiredTribute]))
    223  225      Engine.PostCommand(PlayerID, {
    224  226          "type": "aichat",
    225  227          "message": "/msg " + gameState.sharedScript.playersData[player].name + " " +

Ran

$ ./pyrogenesis -replay-visual="../../../Downloads/commands.txt"

Note to self: we don't support ~ in paths. The game didn't crash. I've got no spindump, sandboxd, or ReportCrash (since it didn't crash, I assume). Instead, at turn 97 I get:

ERROR: JavaScript error: simulation/ai/petra/chatHelper.js line 223
allocation size overflow
  PETRA.chatAnswerRequestDiplomacy@simulation/ai/petra/chatHelper.js:223:10
  PETRA.DiplomacyManager.prototype.checkEvents@simulation/ai/petra/diplomacyManager.js:217:10
  PETRA.DiplomacyManager.prototype.update@simulation/ai/petra/diplomacyManager.js:524:7
  PETRA.HQ.prototype.update@simulation/ai/petra/headquarters.js:2290:24
  PETRA.PetraBot.prototype.OnUpdate@simulation/ai/petra/_petrabot.js:118:11
  m.BaseAI.prototype.HandleMessage@simulation/ai/common-api/baseAI.js:64:7

Then it continues as expected considering this fatal error, albeit really slowly.

Turn 97 (200)...
ERROR: JavaScript error: simulation/ai/petra/chatHelper.js line 223
allocation size overflow
  PETRA.chatAnswerRequestDiplomacy@simulation/ai/petra/chatHelper.js:223:10
  PETRA.DiplomacyManager.prototype.checkEvents@simulation/ai/petra/diplomacyManager.js:217:10
  PETRA.DiplomacyManager.prototype.update@simulation/ai/petra/diplomacyManager.js:524:7
  PETRA.HQ.prototype.update@simulation/ai/petra/headquarters.js:2290:24
  PETRA.PetraBot.prototype.OnUpdate@simulation/ai/petra/_petrabot.js:118:11
  m.BaseAI.prototype.HandleMessage@simulation/ai/common-api/baseAI.js:64:7
Turn 98 (200)..

Method 2

Ran

$ ./pyrogenesis -replay-visual="/Users/paria\ crash"

Got:

Path /Users/paria\ crash, separator /
Function call failed: return value was -100303 (path contains both slash and backslash separators)
Location: path.h:292 (DetectSeparator)

Call stack:

(error while dumping stack: Function not supported)
errno = 0 (No error reported here)
OS error = ?


(C)ontinue, (S)uppress, (B)reak, Launch (D)ebugger, or (E)xit?

Pressed c.

Got:

path.h(292): Function call failed: return value was -100303 (path contains both slash and backslash separators)
ERROR: The requested replay file '/Users/paria\ crash' does not exist!
stan@MacBook-Pro-de-Stanislas system % 

I've got no spindump, sandboxd, or ReportCrash

Method 3

Open VSCode

$  ./pyrogenesis -replay="../../../Downloads/commands.txt"

I've got no spindump, sandboxd, or ReportCrash processes.

I'm using macOS 12.6

comment:30 by Langbart, 18 months ago

Description: modified (diff)

Ok, Stan. I also did not find anything on spidermonkey.dev and bugzilla.mozilla.org about those processes being mentioned in the context described here.

  • will try out macOS Ventura in the coming weeks and share the results
  • move the ticket to Backlog for now ?

  • Added workaround to disable these processes manually to the description

comment:31 by Stan, 18 months ago

I'd like you to try a bundle first if you do not mind? I will try to upload it when I get home. Thanks again for the very detailed reports, it means a lot to me :)

in reply to:  31 comment:32 by Langbart, 18 months ago

Description: modified (diff)

Replying to Stan:

I'd like you to try a bundle first if you do not mind?

Ok


  • added links to spidermonkey and bugzilla to the description

comment:33 by Stan, 18 months ago

There you go. https://releases.wildfiregames.com/rc/0ad-0.0.27.1-alpha-aarch64.dmg

  • I tried to sign it but it obviously didn't work, so you'll have to run xattr -cr on it
  • I tried to bundle MoltenVK, but it failed too.
  • Let me know if it triggers the same weird behavior.

Thanks a bunch!

Last edited 18 months ago by Stan (previous) (diff)

in reply to:  33 comment:34 by Langbart, 18 months ago

Replying to Stan:

There you go.

System info

intel mac, will not allow me to

on the releases.wildfiregames.com there are two versions:

0ad-0.0.26-alpha-osx-aarch64.dmg    2022-09-24  1564.88 MB  md5 sha1    minisig torrent http
0ad-0.0.26-alpha-osx64.dmg  2022-09-24  1581.57 MB  md5 sha1    minisig torrent http

on the releases.wildfiregames.com/rc/ is just one

0ad-0.0.27.1-alpha-aarch64.dmg  2023-01-22  1649.85 MB  md5 sha1    http

by Langbart, 18 months ago

Attachment: mac_no_door.png added

comment:35 by Stan, 18 months ago

Really sorry will make an intel one in a few hours

comment:36 by Stan, 18 months ago

Tell me if this one works https://releases.wildfiregames.com/rc/0ad-0.0.27.1-alpha-x86_64.dmg

Vulkan should work too if you enable the bundled 0ad-spirv mod :)

in reply to:  36 comment:37 by Langbart, 18 months ago

Replying to Stan:

rc/0ad-0.0.27.1-alpha-x86_64.dmg

None of the problems described here can be reproduced with the app you sent me.
Did you use 10.12 ?

wraitii wrote in #6193

From what I can tell, this doesn't happen:

  • when compiled against the 10.12 SDK (as does the CI)

comment:38 by Stan, 18 months ago

Yes I did use 10.12. :)

in reply to:  38 comment:39 by Langbart, 18 months ago

Replying to Stan:

Yes I did use 10.12. :)

Shall I close it and link to #6193 ?

  • This is quite an annoying bug; I have to keep a close watch on memory or disable some LaunchDaemons/LaunchAgents.
  • will try a macOS update this week
Last edited 18 months ago by Langbart (previous) (diff)

comment:40 by Stan, 18 months ago

Milestone: Alpha 27 → Backlog
Priority: Release Blocker → Must Have

I'm gonna leave it open as it seems slightly different. But at least we have a workaround now. Thank you so much for your time on this, I hope we'll find a more satisfactory fix.

comment:41 by phosit, 15 months ago

Patch: Phab: D4911

Add a possible fix.

comment:42 by Stan, 15 months ago

Patch: Phab: D4911 → Phab:D4911

comment:43 by phosit, 15 months ago

Can the second issue also be triggered if you run pyrogenesis -replay="/Users/paria\ crash"?

If so, can it also be triggered if you insert an ENSURE(false); before replayFile is declared in source/main.cpp L525 and run pyrogenesis without arguments?
