Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#4408 closed defect (fixed)

Atlas crashes on OS X Sierra (10.12)

Reported by: wraitii Owned by: wraitii
Priority: Release Blocker Milestone: Alpha 22
Component: Core engine Keywords:
Cc: Patch:

Description (last modified by elexis)

We've had several reports that Atlas crashes on OS X Sierra aka 10.12 (but not on earlier versions, as it is working for me).

After some debugging with trompetin, it appears to be caused by polling events in two different threads. Specifically, wxWidgets polls events in the original thread (in wxTheApp->OnRun in DLLInterface.cpp) and SDL polls events in RunEngine(). This apparently is not liked by macOS Sierra.

I literally cannot figure out why we poll SDL events in RunEngine, as far as I can tell removing those 4 lines (220-222 of GameLoop.cpp) changes nothing. So that's what I'd suggest as a fix? Could people on other platform check this? And also if anybody knows what those two lines are doing, I'd be happy to know.

Attachments (2)

dont_poll_SDL_events_v1.patch (576 bytes ) - added by elexis 7 years ago.
fixAtlas.patch (1.9 KB ) - added by wraitii 7 years ago.

Download all attachments as: .zip

Change History (10)

comment:1 by elexis, 7 years ago

Component: UI & SimulationCore engine
Keywords: rfc removed

Reported by nigel87 on 2016-11-23, shortly before the a21 release. At least on 2016-11-11 he reported to use OSX 10.12.1.

2016-11-23 15:40:22.831 pyrogenesis[254:6837963] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'nextEventMatchingMask should only be called from the Main Thread!'
*** First throw call stack:
(
	0   CoreFoundation                      0x00007fff8a73448b __exceptionPreprocess + 171
	1   libobjc.A.dylib                     0x00007fff9ee96cad objc_exception_throw + 48
	2   AppKit                              0x00007fff88a4750c -[NSApplication(NSEvent) shouldBeTreatedAsInkEvent:] + 0
	3   pyrogenesis                         0x000000011053c2b5 Cocoa_PumpEvents + 229
	4   pyrogenesis                         0x000000011048047d SDL_PumpEvents_REAL + 45
	5   pyrogenesis                         0x0000000110480505 SDL_WaitEventTimeout_REAL + 53
	6   pyrogenesis                         0x00000001104804c7 SDL_PollEvent_REAL + 23
	7   pyrogenesis                         0x000000011046bda4 SDL_PollEvent + 36
	8   pyrogenesis                         0x000000010faee538 _ZL9RunEnginePv + 1608
	9   libsystem_pthread.dylib             0x00007fff9f98caab _pthread_body + 180
	10  libsystem_pthread.dylib             0x00007fff9f98c9f7 _pthread_body + 0
	11  libsystem_pthread.dylib             0x00007fff9f98c221 thread_start + 13
)
libc++abi.dylib: terminating with uncaught exception of type NSException
Abort trap: 6

(We assumed it was an issue with his OS, as he didn't get svn to compile before upgrading from 10.11)

rfc only for patches.

by elexis, 7 years ago

comment:2 by elexis, 7 years ago

Description: modified (diff)
Keywords: rfc added

Patch by wraitii (filename was missing from the ticket description)

by wraitii, 7 years ago

Attachment: fixAtlas.patch added

comment:3 by wraitii, 7 years ago

As understood with Philip, this should fix the issue while keeping full functionality. Will test when I upgrade to 10.12

comment:4 by wraitii, 7 years ago

Upgraded to Sierra and tested. I can confirm the original bug and I can confirm the above patch fixes the crash. If other platforms are OK with the patch, I'd like to commit it ASAP.

comment:5 by Stan, 7 years ago

Windows is okay on my end. Debug & Release Builds. Windows 10 64Bits

comment:6 by wraitii, 7 years ago

Owner: set to wraitii
Resolution: fixed
Status: newclosed

In 19160:

Fix Atlas under macOS Sierra.
We were polling SDL messages from two different threads (by mistake), and Sierra now refuses to do that.
Tested by Stan, Itms, Fatherbushido, and discussed with Philip for the code change itself.
Fixes #4408.
Differential Revision: https://code.wildfiregames.com/D42

comment:7 by elexis, 7 years ago

Keywords: rfc removed

comment:8 by elexis, 7 years ago

The relevant parts of the discussion refered in the commit message from 2016-12-14:

16:41 < wraitii> Philip: any idea why you polled events in the atlas main loop?
19:54 < Philip> wraitii: I don't really understand the question, but I think one reason for weird event handling is that OS X (maybe Cocoa specifically) only sends messages to the main thread, and SDL sets itself up as the main thread (as a wrapper around the game's main()), but wxWidgets needs to receives UI messages
19:55 < wraitii> Philip: the question is "why do you call sdl_pollEvent" in "RunEngine" in atlas
19:55 < wraitii> since that's not in the main thread
19:55 < Philip> so we ended up doing some thread shuffling thing, where the game's main thread calls into Atlas and then runs the wxWidgets event loop in that thread, while spawning a second thread to continue running the game engine on
19:57 < wraitii> to be extremely specific: why this: http://trac.wildfiregames.com/browser/ps/trunk/source/tools/atlas/GameInterface/GameLoop.cpp?annotate=blame#L221
19:57 < wraitii> (nb: original code is from 2010 so maybe it's just a relic?)
19:58 < Philip> The comment there says it's for "e.g. hotkeys" - if I remember correctly, Atlas receives key events from wx and forwards them to the game thread, and the game thread might then synthesise some hotkey events, so it needs to be told to process those events too
19:58 < wraitii> it breaks on OSX Sierra because Sierra doesn't like polling events from 2 threads, so I want to remove it altogether (since it seems useless)
19:59 < wraitii> then I'll check for synthetizing events and see if it can be scrapped or not
19:59 < Philip> It should only need to poll the internal event queue, not the SDL event queue (because nothing should be sending SDL events), I think
19:59 < wraitii> that code is specifically callin SDL_PollEvent
20:00 < wraitii> there's also some code above for msgPasser
20:00 < wraitii> not sure what you're regerring to with "internal event queue"
20:00 < Philip> I mean the priority_events in lib/input.cpp
20:00 < Philip> which is where hotkey events etc are pushed
20:01 < wraitii> I'll check
20:01 < Philip> so it should (I guess) still poll and dispatch from that queue, but not call SDL_PollEvent
20:01 < Philip> (like add a in_poll_priority_event or something)
20:03 < Philip> That stuff was added in r7602 back when hotkey events were pushed onto the SDL event queue, not onto a priority_queue
20:03 < WildfireBot> r7602. Author: philip. Commit message: Make session GUI visible and usable when playtesting from Atlas. Fix Atlas icon on non-Windows. Remove some nonexistent-hotkey config.  http://trac.wildfiregames.com/changeset/7602
20:03 < Philip> (which meant hotkeys got processed in the wrong order)
20:04 < Philip> which is why it called SDL_PollEvent originally
20:04 < Philip> but as far as I can see, it wasn't expected to process any other type of SDL events
20:05 < Philip> and now we probably(?) don't push anything onto the SDL queue any more, so there's no need to use SDL_PollEvent
20:05 < Philip> s/priority_queue/priority_events/
20:08 < Philip> (grep for SDL_PushEvent if you want to check that)
20:17 < Philip> I think the only other person who looked at the event stuff was maybe historicbruno? when fixing the OS X only-main-thread-gets-events thing
20:17 < Philip> (Atlas event stuff, that is)
20:17 < Philip> since that was a blocker for getting Atlas to work on OS X at all
Note: See TracTickets for help on using tickets.