Opened 3 years ago
Closed 3 years ago
#6225 closed defect (fixed)
Segfault in the single player match setup screen
Reported by: | tuxayo | Owned by: | wraitii |
---|---|---|---|
Priority: | Release Blocker | Milestone: | Alpha 25 |
Component: | Core engine | Keywords: | |
Cc: | Patch: | Phab:D4183 |
Description (last modified by )
Hi, I will try to reproduce. Here is the stack trace. I have a 450MiB coredump if that can help.
It doesn't happen every time; I was able to load 2 games.
Which additional info should I provide?
Version: master 84949d86264c6ab19b849b7390a04d4a3e952708
PID: 319257 (main)
UID: 1000 (victor)
GID: 100 (users)
Signal: 11 (SEGV)
Timestamp: Tue 2021-06-15 00:56:48 CEST (3min 0s ago)
Command Line: /usr/bin/pyrogenesis
Executable: /usr/bin/pyrogenesis
Control Group: /user.slice/user-1000.slice/session-1.scope
Unit: session-1.scope
Slice: user-1000.slice
Session: 1
Owner UID: 1000 (victor)
Boot ID: 451fef7ad4684955958fdd3b07189582
Machine ID: 2bcabf87c4874e7994dda0a192e1be2f
Hostname: some-laptop
Storage: /var/lib/systemd/coredump/core.main.1000.451fef7ad4684955958fdd3b07189582.319257.1623711408000000.zst (present)
Disk Size: 413.1M
Message: Process 319257 (main) of user 1000 dumped core.

Stack trace of thread 319257:
#0 0x00007fd3bdf49834 mozalloc_abort (libmozjs78-ps-release.so + 0x9c0834)
#1 0x00007fd3bd6750ce abort (libmozjs78-ps-release.so + 0xec0ce)
#2 0x000055cdb0f60db9 n/a (pyrogenesis + 0xb8db9)
#3 0x000055cdb13f235f n/a (pyrogenesis + 0x54a35f)
#4 0x000055cdb13e0045 n/a (pyrogenesis + 0x538045)
#5 0x000055cdb1441078 n/a (pyrogenesis + 0x599078)
#6 0x000055cdb13d1819 n/a (pyrogenesis + 0x529819)
#7 0x000055cdb13d1f4f n/a (pyrogenesis + 0x529f4f)
#8 0x000055cdb14a680b n/a (pyrogenesis + 0x5fe80b)
#9 0x000055cdb0f5ef48 n/a (pyrogenesis + 0xb6f48)
#10 0x000055cdb0f4b500 n/a (pyrogenesis + 0xa3500)
#11 0x00007fd3bc3f4b25 __libc_start_main (libc.so.6 + 0x27b25)
#12 0x000055cdb0f5b53e n/a (pyrogenesis + 0xb353e)

Stack trace of thread 319284:
#0 0x00007fd3bc5b08ca __futex_abstimed_wait_common64 (libpthread.so.0 + 0x158ca)
#1 0x00007fd3bc5aa270 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf270)
#2 0x00007fd3bcae7108 _ZN2nv5Event4waitEv (libnvtt.so + 0x35108)
#3 0x00007fd3bcae69d8 _ZN2nv10ThreadPool10workerFuncEPv (libnvtt.so + 0x349d8)
#4 0x00007fd3bcae7292 threadFunc (libnvtt.so + 0x35292)
#5 0x00007fd3bc5a4259 start_thread (libpthread.so.0 + 0x9259)
#6 0x00007fd3bc4cb5e3 __clone (libc.so.6 + 0xfe5e3)

Stack trace of thread 319283:
#0 0x00007fd3bc5b08ca __futex_abstimed_wait_common64 (libpthread.so.0 + 0x158ca)
#1 0x00007fd3bc5aa270 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf270)
#2 0x00007fd3bcae7108 _ZN2nv5Event4waitEv (libnvtt.so + 0x35108)
#3 0x00007fd3bcae69d8 _ZN2nv10ThreadPool10workerFuncEPv (libnvtt.so + 0x349d8)
#4 0x00007fd3bcae7292 threadFunc (libnvtt.so + 0x35292)
#5 0x00007fd3bc5a4259 start_thread (libpthread.so.0 + 0x9259)
#6 0x00007fd3bc4cb5e3 __clone (libc.so.6 + 0xfe5e3)

Stack trace of thread 319259:
#0 0x00007fd3bc5b08ca __futex_abstimed_wait_common64 (libpthread.so.0 + 0x158ca)
#1 0x00007fd3bc5aa270 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf270)
#2 0x00007fd3bc7e7f01 __gthread_cond_wait (libstdc++.so.6 + 0xccf01)
#3 0x000055cdb11e552c n/a (pyrogenesis + 0x33d52c)
#4 0x00007fd3bc7ee3c4 execute_native_thread_routine (libstdc++.so.6 + 0xd33c4)
#5 0x00007fd3bc5a4259 start_thread (libpthread.so.0 + 0x9259)
#6 0x00007fd3bc4cb5e3 __clone (libc.so.6 + 0xfe5e3)

Stack trace of thread 319266:
#0 0x00007fd3bc4c3201 __select (libc.so.6 + 0xf6201)
#1 0x000055cdb14b8d84 n/a (pyrogenesis + 0x610d84)
#2 0x00007fd3bc5a4259 start_thread (libpthread.so.0 + 0x9259)
#3 0x00007fd3bc4cb5e3 __clone (libc.so.6 + 0xfe5e3)

Stack trace of thread 319261:
#0 0x00007fd3bc5b08ca __futex_abstimed_wait_common64 (libpthread.so.0 + 0x158ca)
#1 0x00007fd3bc5aa270 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf270)
#2 0x00007fd3bc7e7f01 __gthread_cond_wait (libstdc++.so.6 + 0xccf01)
#3 0x000055cdb11e552c n/a (pyrogenesis + 0x33d52c)
#4 0x00007fd3bc7ee3c4 execute_native_thread_routine (libstdc++.so.6 + 0xd33c4)
#5 0x00007fd3bc5a4259 start_thread (libpthread.so.0 + 0x9259)
#6 0x00007fd3bc4cb5e3 __clone (libc.so.6 + 0xfe5e3)

Stack trace of thread 319269:
#0 0x00007fd3bc4c0b2f __poll (libc.so.6 + 0xf3b2f)
#1 0x00007fd3bcb83407 n/a (libopenal.so.1 + 0x80407)
#2 0x00007fd3b0e069a9 pa_mainloop_poll (libpulse.so.0 + 0x1c9a9)
#3 0x00007fd3b0e11281 pa_mainloop_iterate (libpulse.so.0 + 0x27281)
#4 0x00007fd3b0e11331 pa_mainloop_run (libpulse.so.0 + 0x27331)
#5 0x00007fd3bcb8496e n/a (libopenal.so.1 + 0x8196e)
#6 0x00007fd3bc7ee3c4 execute_native_thread_routine (libstdc++.so.6 + 0xd33c4)
#7 0x00007fd3bc5a4259 start_thread (libpthread.so.0 + 0x9259)
#8 0x00007fd3bc4cb5e3 __clone (libc.so.6 + 0xfe5e3)

Stack trace of thread 319272:
#0 0x00007fd3bc493a95 clock_nanosleep@@GLIBC_2.17 (libc.so.6 + 0xc6a95)
#1 0x00007fd3bc498c77 __nanosleep (libc.so.6 + 0xcbc77)
#2 0x00007fd3be38f8ac n/a (libSDL2-2.0.so.0 + 0x1228ac)
#3 0x000055cdb1226b90 n/a (pyrogenesis + 0x37eb90)
#4 0x00007fd3bc7ee3c4 execute_native_thread_routine (libstdc++.so.6 + 0xd33c4)
#5 0x00007fd3bc5a4259 start_thread (libpthread.
Attachments (2)
Change History (32)
comment:2 by , 3 years ago
Description: | modified (diff) |
---|
comment:3 by , 3 years ago
Do you play with the A24 version or the SVN version? You could upload your log files - see /wiki/GameDataPaths.
comment:4 by , 3 years ago
The dev version. Exactly there: https://github.com/0ad/0ad/commit/84949d86264c6ab19b849b7390a04d4a3e952708
I'll check out the log files.
follow-up: 6 comment:5 by , 3 years ago
Any way you could add the debug symbols for pyrogenesis in the stack?
comment:6 by , 3 years ago
Replying to stanislas69:
Any way you could add the debug symbols for pyrogenesis in the stack?
What I've found so far is this:
https://trac.wildfiregames.com/wiki/Debugging#Debugsymbols https://trac.wildfiregames.com/wiki/BuildInstructions#Building
The Release mode builds (which are the default) are more optimised, but are harder to debug. Use make config=debug (and run pyrogenesis_dbg) if you need better debugging support. See Debugging for more details.
My build & packaging script is this one: https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=0ad-git#n43
Should make config=debug be run before line 42?
comment:7 by , 3 years ago
You can actually try config=debug at L46. But I think it's likely you're already compiling with debug symbols on and they're just not appearing. Can you try running the game under gdb / lldb?
comment:8 by , 3 years ago
Milestone: | Backlog → Alpha 25 |
---|---|
Priority: | Should Have → Release Blocker |
Bumping to RB until we have more info
comment:9 by , 3 years ago
Summary: | Segfault in the single payer match setup screen → Segfault in the single player match setup screen |
---|
comment:10 by , 3 years ago
interestinglog.html is empty. Are the other files relevant?
oos_dump.dat
oos_dump.txt
system_info.txt
userreport_hwdetect.txt
by , 3 years ago
Attachment: | gdb-output.txt added |
---|
Managed to reproduce, but it seems there are no symbols. Recompilation in progress. gdb vs lldb: is there one whose output helps more to find the issue?
comment:11 by , 3 years ago
BTW, I don't know how to use these debuggers. I just do gdb pyrogenesis and then run, and copy the output here.
follow-up: 13 comment:12 by , 3 years ago
(No debugging symbols found in pyrogenesis)
Shoots!
Here is the build part of the packaging script that I use. I added make config=debug in two places. Does it make sense?
build() {
  cd "$srcdir/${_pkgname}/build/workspaces"

  unset CPPFLAGS # for le spidermonkey
  export SDL2_CONFIG="pkg-config sdl2"

  ./update-workspaces.sh \
      --bindir=/usr/bin \
      --libdir=/usr/lib/0ad \
      --datadir=/usr/share/${pkgname}/data

  cd "$srcdir/${_pkgname}/libraries/source/fcollada/src"
  # CUSTOM ↓
  make config=debug
  make -j9

  cd "$srcdir/${_pkgname}/build/workspaces/gcc"
  # CUSTOM ↓
  make config=debug
  make -j9
}
follow-up: 14 comment:13 by , 3 years ago
managed to reproduce
Can you explain how?
Replying to tuxayo:
BTW, I don't know how to use these debuggers. I just do
gdb pyrogenesis
and then run
And copy the output here.
After coming back to the GDB "terminal", you can type "bt" (from backtrace) to show more info.
Replying to tuxayo:
(No debugging symbols found in pyrogenesis)
Shoots!
Here is the build part of the packaging script that I use. I added
make config=debug
in two places. Does it make sense?
I think you are doing a debug make first and then a normal make. Can you try make config=debug -j9 instead? I also think you don't need to make FCollada separately.
comment:14 by , 3 years ago
Replying to Freagarach:
managed to reproduce
Can you explain how?
Messing with the buttons, game types, maps, game options, civs, and the reset buttons for civs and teams. I've only managed it 3 times and still can't find a reliable test plan. I'll keep trying.
After coming back to the GDB "terminal", you can type "bt" (from backtrace) to show more info.
Thanks :)
I think you are doing a debug make first and then a normal make. Can you try:
make config=debug -j9
?
Ok, recompiling.
I also think you don't need to make FCollada separately?
I don't know what the package maintainer had in mind.
So make in build/workspaces/gcc will also take care of FCollada?
comment:15 by , 3 years ago
I now have pyrogenesis_dbg instead of pyrogenesis
Reading symbols from pyrogenesis_dbg... (No debugging symbols found in pyrogenesis_dbg)
noooooo
comment:17 by , 3 years ago
OK, I got a debug build with symbols, thanks for the help :D Back to reproducing.
Note: strip was enabled in my global package-building config.
comment:18 by , 3 years ago
And here is a backtrace! :D
TIMER| reference/common/setup.xml: 403.177 us
/usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = float; _Alloc = std::allocator<float>; std::vector<_Tp, _Alloc>::reference = float&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Redirecting call to abort() to mozalloc_abort

Hit MOZ_CRASH() at /home/victor/.cache/pikaur/build/0ad-git/src/0ad/libraries/source/spidermonkey/mozjs-78.6.0/memory/mozalloc/mozalloc_abort.cpp:33

Thread 1 "main" received signal SIGSEGV, Segmentation fault.
mozalloc_abort (msg=<optimized out>) at /home/victor/.cache/pikaur/build/0ad-git/src/0ad/libraries/source/spidermonkey/mozjs-78.6.0/memory/mozalloc/mozalloc_abort.cpp:33
33 MOZ_CRASH();
(gdb) bt
#0 mozalloc_abort (msg=<optimized out>) at /home/victor/.cache/pikaur/build/0ad-git/src/0ad/libraries/source/spidermonkey/mozjs-78.6.0/memory/mozalloc/mozalloc_abort.cpp:33
#1 0x00007ffff7861d84 in abort () at /home/victor/.cache/pikaur/build/0ad-git/src/0ad/libraries/source/spidermonkey/mozjs-78.6.0/memory/mozalloc/mozalloc_abort.cpp:82
#2 0x0000555555606239 in std::__replacement_assert (__file=__file@entry=0x555555b79278 "/usr/include/c++/11.1.0/bits/stl_vector.h", __line=__line@entry=1045, __function=__function@entry=0x555555bbb7a0 "std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = float; _Alloc = std::allocator<float>; std::vector<_Tp, _Alloc>::reference = f"..., __condition=__condition@entry=0x555555b7ba0b "__n < this->size()") at /usr/include/c++/11.1.0/x86_64-pc-linux-gnu/bits/c++config.h:504
#3 0x0000555555a3b09a in std::vector<float, std::allocator<float> >::operator[] (this=0x55555b2bd798, __n=<optimized out>) at /usr/include/c++/11.1.0/bits/stl_vector.h:1045
#4 std::vector<float, std::allocator<float> >::operator[] (__n=<optimized out>, this=0x55555b2bd798) at /usr/include/c++/11.1.0/bits/stl_vector.h:1043
#5 CDropDown::HandleMessage (this=0x55555b2bd2f0, Message=...) at ../../../source/gui/ObjectTypes/CDropDown.cpp:172
#6 0x0000555555a2a7c5 in IGUIObject::SendMouseEvent (this=0x55555b2bd2f0, type=type@entry=GUIM_MOUSE_PRESS_LEFT, eventName=...) at ../../../source/gui/ObjectBases/IGUIObject.cpp:364
#7 0x0000555555a865ba in CGUI::HandleEvent (this=<optimized out>, ev=ev@entry=0x7fffffffdaf0) at ../../../source/gui/CGUI.cpp:209
#8 0x0000555555a1dea8 in CGUIManager::HandleEvent (this=0x555557d75ef0, ev=0x7fffffffdaf0) at /usr/include/c++/11.1.0/bits/shared_ptr_base.h:1290
#9 0x0000555555a1e18f in gui_handler (ev=0x7fffffffdaf0) at ../../../source/gui/GUIManager.cpp:55
#10 0x0000555555acd38b in in_dispatch_event (ev=ev@entry=0x7fffffffdaf0) at ../../../source/lib/input.cpp:63
#11 0x00005555556046b1 in PumpEvents () at ../../../source/main.cpp:259
#12 Frame () at ../../../source/main.cpp:399
#13 RunGameOrAtlas (argc=<optimized out>, argv=<optimized out>) at ../../../source/main.cpp:691
#14 0x00005555555f1050 in main (argc=1, argv=0x7fffffffdd98) at ../../../source/main.cpp:743
(gdb)
comment:19 by , 3 years ago
Got another crash. The backtrace is the same except that some addresses change, e.g. 0x55555b2bd798 => 0x55557139b2c8
Not useful, right?
The least terrible test plan I have for now is
- load a game against the AI
- kill all your units and buildings
- continue watching the game at 20x speed for at least 30 seconds
- go back to menu
- prepare another game
- click randomly on player names, civs, team and the above reset buttons
- be ready to do that for up to 5 min!!!
- crash
I hope most of this is totally useless, but I still haven't managed to reproduce without it. In total I reproduced 6 times, with a total accumulated time of around one hour T_T
comment:21 by , 3 years ago
I'm sure not. It's total chaos in the system when that happens. It can't go unnoticed. Let alone 6 times.
It happened during one of my builds though...
It's a nightmare to reproduce. I've likely totaled more than 2000 clicks since the last crash without reproducing it again T_T
What luck I had the first time.
comment:22 by , 3 years ago
Maybe you should try in debug mode then. If we're doing something stupid it might notice, since it has more checks. Do note however that it will be extremely slow. But if it's a GUI crash we might be able to pinpoint it.
comment:23 by , 3 years ago
What do you mean by debug mode? I'm compiling with make config=debug and my executable changed to pyrogenesis_dbg.
(and also with debug symbols)
follow-up: 26 comment:24 by , 3 years ago
My bad, didn't read. I didn't manage to reproduce on Windows, but it might be a Linux-only crash.
Maybe related, https://trac.wildfiregames.com/ticket/5598
comment:25 by , 3 years ago
I managed to reproduce a few more times by just starting the game, solo => match, using an autoclicker (xautoclick, because it was starting to hurt!) and messing with the player, civ and team dropdowns and the reset buttons above them. It still took more than several minutes each time.
comment:26 by , 3 years ago
Replying to stanislas69:
Maybe related, https://trac.wildfiregames.com/ticket/5598
You might have magically changed the situation ^o^. Now I can immediately crash the game by clicking the color dropdown,
whereas it was one of the things that I flooded during my trials. Now it works immediately... It's kind of worrying that it can change like that.
Difference with #5598 is that in this case it's a SIGSEGV.
Also, I clearly remember that in all the successful reproductions where I remembered where the last click was, it wasn't on the color dropdown.
I recall that the color dropdown almost never worked. (I didn't know it was a color dropdown :o) But it didn't crash the game whether it worked or not.
Anyway, I can now reliably (maybe the situation will change...) reproduce a crash with the same backtrace so I guess for now, no need to fight with the UI.
In the last attempts I literally threw 30000 clicks trying to find a single place that is enough to crash the game. Like only flooding civs, or players or teams.
comment:27 by , 3 years ago
The offending line is the last one in this particular block:
m_ElementHighlight = m_Selected;
// Start at the position of the selected item, if possible.
GetScrollBar(0).SetPos(m_ItemsYPositions.empty() ? 0 : m_ItemsYPositions[m_ElementHighlight] - 60);
Which would mean that m_Selected == m_ElementHighlight is not a valid index into m_ItemsYPositions, even though the latter isn't empty. The only options I see are that either m_ItemsYPositions is not the correct size, or that m_Selected holds an incorrect value, such as -1.
By checking, I can easily verify that the color dropdown's selection is indeed initially -1; ergo the memory access is out of bounds, which is undefined behaviour and could crash.
Funnily, a similar crash was fixed in Phab:rP13936.
I'll upload a patch shortly for the C++ bug and the JS issue that caused the color dropdown to have the wrong preselected color.
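To illustrate the out-of-range lookup described in this comment, here is a minimal standalone sketch. It is not the patch from Phab:D4183: IsValidSelection and ScrollStartFor are hypothetical helpers invented for this example, and only the -1 "nothing selected" value and the 60-unit offset are taken from the thread. The idea is simply to check the signed index before it is converted to the vector's unsigned size_type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the 0 A.D. source): true when a signed
// selection index can safely be used to index `positions`.
bool IsValidSelection(int selected, const std::vector<float>& positions)
{
    // Check the sign first: converting -1 to the unsigned size_type wraps
    // to a huge value, which is what fails the `__n < this->size()` check.
    return selected >= 0 &&
           static_cast<std::size_t>(selected) < positions.size();
}

// Hypothetical guarded variant of the scroll-position lookup. The 60-unit
// offset mirrors the snippet quoted above; the fallback of 0 for an
// invalid selection is an assumption made for this illustration.
float ScrollStartFor(int selected, const std::vector<float>& positions)
{
    if (!IsValidSelection(selected, positions))
        return 0.f; // Nothing selected, or a stale index: start at the top.
    return positions[static_cast<std::size_t>(selected)] - 60.f;
}
```

With this shape, a selection of -1 on a non-empty list falls back to the top of the list instead of indexing the vector, which also covers the empty-list case the original ternary was guarding against.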
comment:28 by , 3 years ago
Patch: | → Phab:D4183 |
---|
comment:29 by , 3 years ago
Owner: | set to |
---|
Another one when starting a game. 22MiB coredump this time.
stdout:
stacktrace