Opened 10 years ago

Closed 10 years ago

#2426 closed defect (fixed)

[PATCH] Resolve always failing TestNetComms::test_basic_DISABLED()

Reported by: Echelon9
Owned by: leper
Priority: Nice to Have
Milestone: Alpha 17
Component: Core engine
Keywords: patch
Cc:
Patch:

Description (last modified by Echelon9)

Although this test must be enabled manually (it is not in the default set), it currently always fails with a segmentation fault. This has been seen consistently on Windows and on Mac OS X 10.9 Mavericks.

What happens:

$ ./test -test TestNetComms
Running 2 tests<!DOCTYPE html>
<meta charset="utf-8">
<title>Pyrogenesis Log</title>
<style>body { background: #eee; color: black; font-family: sans-serif; } p { background: white; margin: 3px 0 3px 0; } .error { color: red; } .warning { color: blue; }</style>
<h2>0 A.D. Main log</h2>
Segmentation fault: 11

Crash report:

Process:         test [5392]
Path:            /Users/USER/Documents/*/test
Identifier:      test
Version:         0
Code Type:       X86-64 (Native)
Parent Process:  bash [24440]
Responsible:     Terminal [272]
User ID:         501

Date/Time:       2014-02-06 23:07:42.150 +1100
OS Version:      Mac OS X 10.9.1 (13B42)
Report Version:  11
Anonymous UUID:  129AEE89-1F43-FCAF-CF2C-B538E4E1C5EE

Sleep/Wake UUID: 48D3CBD2-E547-4BF5-B4EA-C353A614D69D

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000000

VM Regions Near 0:
--> 
    __TEXT                 00000001030d2000-0000000103d22000 [ 12.3M] r-x/rwx SM=COW  /Users/USER/Documents/*

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   test                          	0x000000010328ec5d ScriptInterface_impl::ScriptInterface_impl(char const*, boost::shared_ptr<ScriptRuntime> const&) + 93 (ScriptInterface.cpp:482)
1   test                          	0x000000010328ff86 ScriptInterface::ScriptInterface(char const*, char const*, boost::shared_ptr<ScriptRuntime> const&) + 70 (memory:1901)
2   test                          	0x000000010328ff29 ScriptInterface::ScriptInterface(char const*, char const*, boost::shared_ptr<ScriptRuntime> const&) + 25 (ScriptInterface.cpp:587)
3   test                          	0x0000000103270f74 CComponentManager::CComponentManager(CSimContext&, boost::shared_ptr<ScriptRuntime>, bool) + 68 (ComponentManager.cpp:59)
4   test                          	0x0000000103270f0c CComponentManager::CComponentManager(CSimContext&, boost::shared_ptr<ScriptRuntime>, bool) + 28 (ComponentManager.cpp:106)
5   test                          	0x00000001031e7074 CSimulation2Impl::CSimulation2Impl(CUnitManager*, boost::shared_ptr<ScriptRuntime>, CTerrain*) + 100 (shared_count.hpp:305)
6   test                          	0x00000001031e338a CSimulation2::CSimulation2(CUnitManager*, boost::shared_ptr<ScriptRuntime>, CTerrain*) + 106 (shared_count.hpp:305)
7   test                          	0x00000001031e3309 CSimulation2::CSimulation2(CUnitManager*, boost::shared_ptr<ScriptRuntime>, CTerrain*) + 25 (Simulation2.cpp:571)
8   test                          	0x00000001032bb4ac CGame::CGame(bool) + 140 (shared_count.hpp:305)
9   test                          	0x00000001032bb3fd CGame::CGame(bool) + 29 (Game.cpp:86)
10  test                          	0x0000000103131427 TestNetComms::test_basic_DISABLED() + 199 (test_Net.h:145)
11  test                          	0x0000000103131340 TestDescription_TestNetComms_test_basic_DISABLED::runTest() + 32 (test_Net.cpp:24)
12  test                          	0x00000001030d60d9 CxxTest::RealTestDescription::run() + 41 (RealDescriptions.cpp:96)
13  test                          	0x00000001030dbb88 CxxTest::TestRunner::runSuite(CxxTest::SuiteDescription&) + 856 (TestRunner.h:77)
14  test                          	0x00000001030db666 CxxTest::PsTestRunner::runWorld() + 1046 (PsTestWrapper.h:78)
15  test                          	0x00000001030daf60 CxxTest::PsTestRunner::runAllTests(CxxTest::TestListener&) + 384 (TestTracker.cpp:24)
16  test                          	0x00000001030da692 CxxTest::GuiTuiRunner<CxxTest::PsTestWrapper, CxxTest::ErrorPrinter>::run() + 82 (PsTestWrapper.h:96)
17  test                          	0x00000001030d3dc8 main + 200 (test_root.cpp:17)
18  libdyld.dylib                 	0x00007fff97ef25fd start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x00007fa4b8d280d0  rcx: 0x00007fa4b8d28100  rdx: 0x00007fff5cb2cd80
  rdi: 0x00007fa4b8d280d0  rsi: 0x0000000103c89b8d  rbp: 0x00007fff5cb2c710  rsp: 0x00007fff5cb2c6e0
   r8: 0x0000000000000003   r9: 0x00007fa4b8d00000  r10: 0x000000000d6fff4c  r11: 0x00000000bc127ecb
  r12: 0x00007fff7c4c6400  r13: 0x00007fa4b8d27ef8  r14: 0x00007fa4b8d280f8  r15: 0x0000000103c89b8d
  rip: 0x000000010328ec5d  rfl: 0x0000000000010246  cr2: 0x0000000000000000
  
Logical CPU:     0
Error Code:      0x00000004
Trap Number:     14

Attachments (2)

fix-trac-2426.patch (1.7 KB) - added by Echelon9 10 years ago.
Resolution patch for segmentation fault
fix-trac-2426_v2.patch (1.7 KB) - added by Echelon9 10 years ago.
Revised patch (map file names have subsequently changed)


Change History (13)

comment:1 by historic_bruno, 10 years ago

It fails the same way on Windows.

comment:2 by historic_bruno, 10 years ago

Milestone: Alpha 16 → Backlog

comment:3 by Echelon9, 10 years ago

Description: modified (diff)

comment:4 by Echelon9, 10 years ago

Debugging with LLDB reports that the crash occurs in ScriptInterface_impl::ScriptInterface_impl(), when a NULL JSRuntime object is passed.

$ lldb -- test -test TestNetComms
Current executable set to 'test' (x86_64).
(lldb) r
Process 90140 launched: '/Users/User/Documents/Coding/0ad/trunk/binaries/system/test' (x86_64)
Running 2 tests<!DOCTYPE html>
<meta charset="utf-8">
<title>Pyrogenesis Log</title>
<style>body { background: #eee; color: black; font-family: sans-serif; } p { background: white; margin: 3px 0 3px 0; } .error { color: red; } .warning { color: blue; }</style>
<h2>0 A.D. Main log</h2>
Process 90140 stopped
* thread #1: tid = 0x4037c, 0x00000001001bcded test`ScriptInterface_impl::ScriptInterface_impl(this=0x00000001013180c0, nativeScopeName=0x0000000100bb9b3d, runtime=0x00007fff5fbfed40) + 93 at ScriptInterface.cpp:484, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001001bcded test`ScriptInterface_impl::ScriptInterface_impl(this=0x00000001013180c0, nativeScopeName=0x0000000100bb9b3d, runtime=0x00007fff5fbfed40) + 93 at ScriptInterface.cpp:484
   481 	    
   482 	    // ENSURE(m_runtime);
   483 	
-> 484 		m_cx = JS_NewContext(m_runtime->m_rt, STACK_CHUNK_SIZE);
   485 		ENSURE(m_cx);
   486 	
   487 		// For GC debugging:
(lldb) p runtime
(const boost::shared_ptr<ScriptRuntime>) $0 = {
  px = 0x0000000000000000
  pn = {
    pi_ = 0x0000000000000000
  }
}

However, something odd appears to be happening in the constructors or initialisation routines, as at that point the global g_ScriptRuntime pointer is also NULL.

(lldb) p g_ScriptRuntime
(boost::shared_ptr<ScriptRuntime>) $2 = {
  px = 0x0000000000000000
  pn = {
    pi_ = 0x0000000000000000
  }
}
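
For illustration only, the failure mode can be sketched with stand-in types. This is not the engine code: FakeRuntime, FakeInterfaceImpl and RejectsRuntime are hypothetical names, and std::shared_ptr is used in place of boost::shared_ptr. The sketch assumes a constructor that checks the pointer before dereferencing it, in the spirit of the commented-out ENSURE(m_runtime) visible in the listing above, so that a missing runtime is reported as an error rather than a segmentation fault.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for ScriptRuntime; not the real 0 A.D. class.
struct FakeRuntime
{
    int m_rt = 42; // stands in for the JSRuntime handle
};

// Hypothetical stand-in for ScriptInterface_impl.
class FakeInterfaceImpl
{
public:
    explicit FakeInterfaceImpl(const std::shared_ptr<FakeRuntime>& runtime)
        : m_runtime(runtime)
    {
        // Check before dereferencing; a null pointer here is what
        // produced EXC_BAD_ACCESS at address 0x0 in the session above.
        if (!m_runtime)
            throw std::invalid_argument("runtime must not be null");
        m_cx = m_runtime->m_rt; // safe: m_runtime is known to be non-null
    }

    int Context() const
    {
        assert(m_runtime); // invariant established by the constructor
        return m_cx;
    }

private:
    std::shared_ptr<FakeRuntime> m_runtime;
    int m_cx = 0;
};

// Returns true if construction with the given runtime is rejected.
inline bool RejectsRuntime(const std::shared_ptr<FakeRuntime>& runtime)
{
    try
    {
        FakeInterfaceImpl impl(runtime);
        (void)impl; // constructed only to exercise the check
        return false;
    }
    catch (const std::invalid_argument&)
    {
        return true;
    }
}
```

With these stand-ins, RejectsRuntime(std::shared_ptr<FakeRuntime>()) reports the null pointer, while a runtime created with std::make_shared<FakeRuntime>() is accepted. Whether the real constructor should throw, assert, or log is a design decision this sketch does not settle.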

Backtrace below.

(lldb) bt -c 20
* thread #1: tid = 0x4037c, 0x00000001001bcded test`ScriptInterface_impl::ScriptInterface_impl(this=0x00000001013180c0, nativeScopeName=0x0000000100bb9b3d, runtime=0x00007fff5fbfed40) + 93 at ScriptInterface.cpp:484, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001001bcded test`ScriptInterface_impl::ScriptInterface_impl(this=0x00000001013180c0, nativeScopeName=0x0000000100bb9b3d, runtime=0x00007fff5fbfed40) + 93 at ScriptInterface.cpp:484
    frame #1: 0x00000001001be116 test`ScriptInterface::ScriptInterface(this=0x0000000101318948, nativeScopeName=<unavailable>, debugName=0x0000000100bcceb2, runtime=<unavailable>) + 70 at ScriptInterface.cpp:569
    frame #2: 0x00000001001be0b9 test`ScriptInterface::ScriptInterface(this=<unavailable>, nativeScopeName=<unavailable>, debugName=<unavailable>, runtime=<unavailable>) + 25 at ScriptInterface.cpp:591
    frame #3: 0x000000010019f104 test`CComponentManager::CComponentManager(this=0x0000000101318948, context=0x0000000101318920, rt=<unavailable>, skipScriptFunctions=false) + 68 at ComponentManager.cpp:59
    frame #4: 0x000000010019f09c test`CComponentManager::CComponentManager(this=<unavailable>, context=<unavailable>, rt=<unavailable>, skipScriptFunctions=<unavailable>) + 28 at ComponentManager.cpp:106
    frame #5: 0x0000000100115354 test`CSimulation2Impl::CSimulation2Impl(this=0x0000000101318920, unitManager=0x0000000101317790, rt=<unavailable>, terrain=0x0000000101317740) + 100 at Simulation2.cpp:65
    frame #6: 0x000000010011166a test`CSimulation2::CSimulation2(CUnitManager*, boost::shared_ptr<ScriptRuntime>, CTerrain*) [inlined] boost::shared_ptr<ScriptRuntime>::shared_ptr(this=0x0000000000000000, =<unavailable>) + 106 at shared_ptr.hpp:164
    frame #7: 0x0000000100111639 test`CSimulation2::CSimulation2(CUnitManager*, boost::shared_ptr<ScriptRuntime>, CTerrain*) [inlined] boost::shared_ptr<ScriptRuntime>::shared_ptr(rt=0x0000000000000000, this=0x0000000000000000, this=0x0000000000000000, =<unavailable>, unitManager=<unavailable>, terrain=<unavailable>) at shared_ptr.hpp:164
    frame #8: 0x0000000100111639 test`CSimulation2::CSimulation2(this=0x0000000101317c70, unitManager=<unavailable>, rt=<unavailable>, terrain=<unavailable>) + 57 at Simulation2.cpp:570
    frame #9: 0x00000001001115e9 test`CSimulation2::CSimulation2(this=<unavailable>, unitManager=<unavailable>, rt=<unavailable>, terrain=<unavailable>) + 25 at Simulation2.cpp:571
    frame #10: 0x00000001001ea46c test`CGame::CGame(this=0x00007fff5fbff378, disableGraphics=true) + 140 at Game.cpp:67
    frame #11: 0x00000001001ea3bd test`CGame::CGame(this=<unavailable>, disableGraphics=<unavailable>) + 29 at Game.cpp:86
    frame #12: 0x000000010005f707 test`TestNetComms::test_basic_DISABLED(this=0x0000000100cbf5e8) + 199 at test_Net.h:144
    frame #13: 0x000000010005f620 test`TestDescription_TestNetComms_test_basic_DISABLED::runTest(this=<unavailable>) + 32 at test_Net.cpp:24
    frame #14: 0x00000001000043b9 test`CxxTest::RealTestDescription::run(this=<unavailable>) + 41 at RealDescriptions.cpp:96
    frame #15: 0x0000000100009e68 test`CxxTest::TestRunner::runSuite(CxxTest::SuiteDescription&) [inlined] CxxTest::TestRunner::runTest(td=<unavailable>) + 373 at TestRunner.h:76
    frame #16: 0x0000000100009cf3 test`CxxTest::TestRunner::runSuite(this=<unavailable>, sd=0x0000000100cb6e40) + 483 at TestRunner.h:63
    frame #17: 0x0000000100009946 test`CxxTest::PsTestRunner::runWorld(this=0x00007fff5fbff648) + 1046 at PsTestWrapper.h:80
    frame #18: 0x0000000100009240 test`CxxTest::PsTestRunner::runAllTests(listener=<unavailable>) + 384 at PsTestWrapper.h:35
    frame #19: 0x0000000100008972 test`CxxTest::GuiTuiRunner<CxxTest::PsTestWrapper, CxxTest::ErrorPrinter>::run() [inlined] CxxTest::PsTestWrapper::runGui(argc=0x00007fff5fbff734, argv=0x00007fff5fbff780, listener=0x00007fff5fbff6c8, this=0x00007fff5fbff6f8) + 23 at PsTestWrapper.h:95
(lldb)

comment:5 by Echelon9, 10 years ago

A patch resolving the underlying segmentation fault is attached. The crash occurred because the global g_ScriptRuntime was not set; it is used within the CGame and CComponentManager constructors.
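
As a rough sketch of the general idea, and not the contents of the attached patch, a test can make sure the global is set before any object that reads it is constructed, and clear it again afterwards. All names below (StubRuntime, g_StubRuntime, StubGame, RuntimeFixture, GameSeesRuntimeWithFixture) are hypothetical stand-ins, and std::shared_ptr is used in place of boost::shared_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for ScriptRuntime.
struct StubRuntime {};

// Hypothetical stand-in for the global g_ScriptRuntime; starts out null.
std::shared_ptr<StubRuntime> g_StubRuntime;

// Hypothetical stand-in for a class whose constructor reads the global.
class StubGame
{
public:
    StubGame() : m_runtime(g_StubRuntime) {}
    bool HasRuntime() const { return static_cast<bool>(m_runtime); }

private:
    std::shared_ptr<StubRuntime> m_runtime;
};

// RAII fixture: sets the global for the lifetime of a test and
// resets it afterwards so later tests start from a clean state.
class RuntimeFixture
{
public:
    RuntimeFixture()
    {
        assert(!g_StubRuntime); // nested fixtures are not supported here
        g_StubRuntime = std::make_shared<StubRuntime>();
    }
    ~RuntimeFixture() { g_StubRuntime.reset(); }

    RuntimeFixture(const RuntimeFixture&) = delete;
    RuntimeFixture& operator=(const RuntimeFixture&) = delete;
};

// Constructs a StubGame inside the fixture and reports whether it
// received a non-null runtime.
inline bool GameSeesRuntimeWithFixture()
{
    RuntimeFixture fixture;
    StubGame game;
    return game.HasRuntime();
}
```

Without the fixture, a StubGame picks up a null runtime, which mirrors the situation in the backtrace. In a CxxTest suite the same set and reset steps would more likely live in setUp() and tearDown(); that placement is an assumption here, not something taken from the patch.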

A secondary issue remains: the test still fails because the data it needs is absent on disk. This will be resolved separately.

comment:6 by Echelon9, 10 years ago

Keywords: review patch added
Milestone: Backlog → Alpha 16
Summary: Resolve always failing TestNetComms::test_basic_DISABLED() → [PATCH] Resolve always failing TestNetComms::test_basic_DISABLED()

by Echelon9, 10 years ago

Attachment: fix-trac-2426.patch added

Resolution patch for segmentation fault

by Echelon9, 10 years ago

Attachment: fix-trac-2426_v2.patch added

Revised patch (map file names have subsequently changed)

comment:7 by sanderd17, 10 years ago

Why do we have tests that aren't in the main set?

comment:8 by Echelon9, 10 years ago

I don't have the full history, but some tests take significantly longer to run. There are also a few, particularly related to networking, which appear to have been set aside from the main set because they were broken.

Of course, if that is the history, the correct solution is to fix the tests and improve code coverage. Hence the patch here. If the test can be fixed, I would then recommend promoting this network test to the main group.

AFAIK, this is the reason. Others with more of the 0ad history might know otherwise.

comment:9 by Yves, 10 years ago

Keywords: review removed

Thanks for having a look at this issue. Having good test coverage of our code is important, especially if we want to use a continuous integration system with tools such as Jenkins (which I am testing at the moment). Sorry for not replying earlier; I was quite busy with the localization topics recently.

Currently, running specific test suites with "cxxtest -test" is broken. That's probably a bug I introduced when upgrading cxxtest to get the new Jenkins functionality (xUnit test format). I've created a ticket for that (#2488), which should be solved before we can take care of this specific test. I also committed r14995 yesterday, which will require some minor changes to this patch as well.

I'm removing the review keyword for these reasons.

comment:10 by Josh, 10 years ago

Milestone: Alpha 16 → Alpha 17

As #2488 will not be completed by A16, I'm moving this to A17.

comment:11 by leper, 10 years ago

Owner: set to leper
Resolution: fixed
Status: new → closed

In 15672:

Fix failure in TestNetComms::test_basic_DISABLED. Patch by Echelon9. Fixes #2426.
