Opened 3 years ago

Closed 3 years ago

Last modified 2 years ago

#6028 closed defect (fixed)

Assert error in topology.cpp

Reported by: nwtour Owned by: wraitii
Priority: Should Have Milestone: Alpha 25
Component: Core engine Keywords:
Cc: Patch: Phab:D3575

Description

git version of 0 A.D. Start pyrogenesis on linux/gcc/i386 asser error:

Sound: AlcInit success, using OpenAL Soft
topology.cpp(105): Assertion failed: "logicalPerPackage % maxCoresPerPackage == 0"
Assertion failed: "logicalPerPackage % maxCoresPerPackage == 0"
Location: topology.cpp:105 (MaxLogicalPerCore)

Call stack:

(0x8585735) ./binaries/system/pyrogenesis() [0x8585735]
(0x85363bd) ./binaries/system/pyrogenesis() [0x85363bd]
(0x853780b) ./binaries/system/pyrogenesis() [0x853780b]
(0x8537f6a) ./binaries/system/pyrogenesis() [0x8537f6a]
(0x857f84b) ./binaries/system/pyrogenesis() [0x857f84b]
(0x85b26a2) ./binaries/system/pyrogenesis() [0x85b26a2]
(0x857f970) ./binaries/system/pyrogenesis() [0x857f970]
(0x82bce90) ./binaries/system/pyrogenesis() [0x82bce90]
(0x82b53c7) ./binaries/system/pyrogenesis() [0x82b53c7]
(0x80c7758) ./binaries/system/pyrogenesis() [0x80c7758]
(0x80b4686) ./binaries/system/pyrogenesis() [0x80b4686]
(0xb5f8eb41) /lib/libc.so.6(__libc_start_main+0xf1) [0xb5f8eb41]
(0x80c47ec) ./binaries/system/pyrogenesis() [0x80c47ec]

errno = 0 (Error during IO)
OS error = ?

if press "Continue" in xmessage dialog game work successfull.

Invesigation: problem after patch from https://trac.wildfiregames.com/ticket/2675

in assembler function __cpuidex (source/lib/sysdep/arch/x86_x64/x86_x64.cpp)

Run in GDB:
break source/lib/sysdep/arch/x86_x64/topology.cpp:98
break source/lib/sysdep/arch/x86_x64/topology.cpp:105
run
watch regs

Problem (master trunk) version:
Old value = {eax = 142073856, ebx = 3221215440, ecx = 3221215480, edx = 2980253952}
New value = {eax = 1, ebx = 0, ecx = 0, edx = 0}
topology::MaxLogicalPerCore () at ../../../source/lib/sysdep/arch/x86_x64/topology.cpp:100

Old value = {eax = 1, ebx = 0, ecx = 0, edx = 0}
New value = {eax = 1, ebx = 142073856, ecx = 0, edx = 0}
0x08580a5d in x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97

Old value = {eax = 1, ebx = 142073856, ecx = 0, edx = 0}
New value = {eax = 1, ebx = 142073856, ecx = 201384893, edx = 0}
0x08580a63 in x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97

Old value = {eax = 1, ebx = 142073856, ecx = 201384893, edx = 0}
New value = {eax = 1, ebx = 142073856, ecx = 201384893, edx = 3219913727}
x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97

Old value = {eax = 1, ebx = 142073856, ecx = 201384893, edx = 3219913727}
New value = {eax = 67194, ebx = 142073856, ecx = 201384893, edx = 3219913727}
0x08580a68 in x86_x64::cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97


Logic without "save-restore ebx hack":
Old value = {eax = 142073856, ebx = 3221215440, ecx = 3221215480, edx = 2983106304}
New value = {eax = 1, ebx = 0, ecx = 0, edx = 0}
topology::MaxLogicalPerCore () at ../../../source/lib/sysdep/arch/x86_x64/topology.cpp:100

Old value = {eax = 1, ebx = 0, ecx = 0, edx = 0}
New value = {eax = 67194, ebx = 0, ecx = 0, edx = 0}
0x08580b0d in x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97

Old value = {eax = 67194, ebx = 0, ecx = 0, edx = 0}
New value = {eax = 67194, ebx = 133120, ecx = 0, edx = 0}
0x08580b10 in x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97

Old value = {eax = 67194, ebx = 133120, ecx = 0, edx = 0}
New value = {eax = 67194, ebx = 133120, ecx = 201384893, edx = 0}
0x08580b13 in x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:97

Old value = {eax = 67194, ebx = 133120, ecx = 201384893, edx = 0}
New value = {eax = 67194, ebx = 133120, ecx = 201384893, edx = 3219913727}
x86_x64::Invoke_cpuid (regs=0xbfffd860) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:99

-----------------------------------------------
register ebx = 133120 - true information about my CPU (2 logical cores)
register ebx = 142073856 - previous value of ebx. logicalPerPackage interprets as 119 cores in CPU. mistakenly falls into regs.ebx (current git code)

code:
pushl %ebx
cpuid
mov %ebx,%1
popl %ebx
does not work on my system as the author intended


Linux kernel 4.19.72
gcc version 8.4.1
$ LANG=C lscpu
Architecture:        i686
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       36 bits physical, 48 bits virtual
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               23
Model name:          Intel(R) Celeron(R) CPU        E3300  @ 2.50GHz
Stepping:            10
CPU MHz:             1513.041
CPU max MHz:         2500.0000
CPU min MHz:         1203.0000
BogoMIPS:            4859.19
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm

Attachments (3)

system_info.txt (3.8 KB ) - added by nwtour 3 years ago.
lshw.txt (15.3 KB ) - added by nwtour 3 years ago.
cpuid.txt (34.7 KB ) - added by nwtour 3 years ago.

Download all attachments as: .zip

Change History (15)

comment:1 by Stan, 3 years ago

Can you attach your system_info.txt as well? Does it happen with A23B too? I'm currently trying to debug something on I386.

by nwtour, 3 years ago

Attachment: system_info.txt added

comment:2 by Stan, 3 years ago

Can you compile with -msse2 ? eg.

sh update-workspaces.sh CXXFLAGS="-msse2" -j2

and

make CXXFLAGS="-msse2" -j2

EDIT: would be great if you could come on IRC :)

Last edited 3 years ago by Stan (previous) (diff)

comment:3 by Imarok, 3 years ago

Could you run cpuid and lshw and give us the output? Seems like you already did that xD

According to the lscpu output your cpu has hyper-threading (ht flag). But intel claims otherwise: https://ark.intel.com/content/www/us/en/ark/products/42771/intel-celeron-processor-e3300-1m-cache-2-50-ghz-800-mhz-fsb.html So I guess there is some issue with your CPU and not mainly with 0ad.

Last edited 3 years ago by Imarok (previous) (diff)

comment:4 by Stan, 3 years ago

The last part of the description is lscpu.

by nwtour, 3 years ago

Attachment: lshw.txt added

by nwtour, 3 years ago

Attachment: cpuid.txt added

comment:5 by Stan, 3 years ago

Right it seems your CPU pretends to support hyperthreading which set us off...

comment:6 by nwtour, 3 years ago

Can you compile with -msse2 ?

pyrogenesys recompilation done... assert with -msse2 does not appear!

>So I guess there is some issue with your CPU

Yes. If force set in source code CAP_HT to 0 - assert does not appear.

comment:7 by Stan, 3 years ago

Interesting might be something not compiling properly if it works with msse2. Maybe we hit something else.

comment:8 by nwtour, 3 years ago

In mmse2 version regs equivalent non mmse2 version:

Old value = {eax = 905664000, ebx = 0, ecx = 0, edx = 0}
New value = {eax = 1, ebx = 0, ecx = 0, edx = 0}
topology::MaxLogicalPerCore () at ../../../source/lib/sysdep/arch/x86_x64/topology.cpp:100

Old value = {eax = 1, ebx = 0, ecx = 0, edx = 0}
New value = {eax = 1, ebx = 142249984, ecx = 0, edx = 0}
0x085a42ed in x86_x64::Invoke_cpuid (regs=0xbfffd8ac) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:119

Old value = {eax = 1, ebx = 142249984, ecx = 0, edx = 0}
New value = {eax = 1, ebx = 142249984, ecx = 201384893, edx = 0}
0x085a42f0 in x86_x64::Invoke_cpuid (regs=0xbfffd8ac) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:119

Old value = {eax = 1, ebx = 142249984, ecx = 201384893, edx = 0}
New value = {eax = 1, ebx = 142249984, ecx = 201384893, edx = 3219913727}
x86_x64::Invoke_cpuid (regs=<optimized out>) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:120

Old value = {eax = 1, ebx = 142249984, ecx = 201384893, edx = 3219913727}
New value = {eax = 67194, ebx = 142249984, ecx = 201384893, edx = 3219913727}
0x085a42f5 in x86_x64::cpuid (regs=0xbfffd8ac) at ../../../source/lib/sysdep/arch/x86_x64/x86_x64.cpp:120

But function bits(regs.ebx, 16, 23) return 122 and assert (logicalPerPackage % maxCoresPerPackage == 0) skipped

comment:9 by Imarok, 3 years ago

Patch: Phab:D3575

comment:10 by wraitii, 3 years ago

Owner: set to wraitii
Resolution: fixed
Status: newclosed

In 25405:

Fix cpuid call with -fPIC on i386 architectures.

GCC < 5 used to reserve the ebx register for PIC (position-independent code) metadata. This meant that we needed to save the state of ebx before calling cpuid (fixed in #2675)
However, the original patch from r15886 did not force a particular register to store this value in. Following the GCC 5 upgrade, GCC stopped reserving ebx, and that register silently got used instead. The code became non-sensical, and our ASM cpuidex started returning random garbage in edx.

Since we now only support GCC7 and above, the PIC-specific branch is no longer necessary and is removed.

Fixes #6028. The assertion was a result of random data in ebx.
Refs #2675 / reverts rP15890 and reverts rP15886.

Patch by: nwtour

Comments by: vladislavbelov

Differential Revision: https://code.wildfiregames.com/D3575

comment:11 by Stan, 3 years ago

Milestone: BacklogAlpha 25

comment:12 by Stan, 2 years ago

In 26157:

Remove topology.cpp. The data isn't useful to us, and it prevents some players from running the game.

Fixes #6028
Differential Revision: https://code.wildfiregames.com/D4402

Note: See TracTickets for help on using tickets.