Opened 6 years ago

Closed 2 years ago

#4847 closed defect (worksforme)

RunHardwareDetection numa_NumNodes Crash

Reported by: Arnoend
Owned by:
Priority: Should Have
Milestone: Alpha 26
Component: Core engine
Keywords:
Cc: LukeV1
Patch:

Description

If I want to join a game, it says something about a port, and whenever I start the game a program error comes up and I click suppress.

Attachments (1)

crashlog.txt (507 bytes) - added by Arnoend 6 years ago.
Crashlog new

Change History (33)

comment:1 by Imarok, 6 years ago

When does it happen? Whom did you try to connect to? Was it inside the lobby or outside? What did the program error say? Any log files? (crashdump, interestinglog, mainlog, etc.; see https://trac.wildfiregames.com/wiki/GameDataPaths)

in reply to:  1 comment:2 by Arnoend, 6 years ago

Replying to Imarok: I sent you the crashlog. It happens in 70 percent of the games.

comment:3 by elexis, 6 years ago

I believe you speak of two different issues, is that right?

Not being able to join some games is because the host didn't forward the UDP port but lazily used STUN which only works for some routers.

The crash is completely unrelated to the connectivity issue, right?

Again, when exactly does the crash happen? After the loading screen? In the middle of a game? (Could it be an out of memory issue due to a huge mapsize, too many units or AI players possibly?)

in reply to:  3 comment:4 by Arnoend, 6 years ago

Replying to elexis: it happens when the game starts

yes it is unrelated

yes 2 issues

but my friends can join the games

comment:5 by elexis, 6 years ago

Then your friends have different hardware: https://en.wikipedia.org/wiki/Network_address_translation#Methods_of_translation

Try increasing the STUN timeout in the user config file; some people claim it sometimes helps. But it still won't work with every host, and this isn't a bug in the game. The host has to configure the router, namely forward the port, if everyone should be able to join.

Any information on the crashes?

in reply to:  5 comment:6 by Arnoend, 6 years ago

how do i do this

my friends can join in the games but not me

comment:7 by elexis, 6 years ago

Milestone: Backlog
Resolution: needsinfo
Status: new → closed

Try adding the following line to your user.cfg file (see wiki:GameDataPaths):

lobby.stun.delay = 500

200 is the default number. Greater numbers will freeze the game a bit longer but are more likely to end in a successful connection.

But the only reliable way to join a game is telling the host to forward the port.

If you can tell me anything about the crash, we might reopen this ticket.

in reply to:  7 comment:8 by Arnoend, 6 years ago

The crash happens when I start the game. I always press suppress and the game starts. I already posted the crashlog.

The STUN delay doesn't help.

comment:9 by elexis, 6 years ago

Component: Multiplayer lobby → Core engine
Milestone: Alpha 23
Priority: Release Blocker → Should Have
Resolution: needsinfo
Status: closed → reopened
Summary: Cannot join and Program error → RunHardwareDetection numa_NumNodes Crash G DATA Internet security blocks file read access

About the crash:

The crash text file doesn't reveal too much, but now that I read

numa_NumNodes (wnuma.cpp:296)
RunHardwareDetection (hwdetect.cpp:313)

I recall this user who had the same issue. So let me guess, you installed G DATA Internet security? Or some other anti-virus thing thinking it's funny to block file reads randomly? :P

Quoting LukeV1 from the forum thread linked below:

I figured out the problem: the installed antivirus scanner (G DATA Internet security) seems to block all file accesses made by pyrogenesis after a few dozen file changes in appdata subdirectories. So elexis is right - it actually is a directory permission issue - but not caused by Windows :blink:

https://wildfiregames.com/forum/index.php?/topic/22864-game-crashes-on-startup/&page=2&tab=comments#comment-339704

So maybe you can fix it by shutting down the antivirus software. We should still try to make the game not crash, as we don't want any repetition of these kinds of reports with extremely hard-to-guess solutions.

About your connectivity issue: then only communicating with the other players helps (or a router that doesn't implement symmetric NAT, I guess). The network issue is due to using the STUN protocol. We could only fix it reliably with either the apparently vulnerable TURN protocol or by renting some servers to host games for everyone. Or we could remove STUN completely from the lobby, which means only 10% of the players can host games that everyone can join, while dozens of players can't find any host to join. So maybe we should look into finding a way to make the TURN protocol safe and implement it. That might not happen in the next few years, though.
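
To illustrate why STUN only works for some routers, here is a minimal toy model in C++. It is a sketch under stated assumptions, not the lobby's actual networking code: `ToyNat`, `ExternalPortFor`, `StunDiscoveryWorks`, the destination names and the starting port 40000 are all hypothetical. It only models the one property that matters here: a cone-style NAT reuses the same external port for every destination, while a symmetric NAT allocates a separate port per destination, so the port a STUN server observes is not the port another player would have to reach.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model of a NAT router (hypothetical, for illustration only).
class ToyNat {
public:
    explicit ToyNat(bool symmetric) : m_symmetric(symmetric) {}

    // Returns the external port the NAT assigns when the internal host
    // sends a packet to `destination`.
    std::uint16_t ExternalPortFor(const std::string& destination) {
        // A cone-style NAT keeps one mapping for all destinations;
        // a symmetric NAT keeps one mapping per destination.
        const std::string key = m_symmetric ? destination : "*";
        auto it = m_mappings.find(key);
        if (it == m_mappings.end())
            it = m_mappings.emplace(key, m_nextPort++).first;
        return it->second;
    }

private:
    bool m_symmetric;
    std::uint16_t m_nextPort = 40000;  // arbitrary starting port
    std::map<std::string, std::uint16_t> m_mappings;
};

// True when the port observed by the STUN server is also the port
// that is used towards the other player.
bool StunDiscoveryWorks(ToyNat& nat) {
    const std::uint16_t seenByStun = nat.ExternalPortFor("stun-server");
    const std::uint16_t usedForPeer = nat.ExternalPortFor("other-player");
    return seenByStun == usedForPeer;
}
```

In this model, `StunDiscoveryWorks` returns true for `ToyNat(false)` and false for `ToyNat(true)`. Forwarding the port on the host's router sidesteps the problem because the external port is then fixed and no discovery is needed.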

in reply to:  9 comment:10 by Arnoend, 6 years ago

And what should I do to fix it? I am not a pro in these internet things.

And also, shutting down G DATA doesn't help.

comment:11 by elexis, 6 years ago

The issue that some players can join games that you can't join is not solvable.

If you just want to play in any lobby game, join the games that you can join. If you have a few specific people you want to play with, you should either try to find a host all of you can join, or one of you should host a game after having figured out how to configure the router (in that case RTFM).

by Arnoend, 6 years ago

Attachment: crashlog.txt added

Crashlog new

in reply to:  11 comment:12 by Arnoend, 6 years ago

Hey, now I get a new crashlog when starting the game. I already attached it.

comment:13 by elexis, 6 years ago

Did you try the anti-virus thing?

comment:14 by Arnoend, 6 years ago

Yep, tried it. I still cannot join matches and the error still comes up.

comment:15 by LukeV1, 6 years ago

@Arnoend:

Elexis already cited me above - I'm the other guy who had that numa_NumNodes problem. A possible "solution" is to create a path exclusion for the G DATA real-time protection module (it can be configured somewhere in the settings). This works quite decently - the only downside is that you've got a security-free folder ...

BTW: the error window on startup can be suppressed by commenting out three lines in the source code ;)

Do you have Windows 10 as well?

comment:16 by Arnoend, 6 years ago

I even tried deleting G DATA. Nothing helped: I still cannot join matches and the error still comes on startup. Yep, I have Windows 10.

comment:17 by Stan, 6 years ago

Sorry for the delayed answer. Do you have a crashlog.dmp file in your GameDataPaths? If so, could you upload it here?

comment:18 by elexis, 6 years ago

The most relevant discussion is on the forums, see comment:9. It's some anti-virus software that is blocking read-access when too many files are opened in a short period.

The crash should be prevented and it should print some human-readable information since this issue was reported by multiple players, the cause is very hard to guess and developers shouldn't have to repeat the investigation.
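
As a rough illustration of "prevent the crash and print something human-readable", here is a minimal C++ sketch. It is not the code from hwdetect.cpp or wnuma.cpp and not the linked patch; `DetectionResult`, `DetectNumNodesSafely` and the `query` callback are hypothetical names standing in for whatever the real detection code calls. The idea is simply to treat a failed query as a recoverable condition and fall back to a conservative default.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical result type: the value to use plus an optional
// human-readable warning for the log or an info dialog.
struct DetectionResult {
    int numNodes;
    std::string warning;  // empty when detection succeeded
};

// Runs a fallible query (hypothetical stand-in for the real detection)
// and falls back to a single node instead of aborting.
DetectionResult DetectNumNodesSafely(
    const std::function<std::optional<int>()>& query) {
    const std::optional<int> nodes = query();
    if (nodes && *nodes >= 1)
        return {*nodes, ""};
    return {1,
            "Hardware detection could not determine the NUMA node count; "
            "assuming a single node. See the log files for details."};
}
```

A caller would log `warning` when it is non-empty and continue with `numNodes`, so the player sees an explanation rather than a program error dialog.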

comment:19 by elexis, 6 years ago

Patch: Phab:D1274

comment:20 by Itms, 6 years ago

Milestone: Alpha 23 → Alpha 24

Pushing to the next development cycle.

comment:21 by LukeV1, 3 years ago

Hello again. I feel like I should clear some things up regarding this ticket, as some of its content is really confusing at the moment.

First of all: the title is misleading. This error is not caused by G DATA. G DATA is only responsible for #4802.

Quoting Arnoend:

I even tried deleting gdata nothing helped still cannot join matches and error comes on startup

The linked patch therefore does NOT apply to this problem either - this ticket is more of a new defect.

(Just for further clarification: I indeed mentioned the error on startup in the linked forum thread, but this was just another bug I found on the way, not caused by G DATA. Not that I'm a big friend of that virus scanner, but right now the ticket leads in a wrong direction)

I'm not sure if I should change the ticket header here on my own, so I'll leave it for now, hoping some developer will change it.

For future cases: would it be OK from the developers' side if I made such changes myself, or is it better this way?

comment:22 by LukeV1, 3 years ago

Cc: LukeV1 added

comment:23 by LukeV1, 3 years ago

Some more problem-related feedback: this startup error in numa_NumNodes happened to me only on Ryzen CPUs. (I tested 5 different PCs: two running AMD Ryzen CPUs, the rest Intel Core CPUs.) So it may be related to all that other Ryzen fuss.

comment:24 by LukeV1, 3 years ago

Patch: Phab:D1274
Summary: RunHardwareDetection numa_NumNodes Crash G DATA Internet security blocks file read access → RunHardwareDetection numa_NumNodes Crash

comment:25 by Stan, 3 years ago

Hey could you try all the Ryzen patches?

comment:26 by OptimusShepard, 3 years ago

When I hear NUMA problems, I guess you are talking about Threadripper CPUs, not the desktop Ryzen? As Stan already said, could you please check our Ryzen/Threadripper patches and give some feedback?

https://wildfiregames.com/forum/topic/28367-amd-ryzen-threadripper-user-read-before-posting/

comment:27 by Stan, 3 years ago

Milestone: Alpha 24 → Alpha 25

Needs more info

comment:28 by LukeV1, 3 years ago

Hello, and sorry for the long delay.

No, actually "my" Ryzens aren't Threadripper CPUs. The most recent one here is a Ryzen 7 2700.

Today I tested it again with a recent svn build (r24938) and couldn't get any NUMA errors. RunHardwareDetection didn't do that well, though (it interpreted the system as 'arch_ia32' and mixed up the logical vs. physical core counts), but apart from that false detection nothing bad happened.

Will test it again in a few days with a new svn build to make sure I didn't just get lucky. For now it seems as if this bug is gone :)

comment:29 by Stan, 3 years ago

Hey there, any news?

comment:30 by Imarok, 3 years ago

Milestone: Alpha 25 → Alpha 26

comment:31 by LukeV1, 2 years ago

After a very long time I recently did a few new test runs on two Ryzen systems. The result is always the same: 0 A.D. starts without errors.

As changes were made to all the suspicious source files I stumbled upon during debugging, I'm now pretty sure the bug got fixed somewhere along the way.

I would therefore consider this ticket as solved.

comment:32 by Stan, 2 years ago

Resolution: worksforme
Status: reopened → closed