#3108 closed defect (fixed)
[PATCH] A18: Instant OOS between OS X 10.8 and other OSes
Reported by: | elexis | Owned by: | historic_bruno |
---|---|---|---|
Priority: | Release Blocker | Milestone: | Alpha 19 |
Component: | Core engine | Keywords: | patch |
Cc: | Patch: |
Description
Multiple people using OSX have reported that they receive an oos on every start of the game. ysey is one of them. He is on revision 16409, which is the currently released revision for OSX. I am on revision 16411. He sent me his oos dump after we started a game. The oos_dump is identical to mine! LjL reported the same issue in #0ad on march 17'th. trompetin17 and historicbruno do not seem to experience this issue with 16409.
Attachments (2)
Change History (23)
comment:1 by , 9 years ago
follow-up: 4 comment:3 by , 9 years ago
I am able to reproduce this only on OS X 10.8. I used the -ooslog
option to dump the binary state every turn, but I see no difference there. The only difference is in the calculated hash between the clients, my thought is there is a bug affecting the serializer or MD5 hash calculation on OS X 10.8. It may be possible to continue playing the game, since further hash mismatches will be ignored.
comment:4 by , 9 years ago
Replying to historic_bruno:
It may be possible to continue playing the game, since further hash mismatches will be ignored.
True, ysey is playing happily ignoring the error. Only downside of this bug is the confusion other players have.
comment:5 by , 9 years ago
Summary: | Alpha18 broken on some OSX → A18: Instant OOS between OS X 10.8 and other OSes |
---|
by , 9 years ago
Attachment: | hasher-output.zip added |
---|
comment:6 by , 9 years ago
The raw data that is hashed differs substantially from the dumped binary state in this case. I modified the HashSerializer
to dump all processed bytes to a new file, those files are attached for analysis. They are identical up to the CCmpAIManager
state, which is apparently missing from the OS X version. They are identical again after that missing data.
Philip's idea is that it might be a difference (or bug?) in the std::streambuf
implementation. I set a breakpoint in CSerializerStreamBuf::xsputn
which is supposed to do stream write redirection, but it's not called on OS X 10.8. CCmpAIManager
has its own serializer and relies on this behavior.
comment:7 by , 9 years ago
Milestone: | Backlog → Alpha 19 |
---|
comment:8 by , 9 years ago
I've done more testing of this. It seems that streambuf::xsputn
isn't required to be called, though in practice most C++ libs will do that as an optimization when writing strings. It was added to libc++ in r172542, which explains why our code worked with libstdc++ and works with libc++ on 10.9+.
Here's a minimal test case:
#include <iostream> #include <streambuf> class OurStreamBuf : public std::streambuf { public: OurStreamBuf() { } int overflow(int ch) { std::cout << "overflow: " << char(ch) << std::endl; return ch; } std::streamsize xsputn(const char* s, std::streamsize n) { std::cout << "xsputn: " << s << std::endl; return n; } }; int main(int argc, char* argv[]) { OurStreamBuf ourbuf; std::streambuf& streambuf = ourbuf; std::ostream stream(&streambuf); stream << "This is a test\n"; return 0; }
In MSVC, GCC, clang w/ libstdc++, and clang with libc++ on 10.9+, one call to OurStreamBuf::xsputn
is made with the complete test string. On clang with libc++ on 10.8, a call to OurStreamBuf::overflow
is made for each character, but we don't implement that in the game, so it never makes it to the serializer implementation.
I think our code is technically incorrect, we need to override streambuf::overflow
as well, even though it won't be called in most C++ libs - it's the technically correct thing to do and most examples I find of deriving streambuf
are implemented that way. Single characters get passed in and if we want it buffered, we have to implement that ourselves.
comment:9 by , 9 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:11 by , 9 years ago
Keywords: | patch review added |
---|---|
Summary: | A18: Instant OOS between OS X 10.8 and other OSes → [PATCH] A18: Instant OOS between OS X 10.8 and other OSes |
I've attached a patch for review, tested on OS X and Windows.
There are two points of concern:
- I implemented a buffer to prevent a large number of calls from
streambuf::overflow
to the hash algorithm, but I'm not sure about the optimal size. In the patch it is 16 bytes, which seems to match nicely with the hash implementation, while not being too large and leading to memory fragmentation. I've opted for a fixed size array, not a vector. - The buffer must be flushed after the final serialization. A destructor doesn't seem like a good place for that, because then the hash result depends on the scope of the serializer object. I don't want to require users of the serializer to manually flush the stream either. The only solution I've found that works reliably is to set the unitbuf flag on the stream as early as possible, forcing every write to flush the buffer. Maybe this has implications on performance...
The buffer and associated streambuf functions should only be needed on old versions of libc++. To test them on other platforms, I think commenting out the overridden xsputn
is enough. Simulation state hashes should be identical and there should be no OOSes.
I added a test to catch future problems like this in the streambuf implementation.
comment:12 by , 9 years ago
The patch also has a temporary timer added to the hash calculations. I have to say that with MSVC, sometimes the buffered streambuf had better performance than using xsputn, but depending on the buffer size it could also be worse. I haven't tested performance with other C++ libs.
comment:14 by , 9 years ago
Note: m_Buffer
for the StreamBuf could be a pointer to a dynamically allocated array, that only gets created as needed by overflow()
, if that's more efficient. Some implementations do that. It could also be something like a std::vector
, but as with the current solution, performance would need to be evaluated.
comment:16 by , 9 years ago
I cannot reproduce the OOS with Maverics 10.9.5 so this is truly a Mountain Lion Problem, would be nice that it gets commited since we support up 10.7.
comment:17 by , 9 years ago
In latest SVN, there is some other OOS between OS X and Windows, so it's difficult for me to test this patch.
comment:18 by , 9 years ago
comment:19 by , 9 years ago
That will take some time as reverting tons of files and recompiling isn't quick, but I don't have much choice it seems :(
comment:21 by , 9 years ago
Keywords: | review removed |
---|
Only background music changed between 16409 and 16411, so it can't be the version difference.
trompetin17 (OSX Yosemite + revision 16409) tested with me and we got no out of sync.
LJL reported in #0ad http://irclogs.wildfiregames.com/2015-03-17-QuakeNet-%230ad.log that he has OSX 10.8.5 (Mountain Lion).