Changes between Version 5 and Version 6 of Debugging


Timestamp:
Dec 7, 2013, 12:08:35 PM (10 years ago)
Author:
Yves
Comment:

Adds additional tips for OOS debugging and some known causes

  • Debugging

    v5 v6  
    8888However, you may not see any diff at all, or the diff may be huge and affect many entities and components. If there's no diff, the simulation state differed, but the difference doesn't show up in the debug serializer's output. There are a few reasons why this could happen, but most likely the [wiki:JSON JSON] representation of the dump cannot represent the actual value or (for a JavaScript value) the difference is in SpiderMonkey's internal representation of the data. This has been reported before with e.g. `NaN`, which had a single-bit difference depending on the JIT behavior (see #1879).
    8989
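To illustrate why such a difference can be invisible in a JSON dump, here is a minimal sketch. It is meant for a modern standalone JavaScript shell such as Node.js, not for the game's scripting environment. The helper `doubleFromBits` is a hypothetical name introduced for this example, and the two bit patterns are arbitrary quiet NaNs, not the values from #1879.

```javascript
// Build a double from an explicit 64-bit pattern and return both
// the resulting value and the raw bits stored in the buffer.
function doubleFromBits(bits) {
    const buffer = new ArrayBuffer(8);
    new BigUint64Array(buffer)[0] = bits;
    return {
        value: new Float64Array(buffer)[0],
        bits: new BigUint64Array(buffer)[0]
    };
}

// Two quiet NaNs whose payloads differ only in the lowest bit.
const nanA = doubleFromBits(0x7ff8000000000000n);
const nanB = doubleFromBits(0x7ff8000000000001n);

// Both values serialize to the same JSON text, so a text dump cannot tell them apart...
const sameInJson = JSON.stringify(nanA.value) === JSON.stringify(nanB.value);

// ...even though the underlying bit patterns (and therefore a binary hash) differ.
const sameBits = nanA.bits === nanB.bits;
```

Here `sameInJson` is true while `sameBits` is false, which is the kind of mismatch that a hash of the binary state can detect but a diff of the text dumps cannot.
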
     90=== OOS caused by the AI ===
     91An out-of-sync (OOS) error is caused by the AI if it sends different commands to the simulation on different clients, or even on the same client when the game is run multiple times. To check whether this is happening, add some debugging code to the function `ProcessCommand` in `binaries/data/mods/public/simulation/helpers/Commands.js`. When using a replay, you know which player numbers belong to AI players and can filter the output like this:
     92{{{
     93        if (player == 3 || player == 4)
     94                warn(uneval(cmd));
     95}}}
     96
     97Now you can add more debugging output in the part of the AI that is responsible for sending the differing commands and try to figure out why it's happening.
     98If you add the same debugging output on multiple machines, you can use `diff` to find the first place where the logs diverge.
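
If `diff` is not at hand, the comparison described above can be sketched in a few lines of JavaScript. This is only an illustration: `firstDifference` is a hypothetical helper, and it assumes both logs have already been read into strings with one command per line.

```javascript
// Compare two logs line by line and report the first line that differs.
// Returns null when the logs are identical, otherwise the 1-based line
// number together with the content of that line in each log
// (undefined if one log ends before the other).
function firstDifference(logA, logB) {
    const linesA = logA.split("\n");
    const linesB = logB.split("\n");
    const count = Math.max(linesA.length, linesB.length);
    for (let i = 0; i < count; i++) {
        if (linesA[i] !== linesB[i])
            return { line: i + 1, a: linesA[i], b: linesB[i] };
    }
    return null;
}
```

The commands logged just before the reported line are usually the best place to start adding further debugging output to the AI.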
     99
    90100=== Known causes of OOS ===
    91101
    92 The following are known, reproducible causes of OOS errors:
     102The following are known, reproducible causes of OOS errors that currently aren't solved:
    93103* Rejoining a multiplayer game with AIs - because the AIs don't fully serialize their state, the rejoining player's state will differ and cause an OOS, see #1089.
    94 * Multiplayer games with Aegis AI, see #2000.
     104
     105
     106The following are possible sources of OOS problems we found in the past:
     107* SpiderMonkey has some JIT compiler issues (see #2000 for example). If you want to rule out the JIT compiler, you can disable it by commenting out the lines in ScriptInterface.cpp that set the options JSOPTION_JIT and JSOPTION_METHODJIT. If the JIT compiler does turn out to be the cause, you should still try to find the exact location of the problem, because disabling JIT compilation completely is bad for performance.
     108* Uninitialized variables can cause different behaviour because their values depend on whatever happens to be in memory. Valgrind helps detect these errors (more detailed explanation needed).
     109* Data affecting the simulation is kept past the runtime of one game (see #2285 for example). In this case OOS errors typically occur if one or more players have previously played another game without shutting down the engine afterwards. It's difficult to troubleshoot with replays because they always start a fresh instance of the engine.
     110* Floating point operations can differ slightly on different machines and different architectures.
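
The case of data being kept past the runtime of one game can be illustrated with a small sketch. It is not the actual bug from #2285. `createEngine`, `playGame` and `popLimit` are hypothetical names that model an engine instance with a cache that is never cleared between games.

```javascript
// Hypothetical model of an engine instance. The cache lives as long as
// the engine, not as long as a single game.
function createEngine() {
    const cache = new Map();
    return {
        playGame(settings) {
            // Bug: the cached value from an earlier game is reused
            // instead of the value from the current game's settings.
            if (!cache.has("popLimit"))
                cache.set("popLimit", settings.popLimit);
            return cache.get("popLimit");
        }
    };
}

// One player has already played a game in this engine instance...
const veteran = createEngine();
veteran.playGame({ popLimit: 150 });

// ...while the other player (or a replay) starts a fresh instance.
const fresh = createEngine();

// Both now play the same game with the same settings.
const veteranValue = veteran.playGame({ popLimit: 300 });
const freshValue = fresh.playGame({ popLimit: 300 });
```

In this sketch `veteranValue` is the stale 150 while `freshValue` is 300, so the two simulations diverge. A replay behaves like `fresh`, which is why the problem does not reproduce there.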
     111
    95112
    96113=== Serialization test mode ===