Changes between Version 19 and Version 20 of Debugging


Timestamp:
Jul 8, 2019, 10:41:10 AM (5 years ago)
Author:
historic_bruno
Comment:

Fix typos, outdated links, and clarify a few things

  • Debugging

    v19 v20  
    88
    99=== Windows ===
    10  * [http://www.microsoft.com/visualstudio Visual Studio] - the basic tool for debugging the game on Windows. Break into the debugger on a breakpoint, on a crash or assertion failure, or any other time. Visual C++ Express is free and contains similar debugging features. Can be used to analyze crash dumps and get a useful call stack.
    11  * [http://msdn.microsoft.com/en-us/windows/hardware/gg463009.aspx WinDbg] - part of the Windows SDK, a very powerful debugging suite which is primarily command line driven, unlike Visual Studio. Analyze crashes in more detail than VS.
    12  * [http://technet.microsoft.com/en-us/sysinternals/bb896647.aspx DebugView] - If you don't run the process in a debugger, !DebugView lets you view its normally hidden debug output. Users can install and run this much more easily than a full debugging suite.
    13  * [http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx VMMap] - Free tool from Microsoft to analyze the virtual memory usage of a process; shows fragmentation, can be useful for observing memory leaks or finding why a large allocation fails.
    14  * [http://www.gremedy.com/ gDEBugger] - Debug and profile OpenGL applications. Useful for debugging GL errors and finding unexpected behavior.
     10 * [https://visualstudio.microsoft.com/ Visual Studio] - the basic tool for debugging the game on Windows. Break into the debugger on a breakpoint, on a crash or assertion failure, or any other time. Visual Studio Community Edition is free and contains similar debugging features. Can be used to analyze crash dumps and get a useful call stack.
     11 * [https://developer.microsoft.com/en-us/windows/downloads/sdk-archive WinDbg] - part of the Windows SDK, a very powerful debugging suite which is primarily command line driven, unlike Visual Studio. Analyze crashes in more detail than VS.
     12 * [https://docs.microsoft.com/en-us/sysinternals/downloads/debugview DebugView] - If you don't run the process in a debugger, !DebugView lets you view its normally hidden debug output. Users can install and run this much more easily than a full debugging suite.
     13 * [https://docs.microsoft.com/en-us/sysinternals/downloads/vmmap VMMap] - Free tool from Microsoft to analyze the virtual memory usage of a process; shows fragmentation, can be useful for observing memory leaks or finding why a large allocation fails.
     14 * [https://www.khronos.org/opengl/wiki/Debugging_Tools OpenGL Debugger] - Debug and profile OpenGL applications. Useful for debugging GL errors and finding unexpected behavior.
    1515 * [http://notepad-plus-plus.org/ Notepad++] - small, simple, powerful text editor. You need a decent text editor on Windows.
    16  * A hex editor, like [http://home.gna.org/bless/ Bless] - useful for examining binary simulation state dumps, either for saved games or serialization errors.
     16 * A hex editor, like [https://github.com/afrantzis/bless Bless] - useful for examining binary simulation state dumps, either for saved games or serialization errors.
     17 * [https://www.virtualbox.org/ VirtualBox] - virtual machines are very convenient for testing the game on other operating systems, or for testing multiplayer games locally. VBox is FOSS and works quite well.
    1718
    1819TODO: I know there are more...
     
    2122 * [https://www.gnu.org/software/gdb/ gdb] - the basic tool for debugging in the GNU toolchain. gdb lets you e.g. break into the engine on a breakpoint or when a seg fault or assertion failure occurs. You can start your process in gdb or attach gdb to a running process.
    2223 * [http://valgrind.org/ valgrind] - debugging and profiling suite. Find memory leaks, invalid memory accesses and more.
    23  * gDEBugger
    24  * A hex editor
     24 * [https://www.khronos.org/opengl/wiki/Debugging_Tools OpenGL Debugger] - Debug and profile OpenGL applications. Useful for debugging GL errors and finding unexpected behavior.
     25 * A hex editor - useful for examining binary simulation state dumps, either for saved games or serialization errors.
     26 * [https://www.virtualbox.org/ VirtualBox] - virtual machines are very convenient for testing the game on other operating systems, or for testing multiplayer games locally. VBox is FOSS and works quite well.
    2527
    26 === OS X ===
    27  * [https://developer.apple.com/xcode/ Xcode] - free IDE for development on OS X, also has suite of debugging tools.
     28=== macOS / OS X ===
     29 * [https://developer.apple.com/xcode/ Xcode] - free IDE for development on macOS, also has suite of debugging tools.
    2830 * gdb
    29  * [http://lldb.llvm.org/ lldb] - part of the LLVM project, replacement for gdb in recent Xcode. LLDB equivalents for GDB commands can be found [http://lldb.llvm.org/lldb-gdb.html here].
     31 * [http://lldb.llvm.org/ lldb] - part of the LLVM project, replacement for gdb in recent Xcode. LLDB equivalents for GDB commands can be found [https://lldb.llvm.org/use/map.html here].
     32 * [https://www.virtualbox.org/ VirtualBox] - virtual machines are very convenient for testing the game on other operating systems, or for testing multiplayer games locally. VBox is FOSS and works quite well.
    3033
    3134== Debugging Crashes and Assertion Failures ==
     
    3942 * Build environment - custom build, SVN autobuild, or release package? Which compiler version?
    4043 * Hardware (e.g. `system_info.txt`)
    41  * Operating system (e.g. `system.info.txt`)
     44 * Operating system (e.g. `system_info.txt`)
    4245 * Which version of the game was the user playing?
    4346 * What was the user doing when the crash occurred?
     
    5558
    5659 * It can be opened in Visual Studio or !WinDbg. For this to be useful, '''you need to have the debug symbols and source code matching the affected build of the game'''. Most users use the autobuild version of the game, you can simply download the correct autobuilt binaries from SVN. Or for a release, install that particular version of the game and acquire the matching source package from [http://releases.wildfiregames.com/]. In the future we should automate this process, see #290.
    57  * You also need to set up symbol paths and a cache location for Microsoft symbol server, see [http://support.microsoft.com/kb/311503 Use the Microsoft Symbol Server to obtain debug symbol files].
     60 * You also need to set up symbol paths and a cache location for Microsoft symbol server, see [https://docs.microsoft.com/en-us/windows/win32/dxtecharts/debugging-with-symbols Use the Microsoft Symbol Server to obtain debug symbol files].
    5861 * In Visual Studio, after setting up your debug symbol paths, open the `crashlog.dmp` and choose to debug natively. You should get a crash of some kind, then you can break into the debugger. The call stack window will show you which functions were being called at that point. Note that in release builds, some data will be optimized out and not easily viewable.
    5962 * In !WinDbg, after setting up your debug symbol and source code paths, open the crash dump and use the `~*kp` command to get a full call stack of each thread. See [http://www.windbg.info/doc/1-common-cmds.html this extremely helpful article] for more useful commands in !WinDbg. For example, `.frame 3` lets you set the current stack frame to !#3 (the 3rd from the top of the call stack), then you can e.g. use the source code window to see exactly the line of code matching this function call, and locals window to see the variables in that function. Note that !WinDbg can often open dump files that VS fails to open.
     
    6568 * A third option on newer versions of Windows is to have the user create a memory dump from Windows task manager. The user can find `pyrogenesis.exe` in the task manager, right-click it and choose '''Create Dump File'''. Beware the resulting `MEMORY.DMP` will be very large as it contains all memory pages being accessed by the process at the time, but it may be compressed with e.g. [http://www.7-zip.org/ 7-Zip] down to a more reasonable size.
    6669
    67 === Call stack on Linux / OS X ===
    68 On Linux or OS X, you won't have a crash dump, so the best tool for getting your call stack is gdb. Like Windows, you need symbols to be set up properly to make sense of what gdb tells you. If you just see a bunch of hex numbers (addresses of functions) and no names of functions, then you don't have symbols set up correctly. If you're using a release package of the game on Linux, the symbols may have been omitted to reduce the package size, but there may be an optional debug package that can be installed (e.g. on Debian and Ubuntu). Note: debug symbols do NOT require a ''debug build'' of the game. A debug build just disables optimizations to make some debugging easier, both a release build also includes debug symbols.
     70=== Call stack on Linux / macOS ===
     71On Linux or macOS, you won't have a crash dump, so the best tool for getting your call stack is gdb. Like Windows, you need symbols to be set up properly to make sense of what gdb tells you. If you just see a bunch of hex numbers (addresses of functions) and no names of functions, then you don't have symbols set up correctly. If you're using a release package of the game on Linux, the symbols may have been omitted to reduce the package size, but there may be an optional debug package that can be installed (e.g. on Debian and Ubuntu). Note: debug symbols do NOT require a ''debug build'' of the game. A ''debug build'' disables optimizations to make some debugging easier (at the expense of running ''very'' slowly), but '''a release build can optionally include debug symbols as well'''.
    6972
    70 Once you have debug symbols, if you can reproduce the crash, run the game in gdb (e.g. `gdb ./pyrogenesis`, or use [http://lldb.llvm.org/lldb-gdb.html lldb] on recent OS X) and make it crash. Then you will return to the gdb command line. The most helpful command is `bt` to get the backtrace (call stack) at the moment of the crash. Often it's even more helpful to use `t a a bt full` to get the full backtrace of all running threads in the game. `set height 0` is useful if gdb keeps prompting you to continue, when viewing a long backtrace.
     73Once you have debug symbols, if you can reproduce the crash, run the game in gdb (e.g. `gdb ./pyrogenesis`, or use [https://lldb.llvm.org/use/map.html lldb] on recent macOS) and make it crash. Then you will return to the gdb command line. The most helpful command is `bt` to get the backtrace (call stack) at the moment of the crash. Often it's even more helpful to use `t a a bt full` to get the full backtrace of all running threads in the game. `set height 0` is useful if gdb keeps prompting you to continue, when viewing a long backtrace.
    7174
    7275gdb is quite powerful and has more features than can be reasonably explained here, so check the [http://sourceware.org/gdb/current/onlinedocs/gdb/ manual] or search for tutorials. One thing you might find useful is selecting the current stack frame with `frame n`, e.g. `frame 0` is the top of the stack. Then you can use `info locals` to view local variables in that stack frame. Note that in a release build, many of these will be optimized out, or the structure may be too complex for gdb to understand.
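When a user has to repeat the gdb steps above, it can help to hand them a single command line instead of an interactive session. The following is a minimal sketch, not part of the game's tooling: the helper name `build_gdb_backtrace_command` is hypothetical, and it only assembles standard gdb options (`-batch`, `-ex`, `--args`) around the commands already mentioned in this section.

```python
import shlex


def build_gdb_backtrace_command(binary, game_args=()):
    """Return an argv list that runs `binary` under gdb and prints all backtraces.

    Hypothetical helper for illustration; it only builds the command line.
    """
    return [
        "gdb",
        "-batch",                           # quit gdb once the commands below have run
        "-ex", "set height 0",              # do not pause long output for a prompt
        "-ex", "run",                       # start the program and wait for it to stop
        "-ex", "thread apply all bt full",  # long form of `t a a bt full`
        "--args", binary, *game_args,       # everything after the binary goes to the game
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line that can be copied into a terminal.
    print(shlex.join(build_gdb_backtrace_command("./pyrogenesis")))
```

The printed command can be pasted into a terminal, with its output redirected to a file and attached to the bug report. It assumes the crash is reproducible without further interaction inside gdb.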
     
    7679
    7780 * Debug symbols can contain a lot of data (10+ MB each is not uncommon), and most users aren't interested in debugging software, so often the symbols are omitted from release packages. This is very common with Linux packages.
    78  * For Windows builds, WFG manages the distribution of the game via SVN and alpha releases. Some but not all debug symbols are distributed as PDB files, which Visual Studio and !WinDbg can read. These are generated and committed by the autobuild process.
    79    * Symbols for Windows libraries are distributed by Microsoft and can be acquired from their [http://support.microsoft.com/kb/311503 public symbol server].
     81 * For Windows builds, WFG manages the distribution of the game via SVN and official releases (see #290, #1684). Some but not all debug symbols are distributed as PDB files, which Visual Studio and !WinDbg can read. These are generated and committed by the autobuild process.
     82   * Symbols for Windows libraries are distributed by Microsoft and can be acquired from their [https://docs.microsoft.com/en-us/windows/win32/dxtecharts/debugging-with-symbols#using-the-microsoft-symbol-server public symbol server].
    8083   * Symbols for proprietary drivers (e.g. graphics drivers) are typically ''not'' publicly distributed.
    81    * If symbols are missing: this is generally the case with the game's libraries (SpiderMonkey, FCollada, NVTT, etc.) if they were built by WFG. In this case, you may have no choice but to rebuild the library in question and the game, then try to reproduce the crash once debug symbols are obtained. (In the future, we should distribute the PDBs for all open source libraries, to aid debugging.)
     84   * If symbols are missing: this is generally the case with the game's libraries (SpiderMonkey, FCollada, NVTT, etc.) if they were built by WFG. In this case, you may have no choice but to rebuild the library in question and the game, then try to reproduce the crash once debug symbols are obtained. See #1684.
    8285 * For Linux builds, the package maintainers handle the distribution of the game. It is up to them to choose how or whether they will distribute the debug symbols.
    8386   * If symbols are missing: first check if there is a "debug" package of the module available. If not, the same advice applies as for Windows: try to build the library with debug symbols and reproduce the crash.
     
    8992Out of sync (OOS) and serialization errors are generally difficult to debug, but knowing where to look can make this process simpler. An OOS error occurs in a multiplayer game when one player's serialized simulation state isn't identical to another player's serialized simulation state (breaking the concept of network synchronization). The following data are useful to collect in this case:
    9093
    91  * `oos_dump.txt` - a human readable snapshot of the simulation state at the point of OOS, created on each player's computer. Found in the logs folder, see GameDataPaths. Each player should zip these files and send them to the person troubleshooting the bug.
    92  * Each player's game version - these have to match, while the game is in alpha phase the simulation changes constantly and there is no backward compatibility. For releases, it means using the same alpha release, for SVN users, it means using the same SVN revision (with few exceptions).
     94 * `oos_dump.txt` - a human readable snapshot of the simulation state at the point of OOS detection, created on each player's computer. Found in the logs folder, see GameDataPaths. Each player should zip these files and send them to the person troubleshooting the bug.
     95 * Each player's game version - these have to match. While the game is in alpha phase, the simulation components change constantly and there is no backward compatibility. For releases, this means using the same alpha release; for SVN users, it means using the same SVN revision (with few exceptions).
    9396 * OS and hardware info for each player (`system_info.txt`) - Some serialization bugs are platform specific, so knowing the systems involved is key to reproducing the error.
    9497 * `commands.txt` - this is the commands issued by each player during the game, which can be used to replay the game exactly as it happened.
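Collecting the files listed above can be scripted so that each player sends a single archive. This is only a sketch under stated assumptions: the helper name `bundle_oos_report` is hypothetical, and it assumes all three files sit in one directory that the caller passes in, which the page only states for `oos_dump.txt` (see GameDataPaths for the real locations).

```python
import zipfile
from pathlib import Path

# File names taken from the list above; their shared location is an assumption.
REPORT_FILES = ("oos_dump.txt", "system_info.txt", "commands.txt")


def bundle_oos_report(logs_dir, archive_path):
    """Zip whichever report files exist in logs_dir and return the names added."""
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in REPORT_FILES:
            candidate = Path(logs_dir) / name
            if candidate.is_file():
                # Store the file under its bare name, without the directory path.
                archive.write(candidate, arcname=name)
                added.append(name)
    return added
```

The returned list shows which files were actually found, so a missing `commands.txt` is noticed before the archive is sent.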
    9598
    96 The easiest place to begin is doing a simple text diff of the `oos_dump.txt` files to see where they differ. In Windows, you can use the diff tool built into `TortoiseSVN`, `TortoiseGit`, or some other tool, on *nix you simply use `'diff'`. Your diff tool may not like the binary data spit out by the CCmpAIManager component, so you can remove that by editing manually in a text editor. Note that in multiplayer games, a hash of the full simulation state only occurs every 20 turns for performance reasons, so it's possible that the states began diverging earlier!
     99The easiest place to begin is doing a simple text diff of the `oos_dump.txt` files to see where they differ. In Windows, you can use the diff tool built into `TortoiseSVN`, `TortoiseGit`, or some other tool, on *nix you simply use `diff`. Your diff tool may not like the binary data spit out by the CCmpAIManager component, so you can remove that by editing manually in a text editor. Note that in multiplayer games, a hash of the full simulation state only occurs every 20 turns for performance reasons, so it's possible that the states began diverging earlier!
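If no graphical diff tool is at hand, the same comparison can be done with a few lines of Python using the standard `difflib` module. This is a minimal sketch: the function name `diff_oos_dumps` is hypothetical, and it does not strip the CCmpAIManager data; it only decodes with `errors="replace"` so that undecodable bytes do not abort the read.

```python
import difflib
import sys
from pathlib import Path


def diff_oos_dumps(path_a, path_b, context=3):
    """Return unified-diff lines describing how two OOS dump files differ."""
    def read_lines(path):
        # errors="replace" substitutes undecodable bytes instead of raising.
        return Path(path).read_text(encoding="utf-8", errors="replace").splitlines()

    return list(difflib.unified_diff(
        read_lines(path_a), read_lines(path_b),
        fromfile=str(path_a), tofile=str(path_b),
        n=context, lineterm="",
    ))


if __name__ == "__main__":
    # Usage: python diff_oos.py host/oos_dump.txt client/oos_dump.txt
    for line in diff_oos_dumps(sys.argv[1], sys.argv[2]):
        print(line)
```

An empty result means the two dumps are textually identical. Lines starting with `-` or `+` show what differs between the two dumps.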
    97100
    98101In the best case, you will see a small diff of changes comparing two or more of the dump files. This can point you to the component and property that differ, then by analyzing the code that writes to that property, you can see if it does anything unsafe, or (in the case of a C++ component) if it's not serialized correctly.
     
    102105Even OOS dumps with many differences can be useful. Differences between local entities (previews, waypoint flags) can be disregarded, as they don't affect the serialized game state. If you notice one component is especially affected or a component of a recently added feature, then it's often possible (and desirable) to construct a much simpler test case. An OOS in an hour long multiplayer game is complicated, but an OOS or serialization failure on a blank map with a handful of entities is much simpler.
    103106
    104 On linux you can safely start two instances of 0 A.D., host with one and rejoin with the other. This way you can test whether a certain behavior triggers an out-of-sync error.
     107On Linux you can safely start two instances of 0 A.D., host with one and rejoin with the other. This way you can test whether a certain behavior triggers an out-of-sync error.
    105108If you do that, you can use [attachment:t3339_command_line_option_ooslog_unique_v1.2.patch:ticket:3339 this patch] to prevent the oos_dump.txt from being overwritten, allowing you to compare the newly produced oos-dumps.
     109
     110Otherwise, you can use virtual machines in e.g. !VirtualBox to test multiplayer games on a single machine. This is especially useful for testing cross-platform issues, e.g. a Windows host and a Linux or macOS client.
    106111
    107112=== OOS caused by the AI ===