• Write code using simple, free optimisations as you go.
  • Measure and time every code change (use profilers).
  • Reuse computations and call results as much as possible.
  • Double-check whether any test inside a loop can be hoisted outside it.
  • Use empty() rather than size() when checking whether a container is empty.
  • Make sure no GPU operation you add forces a sync of the OpenGL driver (i.e. prefer VBOs over vertex arrays).
  • Remember that inline is only a compiler hint, as is loop unrolling, so in a performance hotspot do it yourself.
  • Memory storage is cheap, CPU cycles less so, and memory access least of all. So prefer memory duplication over extra computation or memory access. For example, don't use a std::map with an int key where a sparse vector indexed by that int would do: the vector is much faster, and there should be no iteration over a std::map at all.
  • Prefer iterating with std::iterators over checking against size() in the loop condition (size() is recomputed on each call, unless the container is const).

Memory Performance

The game slowing down over time is due to the current memory model (nearly everything is heap allocated), which implies a lot of new/malloc/delete/free calls during runtime; these even show up in the profiler.

The problem is called memory fragmentation. It degrades performance over time, not only through slower memory access but also through slower malloc/new (the time to find a free memory block of the right size grows), and it can lead to crashes (allocation failures). So, as a rule:

  • Do whatever possible to avoid any allocation/free during the frame loop.

The first and simplest step is to avoid all "hidden" temporary object allocations by following a few simple rules, which are explained here and recapped below:

  • Pass Class Parameters by Reference
  • Prefer Initialization over Assignment
  • Use Constructor Initialization Lists
  • Use Prefix operators for objects
  • Use Compound Assignment Operators (+=, -=, ...) Instead of the Plain Operators Alone
  • Use Explicit Constructors
  • Prefer passing an output parameter by reference over returning an object by value (in particular, don't return containers built inside the function).
    (You could verify that all your compilers trigger some form of return value optimisation (RVO, NRVO, URVO, copy elision) by checking the generated assembly, but that doesn't change the fact that debug builds will be slower because they skip the optimisation, and it imposes the burden of re-checking whenever a new or old compiler/transpiler is added to the supported set.)
std::vector<Edge> fillEdges()           // avoid: returns a container by value
{
	std::vector<Edge> edges;
	// fill-in code for edges
	return edges;
}

void getPathEdge(std::vector<Edge> &edges)  // prefer: fill the caller's container
{
	// fill-in code for edges
}
Last modified on May 12, 2013 7:43:25 AM