
Branching and Method Calls

Compilers are good enough that sequential sections of code are optimized very well, so branches account for the bulk of the execution time and code space. Branches are the logic of the program and usually cannot be eliminated. The optimizer can still be helped: avoid conditional code where possible, unroll short loops, and in certain cases do not branch around useless code, since executing it may cost less than the branch that skips it.
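The two suggestions above, unrolling a short loop and not branching around harmless work, can be sketched as follows. All function names here are hypothetical and chosen only for illustration. Whether either form is actually faster depends on the compiler and CPU, so treat this as a sketch to measure rather than a rule.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; none of these names come from the text.

// Loop version: one compare-and-branch per element.
int sum4_loop(const int* v)
{
    int total = 0;
    for (std::size_t i = 0; i < 4; ++i)
        total += v[i];
    return total;
}

// Hand-unrolled version: straight-line code with no loop branch.
int sum4_unrolled(const int* v)
{
    return v[0] + v[1] + v[2] + v[3];
}

// Branches around an addition that would have been harmless anyway.
int add_if_nonzero(int total, int x)
{
    if (x != 0)
        total += x;
    return total;
}

// Does the "useless" work unconditionally; adding zero changes nothing.
int add_always(int total, int x)
{
    return total + x;
}
```

Both pairs compute the same results; the second member of each pair simply gives the optimizer a longer branch-free sequence to work with.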

Object oriented designs, and C++ programs in particular, tend to introduce many smaller functions that perform near-trivial operations. For example:

class Foo {
      [...]
      int value()
          { return value_; }
      unsigned size() const
          { return 16; }
      [...]
};

The call to value() can be reduced to a single "mov" or "ld" instruction on most types of CPUs, and the size() method can be optimized away completely. If those methods were not defined inline, however, the compiler would be forced to generate a call and a return, would need to invalidate registers and/or shift register windows, and would otherwise multiply the complexity.

Inlining eliminates the call instruction and expands the basic block to surround the method invocation. Subexpression elimination and register allocation can then be applied more globally, so the code around the method call shrinks along with the call itself.
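As a rough illustration of the difference, the sketch below puts an accessor defined inside the class body next to one that is only declared there. The class name FooSketch, its constructor, and the helper total() are assumptions made for this example and are not taken from the original class. In a real project the out-of-line definition would normally live in a separate source file, where the compiler cannot see it at the call site.

```cpp
// Hypothetical class for illustration only.
class FooSketch {
public:
    explicit FooSketch(int v) : value_(v) {}

    // Defined in the class body, so implicitly inline: a caller can
    // typically reduce this to a single load.
    int value() const { return value_; }

    // Constant result: a caller can fold this away entirely.
    unsigned size() const { return 16; }

    // Only declared here; the definition is outside the class.
    int value_out_of_line() const;

private:
    int value_;
};

// Out-of-line definition. If this lived in another translation unit,
// callers would generally need a real call and return.
int FooSketch::value_out_of_line() const
{
    return value_;
}

// With the inline accessors, the whole expression is visible to the
// optimizer as one basic block.
int total(const FooSketch& a, const FooSketch& b)
{
    return a.value() + b.value() + static_cast<int>(a.size());
}
```

Comparing the generated assembly for total() with a variant that calls value_out_of_line() is one way to see the effect on a particular compiler.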

C programs can also benefit from this technique. Linux source code, for example, is filled with tiny inlined functions of this sort. They are easier to read than similar macros, and they express intent to the compiler more clearly.
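One way to see the readability point is to compare a macro with an equivalent inline function. The sketch below sticks to constructs that are valid in both C and C++. The names SQUARE_MACRO, square_inline, next_value, and calls are hypothetical and are not taken from the Linux source.

```cpp
// Macro version: the argument text is pasted in twice, so an argument
// with side effects is evaluated twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Inline function version: the argument is evaluated once and its
// type is checked by the compiler.
static inline int square_inline(int x)
{
    return x * x;
}

// Hypothetical helper with a visible side effect, used only to count
// how many times an argument expression is evaluated.
static int calls = 0;

static inline int next_value(void)
{
    ++calls;
    return 3;
}
```

Passing next_value() to square_inline evaluates it once, while SQUARE_MACRO(next_value()) evaluates it twice. The inline function avoids that surprise and can still compile to the same code as the macro.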



Stephen Williams
Sun May 4 15:28:26 PDT 1997