Boost C++ Libraries Home Libraries People FAQ More

PrevUpHomeNext

Practical considerations

Performance
About compiling
Portability

Performance

In theory, all overhead of using STL algorithms and lambda functors compared to hand written loops can be optimized away, just as the overhead from standard STL function objects and binders can. Depending on the compiler, this can also be true in practice. We ran two tests with the GCC 3.0.4 compiler on 1.5 GHz Intel Pentium 4. The optimization flag -03 was used.

In the first test we compared lambda functors against explicitly written function objects. We used both of these styles to define unary functions which multiply the argument repeatedly by itself. We started with the identity function, going up to x5. The expressions were called inside a std::transform loop, reading the argument from one std::vector<int> and placing the result into another. The length of the vectors was 100 elements. The running times are listed in Table 18.3, “Test 1”. We can observe that there is no significant difference between the two approaches.

In the second test we again used std::transform to perform an operation to each element in a 100-element long vector. This time the element type of the vectors was double and we started with very simple arithmetic expressions and moved to more complex ones. The running times are listed in Table 18.4, “Test 2”. Here, we also included classic STL style unnamed functions into tests. We do not show these expressions, as they get rather complex. For example, the last expression in Table 18.4, “Test 2” written with classic STL tools contains 7 calls to compose2, 8 calls to bind1st and altogether 14 constructor invocations for creating multiplies, minus and plus objects. In this test the BLL expressions are a little slower (roughly 10% on average, less than 14% in all cases) than the corresponding hand-written function objects. The performance hit is a bit greater with classic STL expressions, up to 27% for the simplest expressios.

The tests suggest that the BLL does not introduce a loss of performance compared to STL function objects. With a reasonable optimizing compiler, one should expect the performance characteristics be comparable to using classic STL. Moreover, with simple expressions the performance can be expected to be close to that of explicitly written function objects. Note however, that evaluating a lambda functor consist of a sequence of calls to small functions that are declared inline. If the compiler fails to actually expand these functions inline, the performance can suffer. The running time can more than double if this happens. Although the above tests do not include such an expression, we have experienced this for some seemingly simple expressions.

Table 18.3. Test 1

CPU time of expressions with integer multiplication written as a lambda expression and as a traditional hand-coded function object class. The running times are expressed in arbitrary units.
expression lambda expression hand-coded function object
x 240 230
x*x 340 350
x*x*x 770 760
x*x*x*x 1180 1210
x*x*x*x*x 1950 1910


Table 18.4. Test 2

CPU time of arithmetic expressions written as lambda expressions, as classic STL unnamed functions (using compose2, bind1st etc.) and as traditional hand-coded function object classes. Using BLL terminology, a and b are bound arguments in the expressions, and x is open. All variables were of types double. The running times are expressed in arbitrary units.
expression lambda expression classic STL expression hand-coded function object
ax 330 370 290
-ax 350 370 310
ax-(a+x) 470 500 420
(ax-(a+x))(a+x) 620 670 600
((ax) - (a+x))(bx - (b+x))(ax - (b+x))(bx - (a+x)) 1660 1660 1460


Some additional performance testing with an earlier version of the library is described [Jär00].

About compiling

The BLL uses templates rather heavily, performing numerous recursive instantiations of the same templates. This has (at least) three implications:

  • While it is possible to write incredibly complex lambda expressions, it probably isn't a good idea. Compiling such expressions may end up requiring a lot of memory at compile time, and being slow to compile.

  • The types of lambda functors that result from even the simplest lambda expressions are cryptic. Usually the programmer doesn't need to deal with the lambda functor types at all, but in the case of an error in a lambda expression, the compiler usually outputs the types of the lambda functors involved. This can make the error messages very long and difficult to interpret, particularly if the compiler outputs the whole chain of template instantiations.

  • The C++ Standard suggests a template nesting level of 17 to help detect infinite recursion. Complex lambda templates can easily exceed this limit. Most compilers allow a greater number of nested templates, but commonly require the limit explicitly increased with a command line argument.

Portability

The BLL works with the following compilers, that is, the compilers are capable of compiling the test cases that are included with the BLL:

  • GCC 3.0.4
  • KCC 4.0f with EDG 2.43.1
  • GCC 2.96 (fails with one test case, the exception_test.cpp results in an internal compiler error. )

Test coverage

The following list describes the test files included and the features that each file covers:

  • bind_tests_simple.cpp : Bind expressions of different arities and types of target functions: function pointers, function objects and member functions. Function composition with bind expressions.

  • bind_tests_simple_function_references.cpp : Repeats all tests from bind_tests_simple.cpp where the target function is a function pointer, but uses function references instead.

  • bind_tests_advanced.cpp : Contains tests for nested bind expressions, unlambda, protect, const_parameters and break_const. Tests passing lambda functors as actual arguments to other lambda functors, currying, and using the sig template to specify the return type of a function object.

  • operator_tests_simple.cpp : Tests using all operators that are overloaded for lambda expressions, that is, unary and binary arithmetic, bitwise, comparison, logical, increment and decrement, compound, assignment, subscrict, address of, dereference, and comma operators. The streaming nature of shift operators is tested, as well as pointer arithmetic with plus and minus operators.

  • member_pointer_test.cpp : The pointer to member operator is complex enough to warrant a separate test file.

  • control_structures.cpp : Tests for the looping and if constructs.

  • switch_construct.cpp : Includes tests for all supported arities of the switch statement, both with and without the default case.

  • exception_test.cpp : Includes tests for throwing exceptions and for try/catch constructs with varying number of catch blocks.

  • constructor_tests.cpp : Contains tests for constructor, destructor, new_ptr, delete_ptr, new_array and delete_array.

  • cast_test.cpp : Tests for the four cast expressions, as well as typeid and sizeof.

  • extending_return_type_traits.cpp : Tests extending the return type deduction system for user defined types. Contains several user defined operators and the corresponding specializations for the return type deduction templates.

  • is_instance_of_test.cpp : Includes tests for an internally used traits template, which can detect whether a given type is an instance of a certain template or not.

  • bll_and_function.cpp : Contains tests for using boost::function together with lambda functors.


PrevUpHomeNext