Effective C++
Mark Veltzer
mailto:mark@veltzer.net
Micro optimizations
Micro optimizations
● This presentation is about using the C++ language
effectively.
● This means that we will not be talk...
What we will not be discussing
● There are two very important optimizations that could
be considered more important than m...
Reasons against micro
optimizations
● Some people consider micro optimizations an evil.
● They claim that since bottle nec...
Reasons against micro
optimizations (cont)
● Even if your bottleneck is within your code you are better
off putting effor...
Reasons against micro
optimizations – less agility
● Some claim that micro optimizations make your
software less agile.
● ...
Reasons against micro
optimizations - micro optimizations
as premature optimization
● “We should forget about small effici...
Reasons for micro optimization
● The main point: You need to write your code anyway
– so you may as well use the right fea...
Reasons for micro optimization
● Sometimes you are willing to live with the downsides
of micro optimization (both technica...
Reasons for micro optimization
(cont)
● Algorithmic and Architectural changes are just
too difficult.
● Architectural chan...
Micro optimizations
● If you are coding in C++ then you, albeit maybe
unconsciously, agree that too much low level
optimiz...
Micro optimizations and the
compiler
● The compiler does lots of optimizations which
could be classified as micro optimiza...
Micro optimizations and the
compiler
● This means that the micro optimizations we are
talking about here are optimizations...
Micro optimizations - conclusions
● Should you use micro optimizations? Yes.
● How many of them? Depends on your product (...
C++ efficiency (comparison to assembly/C)
Measuring efficiency
● There are many ways to measure efficiency.
● The various methods may give somewhat
different result...
Measuring efficiency
● Measurement methods include:
● Measuring number of instructions in assembly
code.
● Measuring numbe...
Measuring CPU cycles
● Is one way to measure efficiency.
● In RISC chips this is more accurate since every
instruction is ...
What does measuring CPU cycles
mean – assembly programmers?
● That even an assembly programmer writing
down instruction ex...
C complications – function call
● A function call in C may involve many cycles as
it needs to push the arguments, then jum...
C complications – function call
(cont)
● On the other hand if the inner function is inside
a header file and the compiler ...
C complications – switch statement
● A single switch statement in C can be realized
by a compiler in several ways:
● A ser...
C complications - registers
● In C the programmer may advise the compiler about
using a register for a variable using the ...
C complications – cache lines
● When programmers start programming in C
they are told nothing about cache lines, cache
mis...
What does all of this mean? - c
programmers
● That even when programming in C the
programmer only has vague idea of how hi...
What about C++, and higher level?
● In C++ things get even nastier.
● Various basic building blocks like constructor,
dest...
Criticisms of C++ performance (low
level)
● Low level people criticize C++ for being too
slow.
● As a result they say they...
Criticisms of C++ performance (high
level)
● High level people criticize C++ for being too
slow.
● They claim that C++ for...
Is C++ performance bad?
● No.
● It really depends on the programmer.
● And on the programmer's knowledge of what
C++ does b...
Is C++ performance bad? (cont)
● Since C++ is a more complicated language it can
be abused by programmers who do not know ...
Is C++ performance bad? (cont)
● So C++ sometimes performs slower and sometimes
faster than equivalent systems implemented...
So what should we do?
● Remember the example of the C function call
and the C switch statement?
● C++ has many more of the...
Comparison of effectiveness issues
in C++ vs other languages
● Other languages, like Java, also have effectiveness issues....
C++ and over design
● Over design is a situation where a programmer
creates too much infrastructure in a piece of
software...
C++ and over design (cont)
● Over design rests on the assumption that you
must make your software flexible for future
requ...
C++ and over design (cont)
● Mature and young products are different in
respect to over design and young products tend
to ...
C++ and over design (cont)
● All you need to do to solve the problem is
simply avoid over design.
● This is very close to ...
C++ and over design (cont)
● Instead agile development puts an emphasis on
the development process:
● Keep code simple.
● ...
Is efficiency important?
Reasons for efficiency
● Even if hardware advances competition
between software vendors remain. Since they
use the same ha...
Reasons for efficiency (cont)
● CPU speed improvement has slowed down and
given way to multi-core/multi-CPU designs.
● Thi...
Reasons for efficiency (cont)
● Last but not least is the act of writing software.
● A simple error in programming could c...
What type of efficiency?
What type of efficiency do we
mean?
● Binary size: we don't care about that much
these days (data much larger than code).
...
What type of efficiency do we
mean?
● Code efficiency is usually split into:
● Latency/response time/responsiveness
● Thro...
The fast path
Fast path: definition
● Most software has a lot of code which is only used, if ever,
one time or rarely.
● This includes s...
Fast path optimizations
● Micro optimizations, and any other
optimizations for that matter, only make sense if
you actuall...
Some basics first
How is a class implemented?
● A class is just like a struct with compile time
checking of access (private, public, protect...
How is a class implemented? (cont)
● The minimum size of a class is 1 byte.
● This is to make sure that memory really is
a...
How is inheritance implemented?
● Inherited classes are just like structs laid out
one after the other.
● So if B is deri...
Inheriting from empty classes
● An empty class takes 1 byte.
● But if you define a class derived from it and
adding an int...
How is multi-inheritance
implemented?
● If C is derived from A and B then C has its own
fields, A's fields and B's fields...
Virtual methods
● If a class has at least one virtual method then
the size of the object grows by exactly one
pointer.
● T...
Virtual methods (cont)
● The compiler vendors could have held pointers
of more vtables inside a vtable but most of
them ch...
Virtual methods (cont)
● You can think of the vtable pointer as the type
of the instance the difference being that classes...
Virtual methods (cont)
● This means that adding the first virtual method to a class
adds a pointer to the size of the inst...
RTTI
● The ability of the system to report on the
dynamic type of an object and to provide
information about that type at ...
RTTI - typeid
● typeid – An operator that can be given a type or an expression and
returns a std::type_info which describes the type.
● You can...
RTTI – typeid (cont)
● In order to work with typeid you must include
<typeinfo>.
● Typeid sometimes works at compile time ...
RTTI - dynamic_cast
● Allows you to try and cast an object to some
type at runtime.
● Because it is done at runtime it may...
RTTI – should you use it?
● Good design usually means not needing to use RTTI
which is also good from a performance standp...
Virtual destructor
● Is required if you:
● Want to destroy the class using delete.
● At the delete point the pointer may p...
In-lining
In-lining as C++ efficiency
● C++ efficiency is based on compile time optimization.
● But the compiler cannot optimize wha...
Problems of in-lining
● The problem of in-lining is that you need to let the
compiler see more of your code.
● This means ...
Advantages of in-lining
● Avoiding yet another function call. Function call is not
cheap since it uses the stack RAM.
● Av...
Advantages of in-lining
● Allows the compiler to reorder instructions
between the outer and inner functions and so
utilize...
Advantages of in-lining
● Allows the compiler to know whether pointers
passed to the inner function are really changed
by ...
In-lining - conclusions
● In-lining provides many, many advantages to C++
programmers.
● Obviously you have to compile wit...
Efficiency topics
Fast construction and destruction
● Consider the need to hold a resource which
must be released (mutex, file,...)
● A comm...
Fast construction and destruction
● Do not allocate on the heap – do it on the stack.
● If you do use the stack then there...
Fast construction and destruction
● Avoid virtual functions in the object in order to
avoid the extra virtual pointer assi...
Fast construction and destruction
● Remember that constructors construct all
members if not explicitly initialized using
m...
Lazy construction
● If some fast paths do not require everything that
you do inside the constructor then avoid calling
tha...
Redundant construction
● When ever an object is composed of other
objects and the constructor is triggered it
creates the ...
Avoid creation of temporary
variables
● If you have an assignment of two types and an
exact assignment operator is missing...
Virtual functions cost
● The assignment of the vptr.
● The extra dereference in method calls.
● The inability of the compi...
Virtual functions cost – when should
you care?
● The inability to in-line is indeed problematic but
remember that the func...
Virtual functions cost – solution 1
● Avoid the virtual function at all.
● You may not really need virtual on it (are you
...
Virtual functions cost – solution 2
● Pass your class to the using class as a
template parameter.
● Templates are more per...
References
References
● Efficient C++ Performance Programming
Techniques – Dov Bulka, David Mayhew.
● Effective C++ - Third Edition –...

Effective cplusplus

  1. 1. Effective C++ Mark Veltzer mailto:mark@veltzer.net
  2. 2. Micro optimizations
  3. 3. Micro optimizations ● This presentation is about using the C++ language effectively. ● This means that we will not be talking about how long other parts, which are not your code, are taking. ● Always remember that for a lot of systems the real bottleneck is not within your code but in some other place (too much I/O, a heavily loaded database, the network, ...). ● That is why the types of advice in this presentation are sometimes called micro optimizations.
  4. 4. What we will not be discussing ● There are two very important optimizations that could be considered more important than micro optimizations that we will not be discussing: ● Algorithmic optimizations. ● Architectural optimizations. ● These could bring tremendous improvements to your code but have less to do with the technicalities of the C++ language. ● In reality implementing them does require knowledge of the C++ language but not at the micro optimization level.
  5. 5. Reasons against micro optimizations ● Some people consider micro optimizations an evil. ● They claim that since bottlenecks are not in your code you shouldn't pay attention to these types of optimizations since: ● The ease of implementing micro optimizations gives you the feeling that you improved the software even though the bottleneck is really somewhere else; in fact you just complicated your code with no benefit, which means your situation actually got worse. ● They make your code harder to maintain and read. ● They require you to learn something which is not important.
  6. 6. Reasons against micro optimizations (cont) ● Even if your bottleneck is within your code you are better off putting effort into either algorithmic or architectural (including multi-core) changes. ● Although these are harder to implement, the claim is that per time spent they give better returns. ● Micro optimizations distract you from doing architectural and algorithmic changes which are more important. ● Micro optimizations hide the algorithm or architecture in a sea of details which makes understanding and modifying the software difficult.
  7. 7. Reasons against micro optimizations – less agility ● Some claim that micro optimizations make your software less agile. ● Because they make each piece of code more difficult they make the entire software harder to change at the architectural level. ● This means that the software making process will be slower to respond to new requirements. ● This could have business implications as well as technical ones.
  8. 8. Reasons against micro optimizations - micro optimizations as premature optimization ● “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil” – Donald Knuth ● There is a connection between micro optimizations and premature optimization because in a lot of cases micro optimizations are done prematurely. ● So when first coding your system – Keep It Simple Stupid – which means avoiding unnecessary micro optimizations.
  9. 9. Reasons for micro optimization ● The main point: You need to write your code anyway – so you may as well use the right features and get better performance while you are at it. ● If the language has two features and you use the one which is giving you better optimization then you gain more efficiency without complicating your code. ● Sometimes you are CPU bound and micro optimizations could squeeze more performance out of an existing code base.
  10. 10. Reasons for micro optimization ● Sometimes you are willing to live with the downsides of micro optimization (both technical and business). ● Sometimes, just sometimes, your design is good and you think that it may stick around for a while and micro optimizations just make it better. ● In these cases the loss of agility is not as big a problem as it seems. ● This happens in mature products more than in young ones.
  11. 11. Reasons for micro optimization (cont) ● Algorithmic and architectural changes are just too difficult. ● Architectural changes sometimes mean huge rewrites which companies can rarely handle because of organizational issues. ● Algorithmic changes require lots of theoretical knowledge and sometimes low level details (turning a lock-based algorithm into a lock-free one requires understanding memory and machine barriers...).
  12. 12. Micro optimizations ● If you are coding in C++ then you, albeit maybe unconsciously, agree that too much low level optimization is evil. ● Otherwise you would be coding in C or assembly. Or maybe designing a dedicated piece of hardware? ● Some people feel so strongly against micro optimizations that they go for higher level languages like Java/C#/Python/Ruby which prevent these types of optimizations by design.
  13. 13. Micro optimizations and the compiler ● The compiler does lots of optimizations which could be classified as micro optimizations: ● In-lining. ● Loop unrolling. ● Constant folding. ● Dead code elimination. ● Common subexpression elimination. ● Loop-invariant code motion. ● And many more...
  14. 14. Micro optimizations and the compiler ● This means that the micro optimizations we are talking about here are optimizations that the compiler cannot perform. ● In practice, however, it is not always simple to know which optimizations are performed by the compiler. ● Plus compilers tend to become better with time. ● And you may need to support several compilers that have a different feature set.
  15. 15. Micro optimizations - conclusions ● Should you use micro optimizations? Yes. ● How many of them? Depends on your product (is it mature, is its design going to change, etc.). ● What types of optimizations should you prefer? Ones that don't make your code harder to read and understand and ones that are not already performed by the compiler. ● When to avoid them altogether? When your product is young and you are not sure about the design at all. Then they are just a waste of your time.
  16. 16. C++ efficiency (comparison to assembly/C)
  17. 17. Measuring efficiency ● There are many ways to measure efficiency. ● The various methods may give somewhat different results. ● They are also not equally easy to use. ● They are sometimes in conflict with one another. ● For instance: a change may increase worst-case latency while decreasing average response time. ● And all measures depend on the dataset used.
  18. 18. Measuring efficiency ● Measurement methods include: ● Measuring number of instructions in assembly code. ● Measuring number of instructions issued at run time using CPU counters. ● Measuring time elapsed using CPU counters. ● Measuring time elapsed using hardware clock. ● Measuring time elapsed using OS clock. ● Full profiling in software or hardware.
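As a rough illustration of the "time elapsed using OS clock" method, here is a minimal sketch built on `std::chrono::steady_clock`. The names `sum_below`, `elapsed_ns` and `time_sum_below` are hypothetical and exist only for this example; a real measurement would repeat the run many times and look at the distribution rather than a single sample.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical workload used only for illustration: sums 0..n-1.
inline std::uint64_t sum_below(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Hypothetical helper: runs `work` once and returns the elapsed time in
// nanoseconds according to std::chrono::steady_clock (a monotonic clock).
template <typename Work>
std::int64_t elapsed_ns(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Example use: time the workload and sanity-check the measurement.
inline std::int64_t time_sum_below(std::uint64_t n) {
    volatile std::uint64_t sink = 0;  // discourages the optimizer from deleting the work
    const std::int64_t ns = elapsed_ns([&] { sink = sum_below(n); });
    assert(ns >= 0);  // a monotonic clock never goes backwards
    return ns;
}
```

The `volatile` sink is one simple way to keep an optimizing compiler from removing the measured work; it does not stop the compiler from simplifying the loop itself, which is part of why such numbers need careful interpretation.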
  19. 19. Measuring CPU cycles ● Is one way to measure efficiency. ● On classic RISC chips this is more accurate since most instructions are designed to execute in one cycle. ● On CISC chips things are more complicated: one instruction may be over in one clock while others may last many clocks. ● And instructions are sometimes stalled waiting for external factors: main memory, atomic operations on other CPUs, etc.
  20. 20. What does measuring CPU cycles mean – assembly programmers? ● That even an assembly programmer writing down instructions explicitly cannot know how long his/her code will run. ● This is contrary to the common opinion that “assembly programmers know exactly how long their code runs”. ● It is true that assembly programmers have a firmer grasp on how long their code runs.
  21. 21. C complications – function call ● A function call in C may involve many cycles: with a stack-based calling convention the caller pushes the arguments and jumps to the address, the inner function gets the arguments from the stack, does the calculation and puts the result onto the stack, and the outer function then gets the result from the stack. (Many ABIs pass arguments and results in registers, which is cheaper but not free.) ● There are quite a few machine instructions involved in all of these.
  22. 22. C complications – function call (cont) ● On the other hand, if the inner function is defined in a header file (so the compiler can see its body) and the compiler is an optimizing one, it could eliminate the function call altogether. ● And if the arguments are constant the compiler may eliminate the calculation too and just hard code the result. ● So, a function call may range anywhere between one instruction and hundreds or more even for a simple function.
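The folding described here can be sketched in C++ as follows. `scale` and `scaled_constant` are hypothetical names; `constexpr` is used only so that the claim "the result can be hard coded" is checkable at compile time, whereas an optimizer may do the same for an ordinary inline function without being asked.

```cpp
// Imagine this definition lives in a header, so every caller sees the body.
constexpr int scale(int value, int factor) {
    return value * factor;
}

// With constant arguments the call can be folded into the literal 42.
constexpr int scaled_constant() {
    return scale(6, 7);
}

// static_assert shows the whole call chain is computable at compile time.
static_assert(scaled_constant() == 42, "call folded at compile time");
```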
  23. 23. C complications – switch statement ● A single switch statement in C can be realized by a compiler in several ways: ● A series of compare-and-branch statements. ● An array of jump locations. ● A perfect hash of jump locations. ● Actually good compilers use all of these combined when optimizing. ● So how long does a “switch” statement take? I don't know!
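A small sketch of the kind of switch in question, assuming C++14 or later (for a `switch` inside a `constexpr` function). `Op` and `apply` are hypothetical names. Nothing in the source tells you which strategy the compiler picks; only the generated assembly does.

```cpp
enum class Op { Add, Sub, Mul, Div };

// Dense, consecutive case values: a compiler is free to emit a jump table,
// a chain of compare-and-branch instructions, or a mix of both.
constexpr int apply(Op op, int lhs, int rhs) {
    switch (op) {
        case Op::Add: return lhs + rhs;
        case Op::Sub: return lhs - rhs;
        case Op::Mul: return lhs * rhs;
        case Op::Div: return rhs != 0 ? lhs / rhs : 0;  // illustrative choice: 0 on divide-by-zero
    }
    return 0;  // not reached for valid Op values
}
```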
  24. 24. C complications - registers ● In C the programmer may advise the compiler about using a register for a variable using the register keyword. ● The programmer, however, does not know if the compiler actually manages to store the variable in a register and does not get feedback about it (aside from disassembly, of course). ● The difference in performance is great, as access to a register and access to memory (even cache memory) differ by about an order of magnitude.
  25. 25. C complications – cache lines ● When programmers start programming in C they are told nothing about cache lines, cache misses etc. ● After a while they are told that structures accessed heavily should be small enough to fit in a cache line so as to hit the CPU cache and not main memory. ● And after a while more they are told about how synchronization between CPUs works.
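One way to express the "fit in a cache line" advice in code is sketched below. The 64-byte line size is an assumption (common on current x86-64 hardware but not universal), and `kAssumedCacheLine` and `HotCounters` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Check your target before relying on this.
constexpr std::size_t kAssumedCacheLine = 64;

// Hot, frequently accessed fields are kept together and aligned so that the
// whole struct occupies exactly one (assumed) cache line.
struct alignas(kAssumedCacheLine) HotCounters {
    std::uint32_t hits;
    std::uint32_t misses;
};

static_assert(alignof(HotCounters) == kAssumedCacheLine, "aligned to the assumed line size");
static_assert(sizeof(HotCounters) == kAssumedCacheLine, "padded to exactly one line");
```

Padding to a full line also means two `HotCounters` objects in an array never share a line, which matters once several CPUs update them concurrently.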
  26. 26. What does all of this mean? - C programmers ● That even when programming in C the programmer only has a vague idea of how his/her program is translated into machine code. ● The programmer has to try and wrap his/her head around various low level concepts (registers, switches, caches) in order to understand the speed of his/her program. ● So analyzing speed is not simple, even in C.
  27. 27. What about C++, and higher level? ● In C++ things get even nastier. ● Various basic building blocks like constructor, destructor, virtual method call, exception, copy constructor are even harder to estimate. ● So the feel of the programmer for the runtime gets more vague. ● And in Java the programmer doesn't even know about machine instructions...:)
  28. 28. Criticisms of C++ performance (low level) ● Low level people criticize C++ for being too slow. ● As a result they say they'd use assembly or C over C++. ● The claim is that C++ does too much behind the programmer's back. ● This prevents the programmer from getting a good grasp of what is happening performance wise.
  29. 29. Criticisms of C++ performance (high level) ● High level people criticize C++ for being too slow. ● They claim that C++ forces the programmer to focus too much on low level details. ● This makes the design suffer and be suboptimal. ● Which in turn leads to bad performance.
  30. 30. Is C++ performance bad? ● No. ● It really depends on the programmer. ● And on the programmer's knowledge of what C++ does behind the scenes. ● If the programmer knows when and what C++ does behind his/her back then he/she knows how to handle the situation. ● This is exactly what this presentation is for.
  31. 31. Is C++ performance bad? (cont) ● Since C++ is a more complicated language, programmers who do not know its features can abuse it in more creative ways and create systems that perform more poorly than systems implemented in lower level languages (C, assembly). ● But C++ is usually thrown at big problems where development in C is more difficult and which call for object oriented approaches. In these cases C++ abuse is less of an issue since C abuse in large systems is an issue too.
  32. 32. Is C++ performance bad? (cont) ● So C++ sometimes performs slower and sometimes faster than equivalent systems implemented in lower level languages. ● If the system you are implementing is a large one then using object orientation and C++ has an advantage over C since it is more agile and will allow you to reach a good design with fewer rewrites. ● Since rewriting is problematic in a corporate environment C++ could actually outperform C in such environments.
  33. 33. So what should we do? ● Remember the example of the C function call and the C switch statement? ● C++ has many more of these (virtual functions, exceptions, constructors, destructors, templates and more). ● We should study the language to understand exactly when and what the language does. ● And understand how to use the features effectively.
  34. 34. Comparison of effectiveness issues in C++ vs other languages ● Other languages, like Java, also have effectiveness issues. ● The difference is at the level at which the programmer needs to peek in order to understand the issues involved. ● In Java or other higher level languages effectiveness books, courses and presentations deal with higher level considerations than in C++. ● Issues such as CPU cache lines never appear...:) ● Some see this as a weakness of C++.
  35. 35. C++ and over design ● Over design is a situation where a programmer creates too much infrastructure in a piece of software, much of which is never used. ● Over design is problematic because it makes the code larger and more complicated and thus harder to understand, maintain and develop further. ● Over design also makes your software run slower for no good reason. ● Over design is a notorious problem in object oriented languages and is usually inflicted by beginner programmers.
  36. 36. C++ and over design (cont) ● Over design rests on the assumption that you must make your software flexible for future requirements now, regardless of whether that flexibility will be used in the future. ● Experience shows that programmers make their program flexible where it is convenient for them to do so and not where future requirements will demand that flexibility. ● This means that over design is rarely a good thing.
  37. 37. C++ and over design (cont) ● Mature and young products differ with respect to over design, and young products tend to exhibit this illness more. ● C++ has potential for over design. ● Since over design means lower performance, C++ has potential for lower performance in that respect. ● C has fewer over design issues and assembly even fewer because they are harder to code in.
  38. 38. C++ and over design (cont) ● All you need to do to solve the problem is simply avoid over design. ● This is very close to the concept of agile development. ● Agile development tries to reconcile the two following statements: ● You don't know the requirements of a piece of software in advance. ● You want to avoid over design.
  39. 39. C++ and over design (cont) ● Instead agile development puts an emphasis on the development process: ● Keep code simple. ● Keep the programming environment simple. ● Make it as easy as possible to change and test the code. ● This makes the adjustment to new features easy without over design. ● So one solution to the problem of C++ and over design is agile development.
  40. 40. Is efficiency important?
  41. 41. Reasons for efficiency ● Even as hardware advances, competition between software vendors remains. Since they use the same hardware the winner will be the one with more software efficiency. ● Communication speed has exceeded CPU speed and so the software is the new bottleneck. Due to faster optical and other communication solutions it will stay that way.
  42. 42. Reasons for efficiency (cont) ● CPU speed improvement has slowed down and given way to multi-core/multi-CPU designs. ● This is due to various low level physical problems which are not likely to be lifted in the near future. ● But multi-core/multi-CPU solutions are based on software that is specifically written in order to utilize such systems and so software efficiency is once again the bottleneck.
  43. 43. Reasons for efficiency (cont) ● Last but not least is the act of writing software. ● A simple error in programming could cause a piece of software to run 100 times slower than it could. ● Example: NPTL in Linux made thread creation 1000 times faster. ● Software is still key to efficiency.
  44. 44. What type of efficiency?
  45. 45. What type of efficiency do we mean? ● Binary size: we don't care about that much these days (data much larger than code). ● Compile time: Most people don't care about this but maybe you should? ● Data size: yes, but C++ data structures tend to be quite efficient and RAM is cheap these days. ● Code efficiency: This is the one which most issues relate to.
  46. 46. What type of efficiency do we mean? ● Code efficiency is usually split into: ● Latency/response time/responsiveness. ● Throughput. ● These goals are a trade off: improving one tends to worsen the other. ● You have to decide where you stand between these two opposing objectives. ● Being somewhere in the middle is fine as long as you define clearly exactly where you are.
  47. 47. The fast path
  48. 48. Fast path: definition ● Most software has a lot of code which is only used, if ever, one time or rarely. ● This includes setup, tear down, error handling and more. ● More than 90% of code is like that (sometimes a lot more). ● This is the “boring” code or the “slow path”. ● Fast path is the “non boring” code that actually gets used most of the time. ● One function can include both slow path and fast path code (e.g. it gets used often but has lots of error handling code).
  49. 49. Fast path optimizations ● Micro optimizations, and any other optimizations for that matter, only make sense if the optimized code actually runs a lot. ● This means that the fast path should receive most of the love and care. ● Optimizing code outside of the fast path will not provide better performance and will just complicate your code making it harder to read/understand/modify/adjust to new requirements.
  50. 50. Some basics first
  51. 51. How is a class implemented? ● A class is just like a struct with compile time checking of access (private, public, protected). ● It is laid out like a struct and you can influence the layout of members with compiler-specific packing and alignment directives (although you rarely have to). ● The methods are not stored within the object. They are nowhere to be found in the object's memory. The compiler arranges for the right object to be passed to each method (as the hidden this argument) using compile time type checking. Virtual methods are a different matter.
  52. 52. How is a class implemented? (cont) ● The minimum size of a class is 1 byte. ● This makes sure that every object really occupies memory and that distinct objects have distinct addresses. ● It also means that sizeof() can never return 0. ● Think of sizeof(array)/sizeof(instance) if sizeof(instance)==0...
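The size rule can be checked directly. In this minimal sketch `Empty` and `kCount` are hypothetical names; the standard guarantees only that the size is non-zero, and mainstream compilers report exactly 1.

```cpp
#include <cstddef>

// A class with no data members still has a non-zero size.
struct Empty {};

static_assert(sizeof(Empty) >= 1, "an empty class is never zero-sized");

// Because sizeof(Empty) is not 0, the usual element-count idiom stays well defined.
constexpr std::size_t kCount = sizeof(Empty[3]) / sizeof(Empty);
static_assert(kCount == 3, "array length recovered from sizes");
```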
  53. 53. How is inheritance implemented? ● Inherited classes are just like structs laid out one after the other. ● So if B is derived from A then sizeof(B)=sizeof(A)+sizeof(fields of B), plus any alignment padding. ● Different compiler vendors can pack A before B or vice versa. This is an ABI issue. ● Different compiler vendors can decide on different alignment between the A and the B part of the object. This is an ABI issue.
  54. 54. Inheriting from empty classes ● An empty class takes 1 byte. ● But if you define a class derived from it and add an integer you will find that the size of the derived class is 4 and not 5 (assuming a 4-byte int). ● This is because the compiler knows that the 1 byte in the empty class really doesn't refer to data (this is known as the empty base optimization). ● So inheriting from many empty classes doesn't make your class bigger.
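The layout claims for inheritance and empty bases can be sketched with a few `static_assert`s. All type names here are hypothetical; the exact sizes assume a mainstream ABI, so the checks are written as relations rather than hard-coded byte counts.

```cpp
struct Empty {};

struct Base { int a; };
struct Derived : Base { int b; };

// An empty base adds no storage, while an empty member does.
struct WithEmptyBase : Empty { int value; };
struct WithEmptyMember { Empty tag; int value; };

// The derived object holds the base part plus its own fields (and possibly padding).
static_assert(sizeof(Derived) >= sizeof(Base) + sizeof(int), "base part plus own fields");

// Empty base optimization: the derived class is no bigger than its int.
static_assert(sizeof(WithEmptyBase) == sizeof(int), "empty base takes no space");

// As a member, the empty object needs its own address, so the class grows.
static_assert(sizeof(WithEmptyMember) > sizeof(int), "empty member takes space");
```

This difference is one reason policy or tag classes are often inherited from rather than stored as members.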
  55. 55. How is multi-inheritance implemented? ● If C is derived from A and B then C has its own fields, A's fields and B's fields (unless it's virtual inheritance). ● The physical layout of these fields is again an ABI issue that is implemented differently by different compiler vendors.
  56. 56. Virtual methods ● If a class has at least one virtual method then the size of the object typically grows by one pointer (plus any alignment padding). ● That pointer points to a structure called a vtable. ● There is exactly one of these in RAM for each class. ● The table holds the method pointers for all virtual methods in that class.
  57. 57. Virtual methods (cont) ● The compiler vendors could have held pointers to more vtables inside a vtable but most of them choose to create one flattened vtable per class. ● The order of the functions within the table is a compiler dependent ABI issue. It could be order of declaration, alphabetical order, whatever (as long as it is consistent). ● Each call to a virtual method goes through this table.
  58. 58. Virtual methods (cont) ● You can think of the vtable pointer as the type of the instance, the difference being that classes without virtual methods don't have this “type” information attached to them. ● This is not the same as RTTI which adds info pointed to by the vtables.
  59. 59. Virtual methods (cont) ● This means that adding the first virtual method to a class adds a pointer to the size of the instance. ● Adding a virtual method to a parent class will add that pointer to all of its descendants. ● Virtual methods are one dereference slower to call. ● And they cannot be in-lined unless the compiler can determine the exact type, which is rarely the case. ● Any constructor sets up the pointer to the vtable which makes the constructor slower.
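The size cost of the first virtual method can be observed with `sizeof`. This is a sketch under the assumption of a mainstream ABI (GCC, Clang or MSVC on common targets), where the vtable pointer is a single hidden pointer-sized member; `Plain`, `OneVirtual` and `TwoVirtuals` are hypothetical names.

```cpp
// No virtual methods: the object is just its data.
struct Plain {
    void* payload;
    int id() const { return 1; }
};

// First virtual method: the object gains a hidden vtable pointer.
struct OneVirtual {
    void* payload;
    virtual int id() const { return 1; }
};

// More virtual methods enlarge the vtable, not the object.
struct TwoVirtuals {
    void* payload;
    virtual int id() const { return 1; }
    virtual int version() const { return 2; }
};

// Assumption: mainstream ABI, one pointer-sized vptr per object.
static_assert(sizeof(OneVirtual) == sizeof(Plain) + sizeof(void*), "one extra pointer");
static_assert(sizeof(TwoVirtuals) == sizeof(OneVirtual), "second virtual method is free in size");
```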
  60. 60. RTTI ● The ability of the system to report on the dynamic type of an object and to provide information about that type at runtime (as opposed to at compile time) ● When utilized consistently can be a powerful tool to ease the work of the programmer in managing resources. ● Comes at the cost of a little performance. ● Includes typeid and dynamic_cast
  61. 61. RTTI - typeid ● typeid – A function that can be given a type and returns std::typeinfo which describes the type. ● You can use this return value for the name of the type (.name()) or to compare type information from two instances. ● According to the manual the name of the instance is implementation (ABI) specific. ● Furthermore, typeid is only guaranteed to work on instances which belong to classes that have at vtable (meaning at least one virtual method).
  62. 62. RTTI – typeid (cont) ● In order to work with typeid you must include <typeinfo>. ● typeid is sometimes resolved at compile time and sometimes leaves code that determines the type at runtime. ● Most compilers allow you to work without RTTI and get a little bit more performance (in terms of code size and size of vtables). ● In gcc this is the -fno-rtti flag.
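A short sketch of the typeid behavior discussed above, assuming RTTI is enabled (no -fno-rtti). The class names and the function typeid_demo are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical hierarchy with a vtable (polymorphic).
struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Hypothetical hierarchy without any virtual method (no vtable).
struct PlainBase {};
struct PlainDerived : PlainBase {};

// Compares type_info objects directly instead of comparing name() strings.
bool is_derived(const Base& b) { return typeid(b) == typeid(Derived); }

void typeid_demo() {
    Derived d;
    const Base& as_base = d;
    assert(typeid(as_base) == typeid(Derived));     // polymorphic: dynamic type, found at runtime

    PlainDerived pd;
    const PlainBase& as_plain = pd;
    assert(typeid(as_plain) == typeid(PlainBase));  // non-polymorphic: static type, known at compile time
}
```

Note that the second check succeeds even though the object really is a PlainDerived: without a vtable there is nothing to consult at runtime.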
  63. 63. RTTI - dynamic_cast ● Allows you to try to cast an object to some type at runtime. ● Because it is done at runtime it may fail at runtime and your code should include a check for this: a failed pointer cast returns a null pointer and a failed reference cast throws std::bad_cast. ● Typically works by traversing the type information reachable from the object's vtable, so it is slow and its cost depends on the number of classes in your hierarchy.
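The runtime check mentioned above could look like the following sketch. Shape, Circle, Square and both helper functions are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical hierarchy, for illustration only.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.5; };
struct Square : Shape { double side = 2.0; };

// Pointer form: a failed cast yields a null pointer that must be checked.
double radius_or_zero(const Shape* s) {
    assert(s != nullptr);
    if (const Circle* c = dynamic_cast<const Circle*>(s)) {
        return c->radius;
    }
    return 0.0;  // not a Circle
}

// Reference form: a failed cast throws std::bad_cast.
bool is_circle(const Shape& s) {
    try {
        const Circle& c = dynamic_cast<const Circle&>(s);
        (void)c;  // only the success of the cast matters here
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

Both forms pay for a runtime search, and the reference form adds exception handling on failure, so neither belongs in a fast path.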
  64. 64. RTTI – should you use it? ● Good design usually means not needing to use RTTI, which is also good from a performance standpoint. ● typeid gives you name() as a C string and strings are slow to compare, so never use that to determine what to do with an instance. ● dynamic_cast is slow (as explained) and thus should not be used in the fast path. ● Inheritance with virtual methods is pretty fast and should be used to distinguish between types instead.
  65. 65. Virtual destructor ● Is required if: ● You want to destroy an instance using delete, and ● At the delete point the pointer may be a base class pointer that points to a derived instance. Without a virtual destructor in the base class this delete is undefined behavior. ● It also ensures that different classes in your hierarchy can do different things in their destructors. ● This is not the same as having virtual methods. ● You may have virtual methods and not need a virtual destructor. ● You may have no other virtual methods and still need a virtual destructor.
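The delete-through-base case above can be sketched as follows. The counter derived_destructor_runs, the two classes, and destroy_through_base are hypothetical names used only to make the effect visible.

```cpp
#include <cassert>

int derived_destructor_runs = 0;  // hypothetical counter, for the demo only

struct Base {
    virtual ~Base() = default;    // virtual: delete through a Base* is well defined
};

struct Derived : Base {
    ~Derived() override { ++derived_destructor_runs; }
};

// Deletes a Derived through a Base pointer and reports how many times ~Derived ran.
int destroy_through_base() {
    derived_destructor_runs = 0;
    Base* p = new Derived;
    delete p;                     // dispatches to ~Derived, then runs ~Base
    assert(derived_destructor_runs == 1);
    return derived_destructor_runs;
}
```

If ~Base were not virtual, the same delete would be undefined behavior, which in practice often shows up as the derived destructor silently not running.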
  66. 66. In-lining
  67. 67. In-lining as C++ efficiency ● C++ efficiency is based on compile time optimization. ● But the compiler cannot optimize what it cannot see. ● So a lot of patterns in C++ optimization are based on letting the compiler see more of your code. ● This lets the compiler see through a lot of the abstractions that we need as developers and still do only the things that need to get done. ● We maintain layered software, which is good for maintenance, while the compiler does the right thing. Good deal.
  68. 68. Problems of in-lining ● The problem with in-lining is that you need to let the compiler see more of your code. ● This means you have to write more code in header (.h) files. ● This makes compilation longer. ● And modifications to the implementation now cause long project recompiles. ● This is similar to the problems of templates in C++. ● Only use it when you really need the performance boost.
  69. 69. Advantages of in-lining ● Avoiding yet another function call. A function call is not cheap since it uses the stack, which lives in RAM. ● Avoiding storing registers into RAM before the function call and restoring them after the call. Since RAM access is slow compared to register access this is a very good speedup. ● Better allocation of registers, since the allocation is done across the inner and outer functions together. ● Allows the compiler to do dead code elimination based on constants passed to the inner function.
  70. 70. Advantages of in-lining ● Allows the compiler to reorder instructions between the outer and inner functions and so utilize the hardware better. ● Fewer instructions executed and so potentially fewer instruction cache misses (as long as in-lining does not bloat the code). ● Allows the compiler to do loop invariant optimization if the inner function is called from a loop in the outer function.
  71. 71. Advantages of in-lining ● Allows the compiler to know whether data pointed to by pointers passed to the inner function is really changed by it. If it is not (as in many cases) the compiler can avoid loads and stores to that address. This works even in the absence of correct const qualifiers. ● A function call is an optimization barrier in the outer function and it is now removed.
  72. 72. In-lining - conclusions ● In-lining provides many, many advantages to C++ programmers. ● Obviously you have to compile with optimization to get those advantages. ● And write code in header files. ● When you do optimize, in-lining also occurs for functions defined in non header files, as long as the caller can see the definition in the same source file. ● A barrier to in-lining is calling an unknown function, which usually happens when calling a virtual function. So try to avoid those in the fast path.
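A small sketch of the kind of code that benefits from in-lining, under the assumption that scale would be placed in a header. Both function names are hypothetical, and whether the compiler actually in-lines the call is its own decision (typically only with optimization enabled).

```cpp
#include <cassert>

// Would live in a header so that every caller can see the body.
inline int scale(int x, bool negate) {
    if (negate) {
        return -x;      // dead branch when the caller passes negate == false
    }
    return x * 2;
}

int sum_scaled(const int* data, int n) {
    assert(data != nullptr || n == 0);
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += scale(data[i], false);  // constant argument is visible at the call site
    }
    return total;
}
```

Once the body of scale is visible, an optimizing compiler may drop the negate branch entirely and keep total in a register for the whole loop. Inspecting the generated assembly is the only way to confirm what happened in a given build.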
  73. 73. Efficiency topics
  74. 74. Fast construction and destruction ● Consider the need to hold a resource which must be released (mutex, file,...) ● A common pattern is to put this resource into a class that releases the resource upon destruction. ● But if this pattern is used in the fast path then we must make sure that constructors and destructors will not hurt our performance.
  75. 75. Fast construction and destruction ● Do not allocate on the heap – do it on the stack. ● If you do use the stack then there is no “delete with unknown type” and so less need for a virtual destructor. ● Try to avoid the need for a virtual destructor in order to avoid the need to go through the virtual table and the function call. ● This does not mean you cannot use inheritance.
  76. 76. Fast construction and destruction ● Avoid virtual functions in the object in order to avoid the extra virtual pointer assignment in the constructor. ● In-line the class by putting it into a header file. ● This makes sure that construction and destruction do not trigger a function call, which is expensive. ● Try to avoid composition in this case, as it may create even more work, some of which may involve the heap.
  77. 77. Fast construction and destruction ● Remember that constructors default construct all members that are not explicitly initialized using member initialization syntax. ● And their parents... ● And destructors likewise destroy all members and parents. ● Try to keep fast classes as simple as possible.
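The advice on the last few slides can be combined into a sketch like the following: a stack-allocated guard with no virtual functions, no heap use, and bodies small enough to be in-lined. Flag is a hypothetical stand-in for a real resource such as a mutex or a file handle; Guard and held_inside_scope are likewise illustrative names.

```cpp
#include <cassert>

struct Flag { bool held = false; };  // hypothetical stand-in for a mutex or file handle

// No virtual functions, no heap allocation, one reference member:
// construction and destruction reduce to a single store each once in-lined.
class Guard {
public:
    explicit Guard(Flag& f) : flag_(f) {
        assert(!flag_.held);     // the resource must be free when we take it
        flag_.held = true;
    }
    ~Guard() { flag_.held = false; }

    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;

private:
    Flag& flag_;
};

bool held_inside_scope(Flag& f) {
    Guard g(f);        // lives on the stack, released automatically at scope exit
    return f.held;     // read before g is destroyed
}
```

Because the guard's exact type is always known at the point of destruction, there is no need for a virtual destructor here.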
  78. 78. Lazy construction ● If some fast paths do not require everything that you do inside the constructor then avoid doing that work in the constructor. ● An example could be the initialization of some composed inner member. ● Instead construct such members lazily, only when needed. ● This requires careful coding that makes sure that the member is not used unless initialized.
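One possible way to sketch lazy construction, assuming C++17 for std::optional (the slides do not prescribe a mechanism). Expensive, Worker, and the counter expensive_constructions are hypothetical names for the example.

```cpp
#include <cassert>
#include <optional>
#include <vector>

int expensive_constructions = 0;  // hypothetical counter, for the demo only

struct Expensive {
    Expensive() : entries(1024, 0) { ++expensive_constructions; }
    std::vector<int> entries;
};

class Worker {
public:
    // Fast path: never touches the expensive member.
    int fast_path(int x) const { return x + 1; }

    // Slow path: constructs the member on first use only.
    int slow_path(int x) { return x + static_cast<int>(lookup().entries.size()); }

private:
    // Single access point, so the member cannot be used before it is built.
    Expensive& lookup() {
        if (!lookup_) {
            lookup_.emplace();
        }
        assert(lookup_.has_value());
        return *lookup_;
    }

    std::optional<Expensive> lookup_;  // empty until first needed
};
```

Routing every access through one private accessor is what keeps the "not used unless initialized" rule enforceable. The price is a branch on each slow-path call.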
  79. 79. Redundant construction ● Whenever an object is composed of other objects and its constructor is triggered, it first creates the other objects using their default constructors (unless told otherwise). ● Only then is the constructor body run. ● This means that if you initialize a member only within your constructor body you are paying for redundant construction. ● Instead initialize it in the member initializer list: Class::Class() : attr(args) {}
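The redundant work described above can be counted explicitly. In this sketch Tracked is a hypothetical member type that records how it was created, and AssignInBody and InitList are two hypothetical owners that differ only in how they initialize it.

```cpp
#include <cassert>

// Hypothetical member type that counts the operations performed on it.
struct Tracked {
    static int default_ctors;
    static int value_ctors;
    static int assignments;

    Tracked() { ++default_ctors; }
    explicit Tracked(int v) : value(v) { ++value_ctors; }
    Tracked& operator=(const Tracked& other) {
        value = other.value;
        ++assignments;
        return *this;
    }

    int value = 0;
};
int Tracked::default_ctors = 0;
int Tracked::value_ctors = 0;
int Tracked::assignments = 0;

void reset_counters() {
    Tracked::default_ctors = Tracked::value_ctors = Tracked::assignments = 0;
}

struct AssignInBody {
    Tracked attr;
    explicit AssignInBody(int v) { attr = Tracked(v); }  // default ctor + temporary + assignment
};

struct InitList {
    Tracked attr;
    explicit InitList(int v) : attr(v) {}                // one value constructor, nothing else
};

// Total number of Tracked operations needed to build one owner of each kind.
int work_for_assign_in_body() {
    reset_counters();
    AssignInBody a(7);
    assert(a.attr.value == 7);
    return Tracked::default_ctors + Tracked::value_ctors + Tracked::assignments;
}

int work_for_init_list() {
    reset_counters();
    InitList b(7);
    assert(b.attr.value == 7);
    return Tracked::default_ctors + Tracked::value_ctors + Tracked::assignments;
}
```

Both owners end up in the same state, but the first one performs three operations on the member where the second performs one.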
  80. 80. Avoid creation of temporary variables ● If you have an assignment between two different types and an exact assignment operator is missing, then the assignment may be fulfilled by constructing a temporary out of the right hand side and then calling the regular assignment operator. ● This may be expensive. ● Try to make sure you have the right assignment operators in place to avoid these costly constructions of temporaries.
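A sketch of the hidden temporary, using two hypothetical classes that differ only in whether an exact assignment operator for const char* exists. The static counter temporaries counts calls to the converting constructor and is there only to expose the effect.

```cpp
#include <cassert>
#include <string>

// No operator=(const char*): assigning a string literal goes through a temporary Label.
struct Label {
    static int temporaries;
    Label() = default;
    Label(const char* s) : text(s) { ++temporaries; }  // converting constructor
    std::string text;
};
int Label::temporaries = 0;

// Same class plus an exact assignment operator: no temporary object is needed.
struct FastLabel {
    static int temporaries;
    FastLabel() = default;
    FastLabel(const char* s) : text(s) { ++temporaries; }
    FastLabel& operator=(const char* s) { text = s; return *this; }
    std::string text;
};
int FastLabel::temporaries = 0;

int temporaries_without_exact_operator() {
    Label::temporaries = 0;
    Label l;
    l = "fast path";   // builds a temporary Label, then assigns from it
    assert(l.text == "fast path");
    return Label::temporaries;
}

int temporaries_with_exact_operator() {
    FastLabel::temporaries = 0;
    FastLabel l;
    l = "fast path";   // picks operator=(const char*) directly
    assert(l.text == "fast path");
    return FastLabel::temporaries;
}
```

Marking the converting constructor explicit is another option: it turns the silent temporary into a compile error, so the cost cannot creep in unnoticed.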
  81. 81. Virtual functions cost ● The assignment of the vptr. ● The extra dereference in method calls. ● The inability of the compiler to in-line the function. ● The cost of the first two seems small. The cost of the third is unknown and may be large depending on the circumstances.
  82. 82. Virtual functions cost – when should you care? ● The inability to in-line is indeed problematic, but remember that the function that was not in-lined can still have its own callees in-lined into it. ● This means that what you really care about is the call/return cost plus the additional optimizations that become possible when the compiler sees the two pieces of code together. ● If your function is long then you may not care about this at all, so you should care only if your function needs to be very fast.
  83. 83. Virtual functions cost – solution 1 ● Avoid the virtual function altogether. ● You may not really need virtual on it (are you over-designing?) ● Write two classes instead of one (do you really want to mix them or are you just using one in any particular run?) ● Use a macro to decide which to use in each run?!?
  84. 84. Virtual functions cost – solution 2 ● Pass your class to the using class as a template parameter. ● Templates are more performance friendly than inheritance hierarchies. ● They push type resolution from runtime to compile time (which we consider to be free). ● If the class passed as the template parameter is in-line ready (defined in a header) then it will be in-lined into the using class.
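Solution 2 might be sketched as follows. The policy classes DoublePolicy and SquarePolicy and the using class Accumulator are hypothetical names. The point is only that the call to apply is resolved at compile time, with no vtable involved.

```cpp
#include <cassert>

// Two hypothetical implementations with the same interface and no virtual functions.
struct DoublePolicy {
    int apply(int x) const { return 2 * x; }
};

struct SquarePolicy {
    int apply(int x) const { return x * x; }
};

// The using class receives the implementation as a template parameter.
template <typename Policy>
class Accumulator {
public:
    int run(const int* data, int n) const {
        assert(data != nullptr || n == 0);
        int total = 0;
        for (int i = 0; i < n; ++i) {
            total += policy_.apply(data[i]);  // exact type known here: candidate for in-lining
        }
        return total;
    }

private:
    Policy policy_;
};
```

The trade-off is that Accumulator<DoublePolicy> and Accumulator<SquarePolicy> are unrelated types, so they cannot be swapped at runtime the way two subclasses of a common base can.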
  85. 85. References
  86. 86. References ● Efficient C++: Performance Programming Techniques – Dov Bulka, David Mayhew. ● Effective C++, Third Edition: 55 Specific Ways to Improve Your Programs and Designs – Scott Meyers. ● More Effective C++: 35 New Ways to Improve Your Programs and Designs – Scott Meyers. ● The C++ FAQ: http://www.parashift.com/c++-faq-lite/
