SlideShare a Scribd company logo
Effective C++
Mark Veltzer
mailto:mark@veltzer.net
Micro optimizations
Micro optimizations
● This presentation is about using the C++ language
effectively.
● This means that we will not be talking about how long
other parts, which are not your code, are taking.
● Always remember that for a lot of systems the real bottle
neck is not within your code but in some other place (too
much IO, heavily loaded database, network,...).
● That is why the types of advice in this presentation are
sometimes called micro optimizations
What we will not be discussing
● There are two very important optimizations that could
be considered more important than micro
optimizations that we will not be discussing:
● Algorithmic optimizations.
● Architectural optimizations.
● These could bring tremendous improvements to your
code but have less to do with the technicalities of the
C++ language.
● In reality implementing them does require knowledge
of the C++ language but not at the micro optimization
level.
Reasons against micro
optimizations
● Some people consider micro optimizations an evil.
● They claim that since bottle necks are not in your code then you
shouldn't pay attention to these types of optimizations since:
● The ease of implementing micro optimizations gives you the feeling
that you improved the software even though the bottle neck is really
somewhere else and in-fact you just complicated your code with no
benefit which means your situation actually got worse.
● They make your code harder to maintain and read.
● They require you to learn something which is not important.
Reasons against micro
optimizations (cont)
● Even if your bottle neck is within your code you are better
off putting effort into either algorithmic or architectural
(including multi-core) changes.
● Although these are harder to implement the claim is that
per time spent they give better returns.
● Micro optimizations distract you from doing architectural
and algorithmic changes which are more important.
● Micro optimizations hide the algorithm or architecture in a
sea of details which makes understanding and modifying
the software difficult.
Reasons against micro
optimizations – less agility
● Some claim that micro optimizations make your
software less agile.
● Because they make each piece of code more
difficult they make the entire software harder to
change at the architectural level.
● This means that the software making process
will be slower to respond to new requirements.
● This could have business implications as well
as technical ones.
Reasons against micro
optimizations - micro optimizations
as premature optimization
● “We should forget about small efficiencies, say
about 97% of the time: premature optimization
is the root of all evil" – Donald Knuth
● There is a connection between micro
optimizations and premature optimization
because in a lot of cases micro optimizations
are done prematurely.
● So when first coding your system – Keep It
Simple Stupid – which means avoiding
unnecessary micro optimizations.
Reasons for micro optimization
● The main point: You need to write your code anyway
– so you may as well use the right features and get
better performance while you are at it.
● If the language has two features and you use the
one which is giving you better optimization then you
gain more efficiency without complicating your
code.
● Sometimes you are CPU bound and micro
optimizations could squeeze more performance out
of an existing code base.
Reasons for micro optimization
● Sometimes you are willing to live with the downsides
of micro optimization (both technical and business).
● Sometimes, just sometimes, your design is good
and you think that it may stick around for a while and
micro optimizations just make it better.
● In these cases the loss of agility is not a big problem
as it seems.
● This happens in mature products more than in young
ones.
Reasons for micro optimization
(cont)
● Algorithmic and Architectural changes are just
too difficult.
● Architectural changes sometimes means huge
rewrites which companies rarely can handle
because of organizational issues.
● Algorithmic changes require lots of theoretical
knowledge and sometimes low level details
(turning a lock using algorithm into lock free
requires understanding memory and machine
barriers...)
Micro optimizations
● If you are coding in C++ then you, albeit maybe
unconsciously, agree that too much low level
optimization is evil.
● Otherwise you would be coding in C or
assembly. Or maybe designing a dedicate
piece of hardware?
● Some people think so highly against micro
optimizations that they go for higher level
languages like Java/C#/Python/Ruby which
prevent these types of optimizations by design.
Micro optimizations and the
compiler
● The compiler does lots of optimizations which
could be classified as micro optimizations:
● In-lining.
● Loop unrolling.
● Constant folding.
● Dead code elimination.
● Common subexpression elimination.
● Loop invariants.
● And many more...
Micro optimizations and the
compiler
● This means that the micro optimizations we are
talking about here are optimizations that the
compiler cannot perform.
● In practice, however, it is not always simple to
know which optimizations are performed by the
compiler.
● Plus compilers tend to become better with time.
● And you may need to support several compilers
that have a different feature set.
Micro optimizations - conclusions
● Should you use micro optimizations? Yes.
● How many of them? Depends on your product (is
it mature, is it's design going to change etc).
● What types of optimizations should you prefer?
Ones that don't make your code harder to read
and understand and ones that are not already
performed by the compiler.
● When to avoid them at all? When your product is
young and you are not sure about the design at
all. Then they are just a waste of your time.
C++ efficiency (comparison to assembly/C)
Measuring efficiency
● There are many ways to measure efficiency.
● The various methods may give somewhat
different results.
● They are also not equally easy to use.
● They sometimes are in conflict one with the
other.
● For instance: you may increase latency and
decrease average response time.
● And all measures depend on the dataset used.
Measuring efficiency
● Measurement methods include:
● Measuring number of instructions in assembly
code.
● Measuring number of instructions issued at run time
using CPU counters.
● Measuring time elapsed using CPU counters.
● Measuring time elapsed using hardware clock.
● Measuring time elapsed using OS clock.
● Full profiling in software or hardware.
Measuring CPU cycles
● Is one way to measure efficiency.
● In RISC chips this is more accurate since every
instruction is executed in just one cycle.
● In CISC chips things are more complicated.
One instruction may be over in one clock while
others may last many clocks.
● And instructions are sometimes halted to wait
for external factors: main memory, other CPU
atomic operation, etc...
What does measuring CPU cycles
mean – assembly programmers?
● That even an assembly programmer writing
down instruction explicitly cannot know how
long his/her code will run.
● This is contrary to common opinion that
“assembly programmers know exactly how
long their code runs”.
● It is true that assembly programmers have a
firmer grasp on how long their code runs.
C complications – function call
● A function call in C may involve many cycles as
it needs to push the arguments, then jump to
the address and then the inner function needs
to get the arguments from the stack, do the
calculation, put the result onto the stack and
then the outer function will get the result from
the stack.
● There are quite a few machine instructions
involved in all of these.
C complications – function call
(cont)
● On the other hand if the inner function is inside
a header file and the compiler is an optimizing
one it could eliminate the function call
altogether.
● And if the arguments are constant the compiler
may eliminate the calculation too and just hard
code the result.
● So, a function call may range anywhere
between one instruction and hundreds or more
even for a simple function.
C complications – switch statement
● A single switch statement in C can be realized
by a compiler in several ways:
● A series of compare-and-branch statements.
● An array of jump locations.
● A perfect hash of jump locations.
● Actually good compilers use all of these
combined when optimizing.
● So how long does a “switch” statement take? I
donno!
C complications - registers
● In C the programmer may advise the compiler about
using a register for a variable using the register
keyword.
● The programmer, however, does not know if the
compiler actually manages to store the variable in a
register and does not get feedback about it (aside
from disassembly, of course).
● The difference in performance is great as access to
register and to memory (even cache memory) is
about order of magnitude in difference.
C complications – cache lines
● When programmers start programming in C
they are told nothing about cache lines, cache
misses etc.
● After a while they are told that structures
accessed heavily should be small as to fit in a
cache line so as to hit the CPU cache and not
main memory.
● And after a while more they are told about how
synchronization between CPUs works.
What does all of this mean? - c
programmers
● That even when programming in C the
programmer only has vague idea of how his/her
program is translated into machine code.
● The programmer has to try and wrap his/her
head around various low level concepts
(registers, switches, caches) in order to
understand the speed of his/her program.
● So analyzing speed is not simple, even in C.
What about C++, and higher level?
● In C++ things get even nastier.
● Various basic building blocks like constructor,
destructor, virtual method call, exception, copy
constructor are even harder to estimate.
● So the feel of the programmer for the runtime
gets more vague.
● And in Java the programmer doesn't even know
about machine instructions...:)
Criticisms of C++ performance (low
level)
● Low level people criticize C++ for being too
slow.
● As a result they say they'd use assembly or C
over C++.
● The claim is that C++ does too much behind
the programmers back
● This prevents the programmer from getting a
good grasp of what is happening performance
wise.
Criticisms of C++ performance (high
level)
● High level people criticize C++ for being too
slow.
● They claim that C++ forces the programmer to
focus too much on low level details.
● This make the design suffer and be suboptimal.
● Which in turn leads to bad performance.
Is C++ performance bad?
● No.
● It really depends on the programmer.
● And on the programmers knowledge of what
C++ does behind the scenes.
● If the programmer knows when and what C++
does behind his/her back then he/she knows
how to handle the situation.
● This is exactly what this presentation is for.
Is C++ performance bad? (cont)
● Since C++ is a more complicated language it can
be abused by programmers who do not know it's
features in a more creative manner and create
systems that perform more poorly than systems
implemented in lower level languages (C,
assembly).
● But C++ is usually thrown at big problems where
development in C is more difficult and requires
object oriented approaches. In these cases C++
abuse is less of an issue since C abuse in 5large
systems is an issue also.
Is C++ performance bad? (cont)
● So C++ sometimes performs slower and sometimes
faster than equivalent systems implemented in lower
level languages.
● If the system you are implementing is a large one
then using object oriented and C++ has an
advantage over C since it is more agile and will
allow you to reach a good design with less rewrites.
● Since rewriting is problematic in a corporate
environment C++ could actually outperform C in
such environments.
So what should we do?
● Remember the example of the C function call
and the C switch statement?
● C++ has many more of these (virtual functions,
exceptions, constructors, destructions,
templates and more).
● We should study the language to understand
exactly when and what the language does.
● And understand how to use the features
effectively.
Comparison of effectiveness issues
in C++ vs other languages
● Other languages, like Java, also have effectiveness issues.
● The difference is at the level at which the programmer needs
to peek in order to understand the issues involved.
● In Java or other higher level languages effectiveness books,
courses and presentations deal with higher level
considerations than in C++.
● Issue such as CPU cache lines never appear...:)
● Some see this as a weakness of C++.
C++ and over design
● Over design is a situation where a programmer
creates too much infrastructure in a piece of
software, much of which is never used.
● Over design is problematic because it makes the
code larger and more complicated and thus making it
harder to understand, maintain and develop further.
● Over design also makes your software run slower for
no good reason.
● Over design is a notorious problem in object oriented
languages and is usually inflicted by beginner
programmers.
C++ and over design (cont)
● Over design rests on the assumption that you
must make your software flexible for future
requirements now, regardless if that flexibility
will be used in the future.
● Experience shows that programmers make
their program flexible where it is convenient for
them to do so and not where future
requirements will demand that flexibility.
● This means that over design is rarely a good
thing.
C++ and over design (cont)
● Mature and young products are different in
respect to over design and young products tend
to exhibit this illness more.
● C++ has potential for over design.
● Since over design means lower performance
then C++ has potential for lower performance in
that respect.
● C has less over design issues and assembly
even less because they are harder to code.
C++ and over design (cont)
● All you need to do to solve the problem is
simply avoid over design.
● This is very close to the concept of agile
development.
● Agile development tries to reconcile the two
following statements:
● You don't know the requirements of a piece of
software in advance.
● You want to avoid over design.
C++ and over design (cont)
● Instead agile development puts an emphasis on
the development process:
● Keep code simple.
● Keep the programming environment simple.
● Make it as easy as possible to change and test the
code.
● This makes the adjustment to new features easy
without over design.
● So one solution to the problem of C++ and over
design is agile development.
Is efficiency important?
Reasons for efficiency
● Even if hardware advances competition
between software vendors remain. Since they
use the same hardware the winner will be the
one with more software efficiency.
● Communication speed has exceeded CPU
speed and so the software is the new
bottleneck. Due to faster optical and other
communication solutions it will stay that way.
Reasons for efficiency (cont)
● CPU speed improvement has slowed down and
given way to multi-core/multi-CPU designs.
● This is due to various low level physical
problems which are not likely to be lifted in the
near future.
● But multi-core/multi-CPU solutions are based
on software that is specifically written in order
to utilize such systems and so software
efficiency is once again the bottle neck.
Reasons for efficiency (cont)
● Last but not least is the act of writing software.
● A simple error in programming could cause a
piece of software to run 100 times slower than it
could.
● Example: NTPL in Linux made thread creation
a 1000 times faster.
● Software is still key to efficiency.
What type of efficiency?
What type of efficiency do we
mean?
● Binary size: we don't care about that much
these days (data much larger than code).
● Compile time: Most people don't care about this
but maybe you should?
● Data size: yes, but C++ data structures tend to
be quite efficient and RAM is cheap these days.
● Code efficiency: This is the one which most
issues relate to.
What type of efficiency do we
mean?
● Code efficiency is usually split into:
● Latency/response time/responsiveness
● Throughput.
● These goals are a trade off: increasing one will
tend to decrease the other.
● You have to decide where you are in balance
between these two opposing objectives.
● Being somewhere in the middle is fine as long
as you define clearly exactly where you are.
The fast path
Fast path: definition
● Most software has a lot of code which is only used, if ever,
one time or rarely.
● This includes setup, tear down, error handing and more.
● More than 90% of code is like that (sometimes a lot more).
● This is the “boring” code or the “slow path”.
● Fast path is the “non boring” code that actually gets used
most of the time.
● One function can include both slow path and fast path code
(e.g. it gets used often but has lots of error handling code).
Fast path optimizations
● Micro optimizations, and any other
optimizations for that matter, only make sense if
you actually use them a lot
● This means that the fast path should receive
most of the love and care.
● Optimizing code outside of the fast path will not
provide better performance and will just
complicate your code making it harder to
read/understand/modify/adjust to new
requirements.
Some basics first
How is a class implemented?
● A class is just like a struct with compile time
checking of access (private, public, protected).
● It is packed like a struct and you can tell the
compiler the offsets of members (although you
rarely have to).
● The methods are not within the class. They are
nowhere to be found in the class. The compiler
arranges for the right object to be passed using
compile time type checking. Virtual methods
are a different matter.
How is a class implemented? (cont)
● The minimum size of a class is 1 byte.
● This is to make sure that memory really is
allocated.
● sizeof() cannot also return 0.
● Think of sizeof(array)/sizeof(instance) if
sizeof(instance)==0...
How is inheritance implemented?
● Inherited classes are just like structs layed out
one after the other.
● So if B is derived from A then sizeof
(B)=sizeof(A)+sizeof(fields of B).
● Different compiler vendors can pack A before B
or vice versa. This is an ABI issue.
● Different compiler vendors can decide on
different alignment between the A and the B
part of the object. This is an ABI issue.
Inheriting from empty classes
● An empty class takes 1 byte.
● But if you define a class derived from it and
adding an integer you will find that the size of
the derived class is 4 and not 5.
● This is because the compiler knows that the 1
byte in the empty class really doesn't refer to
data.
● So inheriting from many empty classes doesn't
make your class bigger.
How is multi-inheritance
implemented?
● If C is derived from A and B then C has it's own
fields, A's fields and B's fields (unless it's virtual
inheritance).
● The physical layout of these fields is again an
ABI issue that is implemented differently by
different compiler vendors.
Virtual methods
● If a class has at least one virtual method then
the size of the object grows by exactly one
pointer.
● That pointer points to a structure called a
vtable.
● There is exactly one of these in RAM for each
class.
● The table holds the method pointers for all
virtual methods in that class.
Virtual methods (cont)
● The compiler vendors could have held pointers
of more vtables inside a vtable but most of
them choose to create one flattened vtable per
class.
● The order of the functions within the table is
compiler dependent ABI issue. It could be order
of declaration, alphabetical order, whatever (as
long as it is consistent).
● Each call to a virtual method goes through this
table.
Virtual methods (cont)
● You can think of the vtable pointer as the type
of the instance the difference being that classes
without virtual methods don't have this “type”
information attached to them.
● This is not the same as RTTI which adds info
pointed to by the vtables.
Virtual methods (cont)
● This means that adding the first virtual method to a class
adds a pointer to the size of the instance.
● Adding a virtual method to a parent class will add that
pointer to all of it's decedents.
● Virtual methods are one dereference slower to call to.
● And they cannot be in lined unless the compiler can
determine the exact type which is rarely the case.
● Any constructor sets up the pointer to the vtable which
makes the constructor slower.
RTTI
● The ability of the system to report on the
dynamic type of an object and to provide
information about that type at runtime (as
opposed to at compile time)
● When utilized consistently can be a powerful
tool to ease the work of the programmer in
managing resources.
● Comes at the cost of a little performance.
● Includes typeid and dynamic_cast
RTTI - typeid
● typeid – A function that can be given a type and
returns std::typeinfo which describes the type.
● You can use this return value for the name of the
type (.name()) or to compare type information
from two instances.
● According to the manual the name of the
instance is implementation (ABI) specific.
● Furthermore, typeid is only guaranteed to work
on instances which belong to classes that have
at vtable (meaning at least one virtual method).
RTTI – typeid (cont)
● In order to work with typeid you must include
<typeinfo>.
● Typeid sometimes works at compile time and
sometimes leaves code that understands the
type at runtime.
● Most compilers allow you to work without typeid
and get a little bit more performance (in terms
of code and size of vtables).
● In gcc this is the -fno-rtti flag.
RTTI - dynamic_cast
● Allows you to try and cast an object to some
type at runtime.
● Because it is done at runtime it may fail at
runtime and your code should include a check
for this.
● Works by traversing the vtables of the objects
so it is slow and depends on the number of
classes in your hierarchy.
RTTI – should you use it?
● Good design usually means not needing to use RTTI
which is also good from a performance standpoint.
● typeid returns name() as a string and strings are slow
to compare so never use that to determine what to do
with an instance.
● dynamic_cast is slow (as explained) and thus should
not be used in the fast path.
● Inheritance with virtual methods is pretty fast and
should be used to distinguish between types instead.
Virtual destructor
● Is required if you:
● Want to destroy the class using delete.
● At the delete point the pointer may point to a derived
element.
● Different classes in your hierarchy would like to do different
things in the destructor.
● This is not the same as having virtual methods.
● You may have virtual methods and not need a virtual
destructor.
● You may have no virtual methods and need a virtual
destructor.
In-lining
In-lining as C++ efficiency
● C++ efficiency is based on compile time optimization.
● But the compiler cannot optimize what it cannot see.
● So a lot of patterns in C++ optimization are based on letting the
compiler see more of your code.
● This makes the compiler see through a lot of the abstractions that
we need as developers and still do only the things that need to get
done.
● We maintain a layered software which is good for maintenance while
the compiler does the right thing. Good deal.
Problems of in-lining
● The problem of in-lining is that you need to let the
compiler see more of your code.
● This means you have to write more in h files.
● Which makes the compilation longer.
● And modifications to implementation now causes
long project recompiles.
● This is similar to the problems of templates in C++.
● Only use when you really need the performance
boost.
Advantages of in-lining
● Avoiding yet another function call. Function call is not
cheap since it uses the stack RAM.
● Avoiding storing registers into ram before the
functions call and collecting them after the call. Since
RAM access is long compared to register access
then this is a very good speedup.
● Better allocation of registers since the allocation is
combined between the inner and outer functions.
● Allow the compiler to do dead code elimination based
on constants passed to the inner function.
Advantages of in-lining
● Allows the compiler to reorder instructions
between the outer and inner functions and so
utilize the hardware better.
● Less code executed and so less instruction
cache misses.
● Allows the compiler to do loop invariant
optimization if the inner function is called from a
loop in the outer.
Advantages of in-lining
● Allows the compiler to know whether pointers
passed to the inner function are really changed
by it. If they are not (as in many cases) the
compiler can avoid load and stores to that
address. This is even in the absence of correct
const statements.
● A function call is an optimization barrier in the
outer function and it is now removed.
In-lining - conclusions
● In-lining provides many, many advantages to C++
programmers.
● Obviously you have to compile with optimization to
get those advantages.
● And write code in header files.
● When you do optimize in-lining occurs even in non
header files.
● A barrier to in-lining is calling an unknown function
which usually happens when calling a virtual
function. So try to avoid those in the fast path.
Efficiency topics
Fast construction and destruction
● Consider the need to hold a resource which
must be released (mutex, file,...)
● A common pattern is to put this resource into a
class that releases the resource upon
destruction.
● But if this pattern is used in the fast path then
we must make sure that constructors and
destructors will not hurt our performance.
Fast construction and destruction
● Do not allocate on the heap – do it on the stack.
● If you do use the stack then there is no “delete
with unknown type” and so less need for a
virtual destructor.
● Try to avoid the need for a virtual destructor in
order to avoid the need to go through the virtual
table and the function call.
● This does not mean you cannot use
inheritance.
Fast construction and destruction
● Avoid virtual functions in the object in order to
avoid the extra virtual pointer assignment in the
constructor.
● In line the class by putting it into a header file.
● This will make sure that the construction and
destruction will not trigger a function call which is
expensive.
● Try to avoid composition in this case as it may
create even more work which involves the heap.
Fast construction and destruction
● Remember that constructors construct all
members if not explicitly initialized using
member construction syntax.
● And their parents...
● And so do destructors.
● Try to keep fast classes as simple as possible.
Lazy construction
● If some fast paths do not require everything that
you do inside the constructor then avoid calling
that work in the constructor.
● Example could be some inner member
composition initialization.
● Instead only construct them lazily when
needed.
● This requires careful coding that makes sure
that the member is not used unless initialized.
Redundant construction
● When ever an object is composed of other
objects and the constructor is triggered it
creates the other objects using their default
constructors.
● Only then is the constructor run.
● This means that if you initialize the member
only within your construction you are paying for
redundant construction.
● Instead initialize with Class::Class():attr(args)
Avoid creation of temporary
variables
● If you have an assignment of two types and an
exact assignment operator is missing then it
may be fulfilled by constructing a temporary out
of the right hand side and then calling the
assignment operator.
● This may be expensive.
● Try to make sure you have the right assignment
operators in place to avoid these costly
constructions of temporaries.
Virtual functions cost
● The assignment of the vptr.
● The extra dereference in method calls.
● The inability of the compiler to in-line the
function.
● The cost of the first two seems small. The cost
of the third is unknown and may be large
depending on the circumstances.
Virtual functions cost – when should
you care?
● The inability to in-line is indeed problematic but
remember that the function you did not inline
could be in-lined internally.
● This means that what you really care about is
the call/return cost+more in-lining that is
possible out of seeing the two pieces of code
together.
● If your function is long then you may not care
about this at all – so you should care only if
your function should be very fast.
Virtual functions cost – solution 1
● Avoid the virtual function at all.
● You may not really need virtual on it (are you
over designing?)
● Write two classes instead of one (do you really
want to mix them or are you just using one in
any particular run?)
● Use a macro to decide which to use in each
run?!?
Virtual functions cost – solution 2
● Pass your class to the using class as a
template parameter.
● Templates are more performance friendly than
inheritance hierarchies.
● They push type resolution from runtime to
compile time (which we consider to be free).
● If the template class is in-line ready then it will
be in-lined in the using class.
References
References
● Efficient C++ Performance Programming
Techniques – Dov Bulka, David Mayhew.
● Effective C++ - Third Edition – 55 Specific Ways
to Improve Your Programs and Designs – Scott
Meyers.
● More Effective C++ - 35 New Ways to Improve
Your Programs and Designs – Scott Meyers.
● The C++ FAQ: http://www.parashift.com/c++-
faq-lite/

More Related Content

What's hot

extreme Programming
extreme Programmingextreme Programming
extreme Programming
Bilal Shah
 
eXtreme programming (XP) - An Overview
eXtreme programming (XP) - An OvervieweXtreme programming (XP) - An Overview
eXtreme programming (XP) - An Overview
Gurtej Pal Singh
 
extreme programming
extreme programmingextreme programming
extreme programming
himanshumunjal
 
Introduction To Extreme Programming
Introduction To Extreme ProgrammingIntroduction To Extreme Programming
Introduction To Extreme Programming
Joe Drumgoole
 
Extreme Programming (XP) for Dummies
Extreme Programming (XP) for DummiesExtreme Programming (XP) for Dummies
Extreme Programming (XP) for Dummies
Jon McNestrie
 
Extreme Programming (XP) Project Process Flow Chart
Extreme Programming (XP)  Project Process Flow ChartExtreme Programming (XP)  Project Process Flow Chart
Extreme Programming (XP) Project Process Flow Chart
carbrac
 
Software demoguidlines
Software demoguidlinesSoftware demoguidlines
Software demoguidlines
alxides
 
Agile Practices - eXtreme Programming
Agile Practices - eXtreme ProgrammingAgile Practices - eXtreme Programming
Agile Practices - eXtreme Programming
Aniruddha Chakrabarti
 
How to hire top software engineers
How to hire top software engineersHow to hire top software engineers
How to hire top software engineers
Rick Mac Gillis
 
Xp exterme-programming-model
Xp exterme-programming-modelXp exterme-programming-model
Xp exterme-programming-model
Ali MasudianPour
 
Remote Mob Programming
Remote Mob ProgrammingRemote Mob Programming
Remote Mob Programming
Camille Bell
 
Extreme programming
Extreme programmingExtreme programming
Extreme programming
aaina_katyal
 
Kata Your Way to SW Craftsmanship
Kata Your Way to SW CraftsmanshipKata Your Way to SW Craftsmanship
Kata Your Way to SW Craftsmanship
Camille Bell
 
Algorithms
AlgorithmsAlgorithms
Algorithms
AlgorithmsAlgorithms
ABC of Agile (Scrum & Extreme Programming)
ABC of Agile (Scrum & Extreme Programming)ABC of Agile (Scrum & Extreme Programming)
ABC of Agile (Scrum & Extreme Programming)
Amardeep Vishwakarma
 
Going extreme-with-extreme-programming
Going extreme-with-extreme-programmingGoing extreme-with-extreme-programming
Going extreme-with-extreme-programming
Michael Green
 
XP In 10 slides
XP In 10 slidesXP In 10 slides
XP In 10 slides
Robert Burrell Donkin
 
Test Driven Development by Denis Lutz
Test Driven Development by Denis LutzTest Driven Development by Denis Lutz
Test Driven Development by Denis Lutz
jazzman1980
 

What's hot (19)

extreme Programming
extreme Programmingextreme Programming
extreme Programming
 
eXtreme programming (XP) - An Overview
eXtreme programming (XP) - An OvervieweXtreme programming (XP) - An Overview
eXtreme programming (XP) - An Overview
 
extreme programming
extreme programmingextreme programming
extreme programming
 
Introduction To Extreme Programming
Introduction To Extreme ProgrammingIntroduction To Extreme Programming
Introduction To Extreme Programming
 
Extreme Programming (XP) for Dummies
Extreme Programming (XP) for DummiesExtreme Programming (XP) for Dummies
Extreme Programming (XP) for Dummies
 
Extreme Programming (XP) Project Process Flow Chart
Extreme Programming (XP)  Project Process Flow ChartExtreme Programming (XP)  Project Process Flow Chart
Extreme Programming (XP) Project Process Flow Chart
 
Software demoguidlines
Software demoguidlinesSoftware demoguidlines
Software demoguidlines
 
Agile Practices - eXtreme Programming
Agile Practices - eXtreme ProgrammingAgile Practices - eXtreme Programming
Agile Practices - eXtreme Programming
 
How to hire top software engineers
How to hire top software engineersHow to hire top software engineers
How to hire top software engineers
 
Xp exterme-programming-model
Xp exterme-programming-modelXp exterme-programming-model
Xp exterme-programming-model
 
Remote Mob Programming
Remote Mob ProgrammingRemote Mob Programming
Remote Mob Programming
 
Extreme programming
Extreme programmingExtreme programming
Extreme programming
 
Kata Your Way to SW Craftsmanship
Kata Your Way to SW CraftsmanshipKata Your Way to SW Craftsmanship
Kata Your Way to SW Craftsmanship
 
Algorithms
AlgorithmsAlgorithms
Algorithms
 
Algorithms
AlgorithmsAlgorithms
Algorithms
 
ABC of Agile (Scrum & Extreme Programming)
ABC of Agile (Scrum & Extreme Programming)ABC of Agile (Scrum & Extreme Programming)
ABC of Agile (Scrum & Extreme Programming)
 
Going extreme-with-extreme-programming
Going extreme-with-extreme-programmingGoing extreme-with-extreme-programming
Going extreme-with-extreme-programming
 
XP In 10 slides
XP In 10 slidesXP In 10 slides
XP In 10 slides
 
Test Driven Development by Denis Lutz
Test Driven Development by Denis LutzTest Driven Development by Denis Lutz
Test Driven Development by Denis Lutz
 

Similar to Effective cplusplus

OutSystems Tips and Tricks
OutSystems Tips and TricksOutSystems Tips and Tricks
OutSystems Tips and Tricks
OutSystems
 
Deep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software ArchitectureDeep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software Architecture
Matthew Clarke
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++
Andrey Karpov
 
CPP11 - Function Design
CPP11 - Function DesignCPP11 - Function Design
CPP11 - Function Design
Michael Heron
 
On component interface
On component interfaceOn component interface
On component interface
Laurence Chen
 
Practical C++ Generative Programming
Practical C++ Generative ProgrammingPractical C++ Generative Programming
Practical C++ Generative Programming
Schalk Cronjé
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++
PVS-Studio
 
Hexagonal architecture
Hexagonal architectureHexagonal architecture
Hexagonal architecture
Nicolas Guignard
 
Monitoring and automation
Monitoring and automationMonitoring and automation
Monitoring and automation
Ricardo Bánffy
 
Debugging 2013- Lars pedersen
Debugging 2013- Lars pedersenDebugging 2013- Lars pedersen
Debugging 2013- Lars pedersen
Mediehuset Ingeniøren Live
 
Architecture presentation 4
Architecture presentation 4Architecture presentation 4
Architecture presentation 4
Anoushiravan M. Ghamsari
 
Software Bugs A Software Architect Point Of View
Software Bugs    A Software Architect Point Of ViewSoftware Bugs    A Software Architect Point Of View
Software Bugs A Software Architect Point Of View
Shahzad
 
Indy meetup#7 effective unit-testing-mule
Indy meetup#7 effective unit-testing-muleIndy meetup#7 effective unit-testing-mule
Indy meetup#7 effective unit-testing-mule
ikram_ahamed
 
OOSE UNIT-1.pdf
OOSE UNIT-1.pdfOOSE UNIT-1.pdf
OOSE UNIT-1.pdf
KarumuriJayasri
 
Introduction to Embedded Systems
Introduction to Embedded SystemsIntroduction to Embedded Systems
Introduction to Embedded Systems
Sudhanshu Janwadkar
 
Create first android app with MVVM Architecture
Create first android app with MVVM ArchitectureCreate first android app with MVVM Architecture
Create first android app with MVVM Architecture
khushbu thakker
 
Introduction to Software Engineering
Introduction to Software EngineeringIntroduction to Software Engineering
Introduction to Software Engineering
SADEED AMEEN
 
Clean architecture
Clean architectureClean architecture
Clean architecture
.NET Crowd
 
From prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.ioFrom prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.io
Máté Lang
 
Services, tools & practices for a software house
Services, tools & practices for a software houseServices, tools & practices for a software house
Services, tools & practices for a software house
Paris Apostolopoulos
 

Similar to Effective cplusplus (20)

OutSystems Tips and Tricks
OutSystems Tips and TricksOutSystems Tips and Tricks
OutSystems Tips and Tricks
 
Deep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software ArchitectureDeep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software Architecture
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++
 
CPP11 - Function Design
CPP11 - Function DesignCPP11 - Function Design
CPP11 - Function Design
 
On component interface
On component interfaceOn component interface
On component interface
 
Practical C++ Generative Programming
Practical C++ Generative ProgrammingPractical C++ Generative Programming
Practical C++ Generative Programming
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++
 
Hexagonal architecture
Hexagonal architectureHexagonal architecture
Hexagonal architecture
 
Monitoring and automation
Monitoring and automationMonitoring and automation
Monitoring and automation
 
Debugging 2013- Lars pedersen
Debugging 2013- Lars pedersenDebugging 2013- Lars pedersen
Debugging 2013- Lars pedersen
 
Architecture presentation 4
Architecture presentation 4Architecture presentation 4
Architecture presentation 4
 
Software Bugs A Software Architect Point Of View
Software Bugs    A Software Architect Point Of ViewSoftware Bugs    A Software Architect Point Of View
Software Bugs A Software Architect Point Of View
 
Indy meetup#7 effective unit-testing-mule
Indy meetup#7 effective unit-testing-muleIndy meetup#7 effective unit-testing-mule
Indy meetup#7 effective unit-testing-mule
 
OOSE UNIT-1.pdf
OOSE UNIT-1.pdfOOSE UNIT-1.pdf
OOSE UNIT-1.pdf
 
Introduction to Embedded Systems
Introduction to Embedded SystemsIntroduction to Embedded Systems
Introduction to Embedded Systems
 
Create first android app with MVVM Architecture
Create first android app with MVVM ArchitectureCreate first android app with MVVM Architecture
Create first android app with MVVM Architecture
 
Introduction to Software Engineering
Introduction to Software EngineeringIntroduction to Software Engineering
Introduction to Software Engineering
 
Clean architecture
Clean architectureClean architecture
Clean architecture
 
From prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.ioFrom prototype to production - The journey of re-designing SmartUp.io
From prototype to production - The journey of re-designing SmartUp.io
 
Services, tools & practices for a software house
Services, tools & practices for a software houseServices, tools & practices for a software house
Services, tools & practices for a software house
 

More from Mark Veltzer

Gcc
GccGcc
Gcc opt
Gcc optGcc opt
Gcc opt
Mark Veltzer
 
Linux io
Linux ioLinux io
Linux io
Mark Veltzer
 
Linux logging
Linux loggingLinux logging
Linux logging
Mark Veltzer
 
Linux monitoring
Linux monitoringLinux monitoring
Linux monitoring
Mark Veltzer
 
Linux multiplexing
Linux multiplexingLinux multiplexing
Linux multiplexing
Mark Veltzer
 
Linux tmpfs
Linux tmpfsLinux tmpfs
Linux tmpfs
Mark Veltzer
 
Multicore
MulticoreMulticore
Multicore
Mark Veltzer
 
Pthreads linux
Pthreads linuxPthreads linux
Pthreads linux
Mark Veltzer
 
Realtime
RealtimeRealtime
Realtime
Mark Veltzer
 
Streams
StreamsStreams
Streams
Mark Veltzer
 
Volatile
VolatileVolatile
Volatile
Mark Veltzer
 

More from Mark Veltzer (12)

Gcc
GccGcc
Gcc
 
Gcc opt
Gcc optGcc opt
Gcc opt
 
Linux io
Linux ioLinux io
Linux io
 
Linux logging
Linux loggingLinux logging
Linux logging
 
Linux monitoring
Linux monitoringLinux monitoring
Linux monitoring
 
Linux multiplexing
Linux multiplexingLinux multiplexing
Linux multiplexing
 
Linux tmpfs
Linux tmpfsLinux tmpfs
Linux tmpfs
 
Multicore
MulticoreMulticore
Multicore
 
Pthreads linux
Pthreads linuxPthreads linux
Pthreads linux
 
Realtime
RealtimeRealtime
Realtime
 
Streams
StreamsStreams
Streams
 
Volatile
VolatileVolatile
Volatile
 

Recently uploaded

Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
DianaGray10
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 
Must Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during MigrationMust Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during Migration
Mydbops
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
Fwdays
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
Jason Yip
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
Neo4j
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
DanBrown980551
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
christinelarrosa
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
Antonios Katsarakis
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
Enterprise Knowledge
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
christinelarrosa
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
operationspcvita
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
Fwdays
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 

Recently uploaded (20)

Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
Must Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during MigrationMust Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during Migration
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 

Effective cplusplus

  • 3. Micro optimizations ● This presentation is about using the C++ language effectively. ● This means that we will not be talking about how long other parts, which are not your code, are taking. ● Always remember that for a lot of systems the real bottle neck is not within your code but in some other place (too much IO, heavily loaded database, network,...). ● That is why the types of advice in this presentation are sometimes called micro optimizations
  • 4. What we will not be discussing ● There are two very important optimizations that could be considered more important than micro optimizations that we will not be discussing: ● Algorithmic optimizations. ● Architectural optimizations. ● These could bring tremendous improvements to your code but have less to do with the technicalities of the C++ language. ● In reality implementing them does require knowledge of the C++ language but not at the micro optimization level.
  • 5. Reasons against micro optimizations ● Some people consider micro optimizations an evil. ● They claim that since bottle necks are not in your code then you shouldn't pay attention to these types of optimizations since: ● The ease of implementing micro optimizations gives you the feeling that you improved the software even though the bottle neck is really somewhere else and in-fact you just complicated your code with no benefit which means your situation actually got worse. ● They make your code harder to maintain and read. ● They require you to learn something which is not important.
  • 6. Reasons against micro optimizations (cont) ● Even if your bottle neck is within your code you are better off putting effort into either algorithmic or architectural (including multi-core) changes. ● Although these are harder to implement the claim is that per time spent they give better returns. ● Micro optimizations distract you from doing architectural and algorithmic changes which are more important. ● Micro optimizations hide the algorithm or architecture in a sea of details which makes understanding and modifying the software difficult.
  • 7. Reasons against micro optimizations – less agility ● Some claim that micro optimizations make your software less agile. ● Because they make each piece of code more difficult they make the entire software harder to change at the architectural level. ● This means that the software making process will be slower to respond to new requirements. ● This could have business implications as well as technical ones.
  • 8. Reasons against micro optimizations - micro optimizations as premature optimization ● “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil" – Donald Knuth ● There is a connection between micro optimizations and premature optimization because in a lot of cases micro optimizations are done prematurely. ● So when first coding your system – Keep It Simple Stupid – which means avoiding unnecessary micro optimizations.
  • 9. Reasons for micro optimization ● The main point: You need to write your code anyway – so you may as well use the right features and get better performance while you are at it. ● If the language has two features and you use the one which is giving you better optimization then you gain more efficiency without complicating your code. ● Sometimes you are CPU bound and micro optimizations could squeeze more performance out of an existing code base.
  • 10. Reasons for micro optimization ● Sometimes you are willing to live with the downsides of micro optimization (both technical and business). ● Sometimes, just sometimes, your design is good and you think that it may stick around for a while and micro optimizations just make it better. ● In these cases the loss of agility is not a big problem as it seems. ● This happens in mature products more than in young ones.
  • 11. Reasons for micro optimization (cont) ● Algorithmic and Architectural changes are just too difficult. ● Architectural changes sometimes means huge rewrites which companies rarely can handle because of organizational issues. ● Algorithmic changes require lots of theoretical knowledge and sometimes low level details (turning a lock using algorithm into lock free requires understanding memory and machine barriers...)
  • 12. Micro optimizations ● If you are coding in C++ then you, albeit maybe unconsciously, agree that too much low level optimization is evil. ● Otherwise you would be coding in C or assembly. Or maybe designing a dedicate piece of hardware? ● Some people think so highly against micro optimizations that they go for higher level languages like Java/C#/Python/Ruby which prevent these types of optimizations by design.
  • 13. Micro optimizations and the compiler ● The compiler does lots of optimizations which could be classified as micro optimizations: ● In-lining. ● Loop unrolling. ● Constant folding. ● Dead code elimination. ● Common subexpression elimination. ● Loop invariants. ● And many more...
  • 14. Micro optimizations and the compiler ● This means that the micro optimizations we are talking about here are optimizations that the compiler cannot perform. ● In practice, however, it is not always simple to know which optimizations are performed by the compiler. ● Plus compilers tend to become better with time. ● And you may need to support several compilers that have a different feature set.
  • 15. Micro optimizations - conclusions ● Should you use micro optimizations? Yes. ● How many of them? Depends on your product (is it mature, is it's design going to change etc). ● What types of optimizations should you prefer? Ones that don't make your code harder to read and understand and ones that are not already performed by the compiler. ● When to avoid them at all? When your product is young and you are not sure about the design at all. Then they are just a waste of your time.
  • 16. C++ efficiency (comparison to assembly/C)
  • 17. Measuring efficiency ● There are many ways to measure efficiency. ● The various methods may give somewhat different results. ● They are also not equally easy to use. ● They sometimes are in conflict one with the other. ● For instance: you may increase latency and decrease average response time. ● And all measures depend on the dataset used.
  • 18. Measuring efficiency ● Measurement methods include: ● Measuring number of instructions in assembly code. ● Measuring number of instructions issued at run time using CPU counters. ● Measuring time elapsed using CPU counters. ● Measuring time elapsed using hardware clock. ● Measuring time elapsed using OS clock. ● Full profiling in software or hardware.
  • 19. Measuring CPU cycles ● Is one way to measure efficiency. ● In RISC chips this is more accurate since every instruction is executed in just one cycle. ● In CISC chips things are more complicated. One instruction may be over in one clock while others may last many clocks. ● And instructions are sometimes halted to wait for external factors: main memory, other CPU atomic operation, etc...
  • 20. What does measuring CPU cycles mean – assembly programmers? ● That even an assembly programmer writing down instruction explicitly cannot know how long his/her code will run. ● This is contrary to common opinion that “assembly programmers know exactly how long their code runs”. ● It is true that assembly programmers have a firmer grasp on how long their code runs.
  • 21. C complications – function call ● A function call in C may involve many cycles as it needs to push the arguments, then jump to the address and then the inner function needs to get the arguments from the stack, do the calculation, put the result onto the stack and then the outer function will get the result from the stack. ● There are quite a few machine instructions involved in all of these.
  • 22. C complications – function call (cont) ● On the other hand if the inner function is inside a header file and the compiler is an optimizing one it could eliminate the function call altogether. ● And if the arguments are constant the compiler may eliminate the calculation too and just hard code the result. ● So, a function call may range anywhere between one instruction and hundreds or more even for a simple function.
  • 23. C complications – switch statement ● A single switch statement in C can be realized by a compiler in several ways: ● A series of compare-and-branch statements. ● An array of jump locations. ● A perfect hash of jump locations. ● Actually good compilers use all of these combined when optimizing. ● So how long does a “switch” statement take? I donno!
  • 24. C complications - registers ● In C the programmer may advise the compiler about using a register for a variable using the register keyword. ● The programmer, however, does not know if the compiler actually manages to store the variable in a register and does not get feedback about it (aside from disassembly, of course). ● The difference in performance is great as access to register and to memory (even cache memory) is about order of magnitude in difference.
  • 25. C complications – cache lines ● When programmers start programming in C they are told nothing about cache lines, cache misses etc. ● After a while they are told that structures accessed heavily should be small as to fit in a cache line so as to hit the CPU cache and not main memory. ● And after a while more they are told about how synchronization between CPUs works.
  • 26. What does all of this mean? - c programmers ● That even when programming in C the programmer only has vague idea of how his/her program is translated into machine code. ● The programmer has to try and wrap his/her head around various low level concepts (registers, switches, caches) in order to understand the speed of his/her program. ● So analyzing speed is not simple, even in C.
  • 27. What about C++, and higher level? ● In C++ things get even nastier. ● Various basic building blocks like constructor, destructor, virtual method call, exception, copy constructor are even harder to estimate. ● So the feel of the programmer for the runtime gets more vague. ● And in Java the programmer doesn't even know about machine instructions...:)
  • 28. Criticisms of C++ performance (low level) ● Low level people criticize C++ for being too slow. ● As a result they say they'd use assembly or C over C++. ● The claim is that C++ does too much behind the programmers back ● This prevents the programmer from getting a good grasp of what is happening performance wise.
  • 29. Criticisms of C++ performance (high level) ● High level people criticize C++ for being too slow. ● They claim that C++ forces the programmer to focus too much on low level details. ● This make the design suffer and be suboptimal. ● Which in turn leads to bad performance.
  • 30. Is C++ performance bad? ● No. ● It really depends on the programmer. ● And on the programmers knowledge of what C++ does behind the scenes. ● If the programmer knows when and what C++ does behind his/her back then he/she knows how to handle the situation. ● This is exactly what this presentation is for.
  • 31. Is C++ performance bad? (cont) ● Since C++ is a more complicated language it can be abused by programmers who do not know it's features in a more creative manner and create systems that perform more poorly than systems implemented in lower level languages (C, assembly). ● But C++ is usually thrown at big problems where development in C is more difficult and requires object oriented approaches. In these cases C++ abuse is less of an issue since C abuse in 5large systems is an issue also.
  • 32. Is C++ performance bad? (cont) ● So C++ sometimes performs slower and sometimes faster than equivalent systems implemented in lower level languages. ● If the system you are implementing is a large one then using object oriented and C++ has an advantage over C since it is more agile and will allow you to reach a good design with less rewrites. ● Since rewriting is problematic in a corporate environment C++ could actually outperform C in such environments.
  • 33. So what should we do? ● Remember the example of the C function call and the C switch statement? ● C++ has many more of these (virtual functions, exceptions, constructors, destructions, templates and more). ● We should study the language to understand exactly when and what the language does. ● And understand how to use the features effectively.
  • 34. Comparison of effectiveness issues in C++ vs other languages ● Other languages, like Java, also have effectiveness issues. ● The difference is at the level at which the programmer needs to peek in order to understand the issues involved. ● In Java or other higher level languages effectiveness books, courses and presentations deal with higher level considerations than in C++. ● Issue such as CPU cache lines never appear...:) ● Some see this as a weakness of C++.
  • 35. C++ and over design ● Over design is a situation where a programmer creates too much infrastructure in a piece of software, much of which is never used. ● Over design is problematic because it makes the code larger and more complicated and thus making it harder to understand, maintain and develop further. ● Over design also makes your software run slower for no good reason. ● Over design is a notorious problem in object oriented languages and is usually inflicted by beginner programmers.
  • 36. C++ and over design (cont) ● Over design rests on the assumption that you must make your software flexible for future requirements now, regardless if that flexibility will be used in the future. ● Experience shows that programmers make their program flexible where it is convenient for them to do so and not where future requirements will demand that flexibility. ● This means that over design is rarely a good thing.
  • 37. C++ and over design (cont) ● Mature and young products are different in respect to over design and young products tend to exhibit this illness more. ● C++ has potential for over design. ● Since over design means lower performance then C++ has potential for lower performance in that respect. ● C has less over design issues and assembly even less because they are harder to code.
  • 38. C++ and over design (cont) ● All you need to do to solve the problem is simply avoid over design. ● This is very close to the concept of agile development. ● Agile development tries to reconcile the two following statements: ● You don't know the requirements of a piece of software in advance. ● You want to avoid over design.
  • 39. C++ and over design (cont) ● Instead agile development puts an emphasis on the development process: ● Keep code simple. ● Keep the programming environment simple. ● Make it as easy as possible to change and test the code. ● This makes the adjustment to new features easy without over design. ● So one solution to the problem of C++ and over design is agile development.
  • 41. Reasons for efficiency ● Even if hardware advances competition between software vendors remain. Since they use the same hardware the winner will be the one with more software efficiency. ● Communication speed has exceeded CPU speed and so the software is the new bottleneck. Due to faster optical and other communication solutions it will stay that way.
  • 42. Reasons for efficiency (cont) ● CPU speed improvement has slowed down and given way to multi-core/multi-CPU designs. ● This is due to various low level physical problems which are not likely to be lifted in the near future. ● But multi-core/multi-CPU solutions are based on software that is specifically written in order to utilize such systems and so software efficiency is once again the bottle neck.
  • 43. Reasons for efficiency (cont) ● Last but not least is the act of writing software. ● A simple error in programming could cause a piece of software to run 100 times slower than it could. ● Example: NTPL in Linux made thread creation a 1000 times faster. ● Software is still key to efficiency.
  • 44. What type of efficiency?
  • 45. What type of efficiency do we mean? ● Binary size: we don't care about that much these days (data much larger than code). ● Compile time: Most people don't care about this but maybe you should? ● Data size: yes, but C++ data structures tend to be quite efficient and RAM is cheap these days. ● Code efficiency: This is the one which most issues relate to.
  • 46. What type of efficiency do we mean? ● Code efficiency is usually split into: ● Latency/response time/responsiveness ● Throughput. ● These goals are a trade off: increasing one will tend to decrease the other. ● You have to decide where you are in balance between these two opposing objectives. ● Being somewhere in the middle is fine as long as you define clearly exactly where you are.
  • 48. Fast path: definition ● Most software has a lot of code which is only used, if ever, one time or rarely. ● This includes setup, tear down, error handing and more. ● More than 90% of code is like that (sometimes a lot more). ● This is the “boring” code or the “slow path”. ● Fast path is the “non boring” code that actually gets used most of the time. ● One function can include both slow path and fast path code (e.g. it gets used often but has lots of error handling code).
  • 49. Fast path optimizations ● Micro optimizations, and any other optimizations for that matter, only make sense if you actually use them a lot ● This means that the fast path should receive most of the love and care. ● Optimizing code outside of the fast path will not provide better performance and will just complicate your code making it harder to read/understand/modify/adjust to new requirements.
  • 51. How is a class implemented? ● A class is just like a struct with compile time checking of access (private, public, protected). ● It is packed like a struct and you can tell the compiler the offsets of members (although you rarely have to). ● The methods are not within the class. They are nowhere to be found in the class. The compiler arranges for the right object to be passed using compile time type checking. Virtual methods are a different matter.
  • 52. How is a class implemented? (cont) ● The minimum size of a class is 1 byte. ● This is to make sure that memory really is allocated. ● sizeof() cannot also return 0. ● Think of sizeof(array)/sizeof(instance) if sizeof(instance)==0...
  • 53. How is inheritance implemented? ● Inherited classes are just like structs layed out one after the other. ● So if B is derived from A then sizeof (B)=sizeof(A)+sizeof(fields of B). ● Different compiler vendors can pack A before B or vice versa. This is an ABI issue. ● Different compiler vendors can decide on different alignment between the A and the B part of the object. This is an ABI issue.
  • 54. Inheriting from empty classes ● An empty class takes 1 byte. ● But if you define a class derived from it and adding an integer you will find that the size of the derived class is 4 and not 5. ● This is because the compiler knows that the 1 byte in the empty class really doesn't refer to data. ● So inheriting from many empty classes doesn't make your class bigger.
  • 55. How is multi-inheritance implemented? ● If C is derived from A and B then C has it's own fields, A's fields and B's fields (unless it's virtual inheritance). ● The physical layout of these fields is again an ABI issue that is implemented differently by different compiler vendors.
  • 56. Virtual methods ● If a class has at least one virtual method then the size of the object grows by exactly one pointer. ● That pointer points to a structure called a vtable. ● There is exactly one of these in RAM for each class. ● The table holds the method pointers for all virtual methods in that class.
  • 57. Virtual methods (cont) ● The compiler vendors could have held pointers of more vtables inside a vtable but most of them choose to create one flattened vtable per class. ● The order of the functions within the table is compiler dependent ABI issue. It could be order of declaration, alphabetical order, whatever (as long as it is consistent). ● Each call to a virtual method goes through this table.
  • 58. Virtual methods (cont) ● You can think of the vtable pointer as the type of the instance the difference being that classes without virtual methods don't have this “type” information attached to them. ● This is not the same as RTTI which adds info pointed to by the vtables.
  • 59. Virtual methods (cont) ● This means that adding the first virtual method to a class adds a pointer to the size of the instance. ● Adding a virtual method to a parent class will add that pointer to all of it's decedents. ● Virtual methods are one dereference slower to call to. ● And they cannot be in lined unless the compiler can determine the exact type which is rarely the case. ● Any constructor sets up the pointer to the vtable which makes the constructor slower.
  • 60. RTTI ● The ability of the system to report on the dynamic type of an object and to provide information about that type at runtime (as opposed to at compile time) ● When utilized consistently can be a powerful tool to ease the work of the programmer in managing resources. ● Comes at the cost of a little performance. ● Includes typeid and dynamic_cast
  • 61. RTTI - typeid ● typeid – A function that can be given a type and returns std::typeinfo which describes the type. ● You can use this return value for the name of the type (.name()) or to compare type information from two instances. ● According to the manual the name of the instance is implementation (ABI) specific. ● Furthermore, typeid is only guaranteed to work on instances which belong to classes that have at vtable (meaning at least one virtual method).
  • 62. RTTI – typeid (cont) ● In order to work with typeid you must include <typeinfo>. ● Typeid sometimes works at compile time and sometimes leaves code that understands the type at runtime. ● Most compilers allow you to work without typeid and get a little bit more performance (in terms of code and size of vtables). ● In gcc this is the -fno-rtti flag.
  • 63. RTTI - dynamic_cast ● Allows you to try and cast an object to some type at runtime. ● Because it is done at runtime it may fail at runtime and your code should include a check for this. ● Works by traversing the vtables of the objects so it is slow and depends on the number of classes in your hierarchy.
  • 64. RTTI – should you use it? ● Good design usually means not needing to use RTTI which is also good from a performance standpoint. ● typeid returns name() as a string and strings are slow to compare so never use that to determine what to do with an instance. ● dynamic_cast is slow (as explained) and thus should not be used in the fast path. ● Inheritance with virtual methods is pretty fast and should be used to distinguish between types instead.
  • 65. Virtual destructor ● Is required if you: ● Want to destroy the class using delete. ● At the delete point the pointer may point to a derived element. ● Different classes in your hierarchy would like to do different things in the destructor. ● This is not the same as having virtual methods. ● You may have virtual methods and not need a virtual destructor. ● You may have no virtual methods and need a virtual destructor.
  • 67. In-lining as C++ efficiency ● C++ efficiency is based on compile time optimization. ● But the compiler cannot optimize what it cannot see. ● So a lot of patterns in C++ optimization are based on letting the compiler see more of your code. ● This makes the compiler see through a lot of the abstractions that we need as developers and still do only the things that need to get done. ● We maintain a layered software which is good for maintenance while the compiler does the right thing. Good deal.
  • 68. Problems of in-lining ● The problem of in-lining is that you need to let the compiler see more of your code. ● This means you have to write more in h files. ● Which makes the compilation longer. ● And modifications to implementation now causes long project recompiles. ● This is similar to the problems of templates in C++. ● Only use when you really need the performance boost.
  • 69. Advantages of in-lining ● Avoiding yet another function call. Function call is not cheap since it uses the stack RAM. ● Avoiding storing registers into ram before the functions call and collecting them after the call. Since RAM access is long compared to register access then this is a very good speedup. ● Better allocation of registers since the allocation is combined between the inner and outer functions. ● Allow the compiler to do dead code elimination based on constants passed to the inner function.
  • 70. Advantages of in-lining ● Allows the compiler to reorder instructions between the outer and inner functions and so utilize the hardware better. ● Less code executed and so less instruction cache misses. ● Allows the compiler to do loop invariant optimization if the inner function is called from a loop in the outer.
  • 71. Advantages of in-lining ● Allows the compiler to know whether pointers passed to the inner function are really changed by it. If they are not (as in many cases) the compiler can avoid load and stores to that address. This is even in the absence of correct const statements. ● A function call is an optimization barrier in the outer function and it is now removed.
  • 72. In-lining - conclusions ● In-lining provides many, many advantages to C++ programmers. ● Obviously you have to compile with optimization to get those advantages. ● And write code in header files. ● When you do optimize in-lining occurs even in non header files. ● A barrier to in-lining is calling an unknown function which usually happens when calling a virtual function. So try to avoid those in the fast path.
  • 74. Fast construction and destruction ● Consider the need to hold a resource which must be released (mutex, file,...) ● A common pattern is to put this resource into a class that releases the resource upon destruction. ● But if this pattern is used in the fast path then we must make sure that constructors and destructors will not hurt our performance.
  • 75. Fast construction and destruction ● Do not allocate on the heap – do it on the stack. ● If you do use the stack then there is no “delete with unknown type” and so less need for a virtual destructor. ● Try to avoid the need for a virtual destructor in order to avoid the need to go through the virtual table and the function call. ● This does not mean you cannot use inheritance.
  • 76. Fast construction and destruction ● Avoid virtual functions in the object in order to avoid the extra virtual pointer assignment in the constructor. ● In line the class by putting it into a header file. ● This will make sure that the construction and destruction will not trigger a function call which is expensive. ● Try to avoid composition in this case as it may create even more work which involves the heap.
  • 77. Fast construction and destruction ● Remember that constructors construct all members if not explicitly initialized using member construction syntax. ● And their parents... ● And so do destructors. ● Try to keep fast classes as simple as possible.
  • 78. Lazy construction ● If some fast paths do not require everything that you do inside the constructor then avoid calling that work in the constructor. ● Example could be some inner member composition initialization. ● Instead only construct them lazily when needed. ● This requires careful coding that makes sure that the member is not used unless initialized.
  • 79. Redundant construction ● When ever an object is composed of other objects and the constructor is triggered it creates the other objects using their default constructors. ● Only then is the constructor run. ● This means that if you initialize the member only within your construction you are paying for redundant construction. ● Instead initialize with Class::Class():attr(args)
  • 80. Avoid creation of temporary variables ● If you have an assignment of two types and an exact assignment operator is missing then it may be fulfilled by constructing a temporary out of the right hand side and then calling the assignment operator. ● This may be expensive. ● Try to make sure you have the right assignment operators in place to avoid these costly constructions of temporaries.
  • 81. Virtual functions cost ● The assignment of the vptr. ● The extra dereference in method calls. ● The inability of the compiler to in-line the function. ● The cost of the first two seems small. The cost of the third is unknown and may be large depending on the circumstances.
  • 82. Virtual functions cost – when should you care? ● The inability to in-line is indeed problematic but remember that the function you did not inline could be in-lined internally. ● This means that what you really care about is the call/return cost+more in-lining that is possible out of seeing the two pieces of code together. ● If your function is long then you may not care about this at all – so you should care only if your function should be very fast.
  • 83. Virtual functions cost – solution 1 ● Avoid the virtual function at all. ● You may not really need virtual on it (are you over designing?) ● Write two classes instead of one (do you really want to mix them or are you just using one in any particular run?) ● Use a macro to decide which to use in each run?!?
  • 84. Virtual functions cost – solution 2 ● Pass your class to the using class as a template parameter. ● Templates are more performance friendly than inheritance hierarchies. ● They push type resolution from runtime to compile time (which we consider to be free). ● If the template class is in-line ready then it will be in-lined in the using class.
  • 86. References ● Efficient C++ Performance Programming Techniques – Dov Bulka, David Mayhew. ● Effective C++ - Third Edition – 55 Specific Ways to Improve Your Programs and Designs – Scott Meyers. ● More Effective C++ - 35 New Ways to Improve Your Programs and Designs – Scott Meyers. ● The C++ FAQ: http://www.parashift.com/c++- faq-lite/