Code Tuning
Transcript

  • 1. Can you spot any difference?
        for (i = 0; i < count; i++) {
            if (SumType == Net) {
                NetSum = NetSum + Amount[i];
            } else { /* SumType == Gross */
                GrossSum = GrossSum + Amount[i];
            }
        }

        if (SumType == Net) {
            for (i = 0; i < count; i++) {
                NetSum = NetSum + Amount[i];
            }
        } else { /* SumType == Gross */
            for (i = 0; i < count; i++) {
                GrossSum = GrossSum + Amount[i];
            }
        }
  • 2. What did you observe?
    • In the first version, the test if(SumType==Net) is repeated on every iteration, even though its result is the same each time through the loop.
    • The second version un-switches the loop by making the decision once, outside the loop.
  • 3. Let's see how the speed varies…
        Language   Straight Time   Code-Tuned Time   Time Savings
        C          1.81            1.43              21%
        Basic      4.07            3.30              19%
  • 4. Thus you can optimize your code using code tuning techniques. Code tuning means changing the implementation of a design rather than changing the design itself.
  • 5. The main purpose of this presentation is to illustrate a handful of techniques that you can adapt to your situation.
    • Unswitching
    • Unrolling
    • Jamming loops
    • Caching frequently used values
    • Initializing data at compile time
    • Using sentinels in search loops
    • Putting the busiest loop on the inside
    • Pre-computing results
    • Using integers instead of floating-point variables
  • 6.
    • Un-Switching:
    • The example discussed earlier illustrates this technique.
    • The hazard in this case is that the two loops have to be maintained in parallel. If count changes, you have to change it in both places. You can handle this by introducing another variable, using it in the loop, and assigning it to NetSum or GrossSum after the loop.
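    The workaround described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names sum_type, totals, and add_amounts are hypothetical and do not come from the slides, and the amounts are assumed to be doubles.

```c
#include <stddef.h>

/* Hypothetical types for illustration; the slides only name
   SumType, Net, Gross, NetSum, GrossSum, and Amount. */
enum sum_type { NET, GROSS };

struct totals {
    double net_sum;
    double gross_sum;
};

/* One loop accumulates into a local variable; the decision about
   which total to update is made once, after the loop. */
void add_amounts(struct totals *t, enum sum_type type,
                 const double *amount, size_t count)
{
    double sum = 0.0;                 /* the extra variable */
    for (size_t i = 0; i < count; i++) {
        sum += amount[i];
    }
    if (type == NET) {
        t->net_sum += sum;
    } else {                          /* type == GROSS */
        t->gross_sum += sum;
    }
}
```

    With a single loop there is only one place to edit if count or Amount changes. Because the amounts are summed before being added to the total, floating-point rounding may differ slightly from the two-loop version.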
    • Unrolling:
    • The goal of loop unrolling is to reduce the amount of loop housekeeping.
    • In the following example the loop is unrolled once, so that each pass handles two elements. Although the number of lines increased, the unrolled version was measured to be faster.
  • 7. Ada examples
        Straightforward loop:
            i := 1;
            while (i <= Num) loop
                a(i) := i;
                i := i + 1;
            end loop;

        Loop unrolled once:
            i := 1;
            while (i < Num) loop
                a(i) := i;
                a(i+1) := i + 1;
                i := i + 2;
            end loop;
            if (i = Num) then
                a(Num) := Num;
            end if;

    The measured difference in performance:
        Language   Straight Time   Code-Tuned Time   Time Savings
        Ada        1.04            0.82              21%
        C          1.59            1.15              28%
  • 8. When 5 lines of straightforward code expand to 9 lines of tricky code, the code becomes harder to read and maintain. Except for the gain in speed, its quality is poor. Further loop unrolling can result in further time savings.
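    As a rough sketch of what further unrolling might look like, here is a C version that handles four elements per pass. The function names are hypothetical, the array is 0-based rather than 1-based as in the Ada example, and whether it is actually faster has to be measured: many optimizing compilers can unroll loops on their own.

```c
#include <stddef.h>

/* Straightforward version: one element per pass. */
void fill_straight(int *a, size_t num)
{
    for (size_t i = 0; i < num; i++) {
        a[i] = (int)i;
    }
}

/* Unrolled by four: the loop index is tested and incremented
   once for every four assignments. */
void fill_unrolled4(int *a, size_t num)
{
    size_t i = 0;

    /* Main loop: the test i + 3 < num keeps all four stores in bounds. */
    for (; i + 3 < num; i += 4) {
        a[i]     = (int)i;
        a[i + 1] = (int)(i + 1);
        a[i + 2] = (int)(i + 2);
        a[i + 3] = (int)(i + 3);
    }

    /* Cleanup loop: handles the 0 to 3 elements left over. */
    for (; i < num; i++) {
        a[i] = (int)i;
    }
}
```

    The cleanup loop plays the same role as the trailing if statement in the Ada example: it covers the element counts that are not a multiple of the unrolling factor.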
    • Cache frequently used values:
    Caching means saving a few values in such a way that you can retrieve the most commonly used values more easily than the less commonly used ones. For example, the following code caches the result of a time-consuming computation, which is practical here because the parameters to the calculation are simple.
  • 9. Pascal examples of caching
        Straightforward version:
            function Hypotenuse
            (
                SideA: real;
                SideB: real
            ): real;
            begin
                Hypotenuse := sqrt(SideA*SideA + SideB*SideB);
            end;

        Version with a cache:
            function Hypotenuse
            (
                SideA: real;
                SideB: real
            ): real;
            const
                CachedHypotenuse: real = 0;
                CachedSideA: real = 0;
                CachedSideB: real = 0;
            begin
                if ((SideA = CachedSideA) and
                    (SideB = CachedSideB)) then
                begin
                    Hypotenuse := CachedHypotenuse;
                    exit;
                end;
                CachedHypotenuse := sqrt(SideA*SideA + SideB*SideB);
                CachedSideA := SideA;
                CachedSideB := SideB;
                Hypotenuse := CachedHypotenuse;
            end;
  • 10. The speed difference between the above two versions:
        Language   Straight Time   Code-Tuned Time   Time Savings
        Pascal     2.63            2.25              14%
        C          1.99            1.93              3%
    The second version of the routine is more complicated than the first and takes up more space, so speed has to be at a premium to justify it.
    • The success of the cache depends on:
    • The relative costs of accessing a cached element, creating an uncached element, and saving a new element in the cache.
    • How often the cached information is requested.
    • Caching done by the hardware.
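    To get a feel for the second factor, how often the cached information is requested, a one-entry cache like the Pascal one can be given hit and miss counters. The C sketch below is an assumption-laden illustration: slow_divisor_sum is a hypothetical stand-in for any time-consuming computation, and none of these names appear in the slides.

```c
/* Hypothetical stand-in for a time-consuming computation:
   the sum of all divisors of n, found by trial division. */
static unsigned long slow_divisor_sum(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long d = 1; d <= n; d++) {
        if (n % d == 0) {
            sum += d;
        }
    }
    return sum;
}

/* Counters that show how often the cache actually helps. */
static unsigned long cache_hits = 0;
static unsigned long cache_misses = 0;

/* One-entry cache: remembers only the most recent argument and result. */
unsigned long cached_divisor_sum(unsigned long n)
{
    static int cache_valid = 0;
    static unsigned long cached_n = 0;
    static unsigned long cached_result = 0;

    if (cache_valid && n == cached_n) {
        cache_hits++;
        return cached_result;         /* cheap path: no recomputation */
    }

    cache_misses++;
    cached_result = slow_divisor_sum(n);
    cached_n = n;
    cache_valid = 1;
    return cached_result;
}
```

    If cache_hits stays small compared with cache_misses for a realistic workload, the extra comparisons and stores are probably not paying for themselves.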
  • 11.
    • Initialization at compile time:
    If a named constant or a magic number is being used in a routine call and it’s the only argument, that’s a clue that you could pre-compute the number, put it into a constant, and avoid the routine call.
        C example of a log-base-two routine based on a system routine:
            unsigned int log2(unsigned int x)
            {
                return (unsigned int)(log(x) / log(2));
            }

        C example of a log-base-two routine based on a system routine and a constant:
            unsigned int log2(unsigned int x)
            {
                return (unsigned int)(log(x) / LOG2);
            }
    Note: LOG2 is a named constant equal to 0.69314718.
  • 12.
        Language   Straight Time   Code-Tuned Time   Time Savings
        Pascal     2.63            2.25              14%
        C          1.99            1.93              3%
    • Sentinel values:
    A sentinel value is a value that you put just past the end of the search range and that is guaranteed to terminate the search. Using a sentinel, you can simplify a loop with compound tests and save time.
  • 13.
        C example of compound tests in a search loop:
            Found = FALSE;
            i = 0;
            while ((!Found) && (i < ElementCount))
            {
                if (Item[i] == TestValue)
                    Found = TRUE;
                else
                    i++;
            }
            if (Found)
                ...

        C example of using a sentinel value to speed up a loop:
            InitialValue = Item[ElementCount];
            Item[ElementCount] = TestValue;
            i = 0;
            while (Item[i] != TestValue)
            {
                i++;
            }
            Item[ElementCount] = InitialValue;
            if (i < ElementCount)
                ...
  • 14.
    • In the first version, each iteration of the loop tests for !Found and for i<ElementCount, and inside the loop it tests whether the desired element has been found.
    • In the second version, the three tests are combined so that you test only once per iteration, by putting a sentinel at the end of the search range to stop the loop.
        Language   Straight Time   Code-Tuned Time   Time Savings
        C++        6.21            4.45              28%
        Pascal     0.21            0.10              52%
        Basic      6.65            0.60              91%
    Note: you must choose the sentinel value carefully and must be careful about how you put it into the array or linked list.
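    One way to act on the note above is to wrap the sentinel search in a routine that states its requirement explicitly. The sketch below is an illustration, not the presentation's code: the name sentinel_search is hypothetical, and it assumes the caller's array has room for count + 1 elements so that the slot just past the search range can be borrowed.

```c
#include <stddef.h>

/* Returns the index of the first element equal to target, or -1 if
   target is not among the first count elements.
   Assumption: item has room for at least count + 1 elements, because
   item[count] is temporarily overwritten with the sentinel. */
long sentinel_search(int *item, size_t count, int target)
{
    int saved = item[count];      /* remember what occupies the sentinel slot */
    size_t i = 0;

    item[count] = target;         /* plant the sentinel */
    while (item[i] != target) {   /* single test per iteration */
        i++;
    }
    item[count] = saved;          /* restore the original value */

    return (i < count) ? (long)i : -1L;
}
```

    Because the routine writes to the array, it is not suitable for read-only data or for an array that other threads are reading at the same time.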
  • 15.
    • Putting the busiest loop on the inside:
    When you have nested loops, think about which loop you want on the outside and which you want on the inside. Here is a Pascal example of a nested loop that can be improved:
        for column := 1 to 100 do begin
            for row := 1 to 5 do begin
                sum := sum + Table[row, column]
            end
        end;
    • Each time a loop executes, it has to initialize the loop index, increment it on each pass through the loop, and check it after each pass.
    • The total number of loop executions = 100 + (100 * 5) = 600: 100 passes of the outer loop plus 100 * 5 passes of the inner loop.
    • When the inner and outer loops are switched, the total number of loop executions = 5 + (100 * 5) = 505.
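    The two totals can be checked with a small counting routine. This C sketch simply tallies passes of both loops in the way the bullets above describe; the function name is hypothetical.

```c
/* Counts the passes made by a two-level loop nest: one count for each
   pass of the outer loop plus one for each pass of the inner loop. */
unsigned long count_loop_executions(unsigned long outer, unsigned long inner)
{
    unsigned long executions = 0;

    for (unsigned long i = 0; i < outer; i++) {
        executions++;                     /* one pass of the outer loop */
        for (unsigned long j = 0; j < inner; j++) {
            executions++;                 /* one pass of the inner loop */
        }
    }
    return executions;
}
```

    Calling count_loop_executions(100, 5) gives 600 and count_loop_executions(5, 100) gives 505. The inner-loop work is the same in both orders; only the outer-loop overhead changes.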
  • 16. The measured difference in performance:
        Language   Straight Time   Code-Tuned Time   Time Savings
        Pascal     2.53            2.42              4%
        Ada        2.14            2.03              5%
    • Use integers instead of floating point variables:
    • As we know, integer addition and multiplication tend to be much faster than floating point, at least when floating point is implemented in software rather than hardware.
    • For example, just changing a loop index from a floating-point variable to an integer can save time.
  • 17. How much difference does it make?
        Basic example of a loop that uses a time-consuming floating-point loop index:
            for i = 1 to 100
                x( i ) = 0
            next i

        Basic example of a loop that uses a timesaving integer loop index:
            for i% = 1 to 100
                x( i% ) = 0
            next i%

        Language   Straight Time   Code-Tuned Time   Time Savings
        Pascal     2.53            2.42              4%
        Ada        2.14            2.03              5%
  • 18.
    • Pre-compute results
    • A common low-level design decision is whether to compute results on the fly or to compute them once, save them, and look them up as needed.
    • If the results are used many times, it's often cheaper to compute them once and look them up.
    • At the simplest level, you might compute part of an expression outside a loop rather than inside.
    The measured difference in performance:
        Language   Straight Time   Code-Tuned Time   Time Savings
        Pascal     3.13            0.82              74%
        Ada        9.39            2.09              78%
  • 19. Pascal example on precomputing results
        Version that computes the payment on the fly:
            function ComputePayment
            (
                LoanAmount: longint;
                Months: integer;
                InterestRate: real
            ): real;
            begin
                ComputePayment := LoanAmount /
                    (
                    (1.0 - Power(1.0 + (InterestRate/12.0), -Months)) /
                    (InterestRate/12.0)
                    )
            end;

        Version that looks up a precomputed divisor:
            function ComputePayment
            (
                LoanAmount: longint;
                Months: integer;
                InterestRate: real
            ): real;
            var
                InterestIdx: integer;
            begin
                InterestIdx := Round((InterestRate - LOWEST_RATE) * GRANULARITY * 100.0);
                ComputePayment := LoanAmount / LoanDivisor[InterestIdx, Months]
            end;
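    The slides do not show how the LoanDivisor table is filled. The C sketch below shows one way it might be built once at startup. Every constant in it (the rate range, the granularity, the maximum number of months) is an assumption chosen for illustration, and the function names are hypothetical.

```c
/* Assumed values for illustration; the slides do not define them. */
#define LOWEST_RATE  0.05   /* lowest annual rate in the table (5%)  */
#define GRANULARITY  4      /* table entries per percentage point    */
#define RATE_STEPS   61     /* covers 5% to 20% in quarter-point steps */
#define MAX_MONTHS   360

static double loan_divisor[RATE_STEPS][MAX_MONTHS + 1];

/* (1 - (1 + r)^-months) / r, where r is the monthly rate.
   The power is computed by repeated multiplication. */
static double compute_divisor(double annual_rate, int months)
{
    double monthly = annual_rate / 12.0;
    double growth = 1.0;
    for (int m = 0; m < months; m++) {
        growth *= 1.0 + monthly;
    }
    return (1.0 - 1.0 / growth) / monthly;
}

/* Fill the table once; row 0 of the months axis is left unused. */
void init_loan_divisors(void)
{
    for (int idx = 0; idx < RATE_STEPS; idx++) {
        double rate = LOWEST_RATE + idx / (GRANULARITY * 100.0);
        for (int months = 1; months <= MAX_MONTHS; months++) {
            loan_divisor[idx][months] = compute_divisor(rate, months);
        }
    }
}

/* Computes the payment on the fly. */
double compute_payment_direct(long loan_amount, int months, double rate)
{
    return loan_amount / compute_divisor(rate, months);
}

/* Looks the divisor up. The caller must keep rate within the table's
   range and months within 1..MAX_MONTHS; no range check is done here. */
double compute_payment_lookup(long loan_amount, int months, double rate)
{
    int idx = (int)((rate - LOWEST_RATE) * GRANULARITY * 100.0 + 0.5);
    return loan_amount / loan_divisor[idx][months];
}
```

    The lookup version only matches the direct version for rates that fall on a table step. Rates in between are rounded to the nearest step, which is part of the trade-off this technique makes.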
  • 20. Conclusion: Fundamental Rules
    • Code Simplification: Most fast programs are simple. Therefore, keep code simple to make it faster.
    • Problem Simplification: To increase efficiency of a program, simplify the problem it solves.
    • Relentless Suspicion: Question the necessity of each instruction in a time-critical piece of code and each field in a space-critical data structure.
    • Early Binding: Move work forward in time. Specifically, do work now just once in hope of avoiding doing it many times later.
  • 21.
    • Code tuning is a little like nuclear energy. It's a controversial, emotional topic. Some people think it's so detrimental to reliability and maintainability that they won't do it at all. Others think that with proper safeguards, it's beneficial. If you decide to use the techniques discussed, apply them with utmost care.
    Key Points
    • Results of optimizations vary widely with different languages, compilers, and environments. Without measuring a specific optimization, you'll have no idea whether it will help or hurt your program.
    • The first optimization is often not the best. Even after you find a good one, keep looking for one that's better.