Speed-up Windows Image Scaler
Algorithm
Shereef Shehata
Scaler Algorithm Parameters
• Input_fragment → frag_in
• Represents the portion of the input pixel available to contribute
towards the production of the output pixel
• Output_fragment → frag_out
• Represents the portion of the input pixel needed to complete
the production of an output pixel
• Accumulator → Acc
• Represents the temporary storage of the input pixel
contribution
• Scale Factor → Scale
• Inverse Scale Factor → Inv_scale = 1/Scale
Scaler Algorithm Initialization
frag_in = In_frag_input
frag_out = 1/Scale = Inv_scale
Acc = 0
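The parameters and their initial values can be collected into a small state record. The following is a minimal Python sketch, not the author's implementation: the names `ScalerState` and `init_state` are hypothetical, and treating In_frag_input as an initial phase that defaults to 1.0 (a whole, untouched first pixel) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ScalerState:
    """Working registers of the scaler (field names follow the slides)."""
    frag_in: float    # portion of the current input pixel still available
    frag_out: float   # portion of input still needed to finish the output pixel
    acc: float        # accumulated contribution to the current output pixel
    scale: float      # Scale
    inv_scale: float  # Inv_scale = 1 / Scale


def init_state(scale: float, in_frag_input: float = 1.0) -> ScalerState:
    """Initialization: frag_in = In_frag_input, frag_out = Inv_scale, Acc = 0.

    Assumption: In_frag_input defaults to 1.0, i.e. the first input pixel
    is fully available.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    inv_scale = 1.0 / scale
    return ScalerState(frag_in=in_frag_input, frag_out=inv_scale,
                       acc=0.0, scale=scale, inv_scale=inv_scale)


state = init_state(0.5)  # 2:1 decimation, so Inv_scale = 2.0
print(state)
```

With Scale = 0.5, each output pixel needs two input pixels' worth of contribution, which is what frag_out = 2.0 expresses.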
Scaler Algorithm: Input Pixel Consumption Cycle
• Required Condition to be an Input cycle
• frag_in < frag_out
• The condition for an input pixel consumption cycle is that the
available input pixel fragment, frag_in, is less than the
required input fragment to produce an output pixel, frag_out
Input Pixel Consumption Cycle
• The input pixel will be used up
• src_row or src_col is incremented
• The current input pixel value is multiplied by frag_in
and added to the accumulator
• frag_out is reduced by the value of frag_in
consumed
• The frag_in value is then reset to 1.0 to reflect the fetch
of a new input pixel.
• The next input pixel is fetched
• The algorithm returns to the comparison state
between frag_in and frag_out
Input Pixel Consumption Cycle
Condition for Input Pixel Consumption Cycle
frag_in < frag_out
Update the Accumulators (using frag_in before it is reset)
R_Acc = R_Acc + R * frag_in;
Update frag_out and initialize frag_in to reflect
Input Pixel Consumption
frag_out = frag_out - frag_in;
frag_in = 1.0;
Fetch a new input Pixel
src_row = src_row + 1;
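The input cycle above can be sketched as a single function. This is an illustrative Python sketch under stated assumptions: the function name `input_cycle`, the tuple return value, and the `fetch_next` flag (standing in for the src_row increment) are all hypothetical and not taken from the slides.

```python
def input_cycle(pixel, frag_in, frag_out, acc):
    """One input pixel consumption cycle; valid only when frag_in < frag_out.

    Returns the updated (frag_in, frag_out, acc) plus a flag telling the
    caller to advance src_row / src_col and fetch the next pixel.
    """
    if not frag_in < frag_out:
        raise ValueError("not an input cycle: need frag_in < frag_out")
    acc = acc + pixel * frag_in      # accumulate with the OLD frag_in
    frag_out = frag_out - frag_in    # less input is now needed for the output
    frag_in = 1.0                    # the freshly fetched pixel is fully available
    fetch_next = True
    return frag_in, frag_out, acc, fetch_next


# 2:1 decimation: frag_out starts at Inv_scale = 2.0
print(input_cycle(pixel=100.0, frag_in=1.0, frag_out=2.0, acc=0.0))
# (1.0, 1.0, 100.0, True)
```

The accumulator update comes first on purpose: once frag_in has been reset to 1.0, the weight that belonged to the consumed pixel is no longer available.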
Scaler Algorithm: Output Pixel Production Cycle
• Required Condition to be an Output cycle
• frag_out < frag_in
• The condition for an output pixel production cycle is that the
input fragment still required to complete the output pixel, frag_out,
is less than the available input pixel fragment, frag_in
Output Pixel Production Cycle
• The output pixel will be completed
• The current input pixel value is multiplied by
frag_out and added to the accumulator
• frag_in is reduced by the value of frag_out
consumed
• The frag_out value is reset to Inv_scale to mark
the start of the computation of a new output pixel.
• The contents of the accumulator are scaled by Scale
to produce the output pixel.
• The Accumulator is zeroed out
• The algorithm returns to the comparison state
between frag_in and frag_out
Output Pixel Production Cycle
Condition for Output Pixel Production Cycle
frag_out < frag_in
Update the Accumulators (using frag_out before it is reset)
R_Acc = R_Acc + R * frag_out;
R_out = R_Acc * Scale;
tar_row = tar_row + 1;
R_Acc = 0;
Update frag_in and initialize frag_out to reflect
Output Pixel Generation
frag_in = frag_in - frag_out;
frag_out = Inv_scale;
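The output cycle can be sketched the same way. This is a minimal Python sketch, not the author's code: the function name `output_cycle` and its tuple return value are hypothetical, and the tar_row increment is left to the caller.

```python
def output_cycle(pixel, frag_in, frag_out, acc, scale):
    """One output pixel production cycle; the slides require frag_out < frag_in.

    Returns (out_pixel, frag_in, frag_out, acc) as they stand after the cycle.
    The current input pixel is NOT consumed: frag_in keeps its remainder.
    """
    if not frag_out < frag_in:
        raise ValueError("not an output cycle: need frag_out < frag_in")
    acc = acc + pixel * frag_out     # last contribution to this output pixel
    out_pixel = acc * scale          # normalize the accumulated sum
    frag_in = frag_in - frag_out     # what is left of the current input pixel
    frag_out = 1.0 / scale           # Inv_scale: a new output pixel starts
    acc = 0.0
    return out_pixel, frag_in, frag_out, acc


# 2x upscaling (Scale = 2.0, Inv_scale = 0.5): half a pixel makes one output
print(output_cycle(pixel=100.0, frag_in=1.0, frag_out=0.5, acc=0.0, scale=2.0))
# (100.0, 0.5, 0.5, 0.0)
```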
The Scaler Performance Issue
• An output pixel generation cycle does not
necessarily consume the input pixel used to
generate the output pixel.
• The algorithm retains the current input pixel used in
producing the output pixel until it is
entirely consumed in an input consumption cycle.
• In the Decimation case, this can result in the same
input pixel being retained for more than one cycle, which
hinders performance.
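The retention effect can be made visible by tracing the unmerged algorithm over one row. The sketch below is illustrative only. The function name `scale_row_baseline` and the trace format are hypothetical, and three behaviors are assumptions the slides do not specify: frag_in starts at 1.0, a tie (frag_in equal to frag_out) is handled as an output cycle, and the loop stops when the input row is exhausted.

```python
def scale_row_baseline(pixels, scale):
    """Unmerged scaler over one row; returns (outputs, trace).

    trace holds one ("in" | "out", src_index) entry per cycle.
    Assumptions: frag_in starts at 1.0, a tie (frag_in == frag_out) is
    handled as an output cycle, and the loop ends when input runs out.
    """
    inv_scale = 1.0 / scale
    frag_in, frag_out, acc = 1.0, inv_scale, 0.0
    src, outputs, trace = 0, [], []
    while src < len(pixels):
        r = pixels[src]
        if frag_in < frag_out:                 # input consumption cycle
            trace.append(("in", src))
            acc += r * frag_in
            frag_out -= frag_in
            frag_in = 1.0
            src += 1
        else:                                  # output production cycle
            trace.append(("out", src))
            acc += r * frag_out
            outputs.append(acc * scale)
            frag_in -= frag_out
            frag_out = inv_scale
            acc = 0.0
    return outputs, trace


outputs, trace = scale_row_baseline([10.0, 20.0, 30.0, 40.0], scale=0.5)
print(outputs)  # [15.0, 35.0]
print(trace)
# [('in', 0), ('out', 1), ('in', 1), ('in', 2), ('out', 3), ('in', 3)]
```

In this 2:1 decimation run, pixels 1 and 3 each occupy two cycles: one to finish an output pixel and one more just to be released. Four input pixels and two output pixels cost six cycles.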
What condition needs to be detected for Speed-up
• At the output pixel generation cycle
• Detect whether the next cycle will be an input cycle, in which the
remaining input pixel fragment will be completely consumed
• This has a higher probability in Decimation.
• It is in fact certain to occur in Decimation, as the nature of
decimation requires more than one input pixel to contribute
to an output pixel (Inv_scale > 1, while frag_in never exceeds 1.0)
• The next input consumption cycle can be merged
with the current output pixel generation cycle.
• This implies that the current input pixel is
consumed in the output pixel generation cycle, which
saves an input pixel consumption cycle.
What condition needs to be detected for Speed-up
• At the output pixel generation cycle
• As this is an output cycle we have:
• frag_out(curr) < frag_in(curr)
• As we are detecting an input consumption cycle next
• We have for the next cycle
• frag_in(next) < frag_out(next)
• frag_out(next) = Inv_scale
• The frag_in(next) is computed as
• frag_in(next) = frag_in(curr) - frag_out(curr)
• frag_in(next) >= 0 since this is an output cycle.
What condition needs to be detected for Speed-up
• At the output pixel generation cycle
• As this is an output cycle we have:
• frag_out(curr) < frag_in(curr)
• Detect the following condition
• frag_in(curr) - frag_out(curr) < Inv_scale
• What remains of the input pixel fragment is less than what
is needed to produce an output pixel
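The detection condition reduces to a single comparison. The predicate below is a small Python sketch; the name `can_merge` is hypothetical. It also checks, over a grid of sample values, the claim that the condition always holds in decimation: with Inv_scale > 1 and frag_in at most 1.0, the remainder can never reach Inv_scale.

```python
def can_merge(frag_in, frag_out, inv_scale):
    """True when the cycle following this output cycle would be an input cycle.

    frag_in(next)  = frag_in - frag_out
    frag_out(next) = inv_scale
    """
    remainder = frag_in - frag_out   # frag_in(next)
    return remainder < inv_scale     # frag_in(next) < frag_out(next)


# Decimation (Scale = 0.5, Inv_scale = 2.0): always mergeable
print(can_merge(1.0, 0.5, 2.0))    # True

# 4x upscaling (Inv_scale = 0.25): 0.75 of the pixel is left, which is
# enough for further output pixels, so the next cycle is not an input cycle
print(can_merge(1.0, 0.25, 0.25))  # False

# Sample check for decimation: frag_in in (0, 1], frag_out < frag_in
steps = [i / 8 for i in range(1, 9)]
always = all(can_merge(fi, fo, 2.0)
             for fi in steps for fo in steps if fo < fi)
print(always)                      # True
```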
Merged Output Pixel Production / Input Consumption Cycle
Condition for Output Pixel Production Cycle
frag_out < frag_in
Update the Accumulators
R_Acc = R_Acc + R * frag_out
R_out = R_Acc * Scale
tar_row = tar_row + 1 (produce o/p pixel)
if (frag_in(curr) - frag_out(curr) < Inv_scale) {
    frag_in(next) = frag_in(curr) - frag_out(curr) (remainder of the current pixel)
    R_Acc = frag_in(next) * R
    frag_out(next) = Inv_scale - frag_in(next)
    frag_in(next) = 1.0
    src_row = src_row + 1 (fetch a new input pixel) }
else { normal output cycle }
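To compare the merged cycle against the original algorithm, both can be put behind one switch and run over the same row. This is a sketch under stated assumptions, not the author's implementation: the function name `scale_row` and its `merge` flag are hypothetical, frag_in is assumed to start at 1.0, a tie (frag_in equal to frag_out) is assumed to be handled as an output cycle, and the loop is assumed to stop when the input row is exhausted.

```python
def scale_row(pixels, scale, merge=True):
    """Scale one row; returns (outputs, cycle_count).

    merge=False runs separate input/output cycles; merge=True folds the
    following input cycle into the output cycle when the remainder of the
    current pixel is smaller than Inv_scale.
    Assumptions: frag_in starts at 1.0, ties count as output cycles, and
    processing ends when the input row runs out.
    """
    inv_scale = 1.0 / scale
    frag_in, frag_out, acc = 1.0, inv_scale, 0.0
    src, outputs, cycles = 0, [], 0
    while src < len(pixels):
        r = pixels[src]
        cycles += 1
        if frag_in < frag_out:                  # input consumption cycle
            acc += r * frag_in
            frag_out -= frag_in
            frag_in = 1.0
            src += 1
            continue
        # output production cycle
        acc += r * frag_out
        outputs.append(acc * scale)
        remainder = frag_in - frag_out          # frag_in(next)
        if merge and remainder < inv_scale:
            # merged cycle: use up the rest of the pixel right away
            acc = remainder * r
            frag_out = inv_scale - remainder
            frag_in = 1.0
            src += 1                            # fetch a new input pixel
        else:                                   # normal output cycle
            frag_in = remainder
            frag_out = inv_scale
            acc = 0.0
    return outputs, cycles


row = [10.0, 20.0, 30.0, 40.0]
print(scale_row(row, 0.5, merge=False))  # ([15.0, 35.0], 6)
print(scale_row(row, 0.5, merge=True))   # ([15.0, 35.0], 4)
```

For this 2:1 decimation example the outputs are identical and the cycle count drops from six to four, one cycle saved per output pixel. The merged branch performs the same arithmetic the separate input cycle would have performed, which is why the results match.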
