A POWER EFFICIENT ARCHITECTURE
FOR 2-D DISCRETE WAVELET
TRANSFORM

                 Rahul Jain, CoWare India
            Preeti Ranjan Panda, IIT-Delhi
Agenda
    Memory Power Optimization
    Existing Z-Scan based Schemes
    Low-Power Z-Scan (Proposed Architecture)
    Results
    Conclusion




10 August 2006, 10th IEEE VLSI Design And Test Symposium, 2006
Memory Power Optimization
   Importance of Optimizing Memory System Energy
        Many emerging applications like JPEG2000 are data
        intensive
        The memory system can contribute up to 90% of the energy
   Concurrently Optimizing Memory Architecture and
   Accesses
        Algorithm Level
             Reduce memory requirement
             Improve regularity of accesses
        Build optimized memory architecture
             Memory Partitioning
             Custom Circuits



Z-Scan based Schemes [Chiu-SIPS’03]

   Suspending a DWT line computation
      Store 4 intermediate values
   Z-Scan
      Column Processing starts early
      On-Chip Buffer Required = 4*M
      M = Image Tile height
   Optimal Z-Scan
      EBCOT Code-Block size (CW*CH) considered
      On-Chip Buffer Required = 4*M+4*2*CW
      Usually CW=CH=64 (values used in experiments)
   [Figure: scan-order diagram; dimension labels 2*CW and 2*CH]
Low-Power Z-Scan (1)
    Generalize the Z-Scan
    Compute r elements in a row
    For Z-Scan, r = 2
    For Optimal Z-Scan, r = 2*CW
    On-Chip Buffer Required = 4*M+4*r
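
The buffer-size expressions above can be compared side by side. The following is a minimal sketch, not part of the original slides: the function name `on_chip_buffer` is hypothetical, and sizes are counted in stored intermediate values.

```python
def on_chip_buffer(m, r):
    """Buffer size for the generalized Z-Scan: 4*M + 4*r (formula from the slide).

    m: image tile height M
    r: number of elements computed in a row before the row is suspended
    """
    return 4 * m + 4 * r


if __name__ == "__main__":
    M, CW = 512, 64  # CW = 64 is the value the slides use in experiments
    for label, r in (("Z-Scan", 2), ("Optimal Z-Scan", 2 * CW)):
        print(f"{label:15s} r={r:4d}  buffer={on_chip_buffer(M, r)}")
```

With r = 2*CW the expression reduces to the 4*M+4*2*CW quoted for Optimal Z-Scan. With r = 2 it gives 4*M + 8, slightly more than the 4*M quoted for plain Z-Scan, because the generalized form also counts the 4*r term.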
    [Figure: generalized scan diagram; labels r, r and 2*CH]
Low-Power Z-Scan (2)
    r is chosen as an integral submultiple (a divisor) of 2*CW
         This considers the Code Block Size
    2 separate buffers used
         Row Buffer (RB) = 4*M
         Column Buffer (CB) = 4*r
    How to decide the value of r ?
         Size of CB ∝ r
         RB Sleep Time ∝ r
    [Figure labels: RB in Low Power Mode; RB access; CB: r locations]
Memory Power Analysis (1)
    Assume that each element is computed in unit time
    (so Energy and Power can be used interchangeably)
    For a memory of size 2^n, let
       Pa(2^n) : memory access power
       Ps(2^n) : sleep mode / data retention mode power
       Pw(2^n) : wakeup power for each state transition from
       sleep mode to active mode
    Let Ps(2^n) = s * Pa(2^n) and Pw(2^n) = w * Pa(2^n)
       s = 0.1, w = 0.33 (Assumed for Experiments)
    Buffer Accesses
         Read at Resumption
         Write at Suspension

Memory Power Analysis (2)
    Row Buffer Power
         2 accesses per r elements
         RB in sleep mode for r-2 element computations
         Wakeup RB once per row
         Power per ‘r’ element computation:
         Prow_buffer (r, M) = 2* Pa(M) + (r-2) * Ps(M) + Pw(M)


    [Figure labels: RB in Low Power Mode; Wakeup; Row Computation Resumes; Row Computation Suspends]


Memory Power Analysis (3)
    Column Buffer Power
         1 access per element
         Power consumption per element computation:
         Pcol_buffer (r) = Pa(r)
    [Figure labels: Col Computation Resumes; Col Computation Suspends]

    Power per 2-D DWT Element Computation:
          Prow_buffer (r, M)/r + Pcol_buffer (r)
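
The unbanked power model can be written out directly from the formulas on these slides. This is a sketch under stated assumptions: the slides take Pa from an empirical SRAM energy model whose equations are not reproduced here, so `placeholder_pa` below is a hypothetical stand-in, and all function names are illustrative. Only the structure of the expressions follows the slides; the numbers it produces are not the slides' results.

```python
import math

S = 0.1   # Ps = s * Pa, value assumed in the slides
W = 0.33  # Pw = w * Pa, value assumed in the slides


def placeholder_pa(size):
    """HYPOTHETICAL access-power model; NOT the model used in the slides."""
    return 1e-12 * math.sqrt(size)


def row_buffer_power(r, m, pa=placeholder_pa):
    """Prow_buffer(r, M) = 2*Pa(M) + (r-2)*Ps(M) + Pw(M), per r elements."""
    return 2 * pa(m) + (r - 2) * S * pa(m) + W * pa(m)


def col_buffer_power(r, pa=placeholder_pa):
    """Pcol_buffer(r) = Pa(r): one column-buffer access per element."""
    return pa(r)


def power_per_element(r, m, pa=placeholder_pa):
    """Prow_buffer(r, M)/r + Pcol_buffer(r)."""
    return row_buffer_power(r, m, pa) / r + col_buffer_power(r, pa)


if __name__ == "__main__":
    # Sweep r for one tile height to see the trade-off between the two terms.
    for r in (2, 4, 8, 16, 32, 64, 128):
        print(f"r={r:4d}  power/element={power_per_element(r, 512):.3e}")
```

Passing a different `pa` callable lets the same expressions be evaluated against whatever memory model is available. The row-buffer term is amortized over r elements and the column-buffer term depends on r through Pa(r), so the balance between the two sets the preferred r.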


Variation of Power with r
    [Figure: Energy (J), axis from 0.00E+00 to 6.00E-10, versus value of r (2, 4, 8, 16, 32, 64, 128); one curve each for M = 32, 64, 128, 256, 512; points annotated r=16 and r=32]
Power Implications of Banking (1)
    Banked Buffer
         Increases the average idleness of each bank
         Lower Access Power
         Predictable state changes, no timing overheads
    Let there be ‘b’ RB banks and ‘c’ CB banks
    Average RB power per element:
    Prow = [Power of bank in use*M/b + Sleep Power*(M-M/b)] / M
          = [{Prow_buffer (r, M/b) / r} * M/b + Ps (M/b) * (M-M/b)] / M
    Each bank is woken up once per M*r elements
         Additional Row Buffer Wakeups per Element = b/(M*r)


Power Implications of Banking (2)
    Average column-buffer power per element:
    Pcol = [{Pcol_buffer (r/c)} * r/c + Ps (r/c) * (r-r/c)] / r
    No. of Column Buffer Wakeups per Element = c/r
    Additional Wakeup Power :
    Pwakeups = [Pw(M/b) * b/(M*r)] + [Pw(r/c) * c/r]
    MUX power (Pmux) is also considered
    Total Power per Element :
    Prow + Pcol + Pwakeups + Pmux
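
The banked expressions on the two slides above can be combined into one function and searched over r, b and c. This is a sketch under stated assumptions: `placeholder_pa` is a hypothetical stand-in for the memory energy model, Pmux is not quantified on the slides and is therefore a parameter defaulting to zero, and the candidate ranges in `best_config` are illustrative. With the placeholder model the search is not expected to reproduce the slides' reported optimum.

```python
import math

S = 0.1   # Ps = s * Pa, value assumed in the slides
W = 0.33  # Pw = w * Pa, value assumed in the slides


def placeholder_pa(size):
    """HYPOTHETICAL access-power model; NOT the model used in the slides."""
    return 1e-12 * math.sqrt(size)


def banked_power(r, m, b, c, pa=placeholder_pa, p_mux=0.0):
    """Total power per element with b row-buffer banks and c column-buffer banks."""
    rb_bank, cb_bank = m / b, r / c
    # Bank in use: Prow_buffer(r, M/b) / r
    row_in_use = (2 * pa(rb_bank) + (r - 2) * S * pa(rb_bank) + W * pa(rb_bank)) / r
    # Prow = [in-use power * M/b + Ps(M/b) * (M - M/b)] / M
    p_row = (row_in_use * rb_bank + S * pa(rb_bank) * (m - rb_bank)) / m
    # Pcol = [Pcol_buffer(r/c) * r/c + Ps(r/c) * (r - r/c)] / r
    p_col = (pa(cb_bank) * cb_bank + S * pa(cb_bank) * (r - cb_bank)) / r
    # Pwakeups = Pw(M/b) * b/(M*r) + Pw(r/c) * c/r
    p_wakeups = W * pa(rb_bank) * b / (m * r) + W * pa(cb_bank) * c / r
    return p_row + p_col + p_wakeups + p_mux


def best_config(m, pa=placeholder_pa):
    """Exhaustive search over power-of-two r, b, c; returns (power, r, b, c)."""
    candidates = [
        (banked_power(r, m, b, c, pa), r, b, c)
        for r in (2, 4, 8, 16, 32, 64, 128)
        for b in (1, 2, 4, 8)
        for c in (1, 2, 4, 8)
        if c <= r and b <= m
    ]
    return min(candidates)


if __name__ == "__main__":
    print(best_config(512))
```

An exhaustive search is practical here because the design space is a few dozen power-of-two combinations.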




r vs Power (Banked Case, M=512)




    [Figure: power versus r for the banked case; annotation: Min Power with r=64, c=4, b=8]

Energy Consumption Comparison
   M     Z-scan       Optimal Z-scan   Low-Power Z-scan    r    c    b    % imp. over
         (10^-11 J)   (10^-11 J)       (10^-11 J)                         Optimal Z-scan
  32     23.4         29.1             8.08               32    4    4    72.2
  64     25.5         29.3             8.13               64    4    4    72.3
 128     29.9         29.7             8.18               64    4    8    72.5
 256     38.5         30.6             8.29               64    4    8    72.9
 512     55.8         32.3             8.49               64    4    8    73.7
1024     90.3         35.8             8.89               64    4    8    75.2

Up to 90% and 75% improvement over Z-Scan and Optimal
Z-Scan respectively
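
The headline percentages can be checked against the table with a few lines of arithmetic. The sketch below copies the energy values from the table; the helper name `improvement` is illustrative.

```python
# Energy per element in units of 1e-11 J, copied from the table:
# M -> (Z-scan, Optimal Z-scan, Low-Power Z-scan)
ENERGY = {
    32: (23.4, 29.1, 8.08),
    64: (25.5, 29.3, 8.13),
    128: (29.9, 29.7, 8.18),
    256: (38.5, 30.6, 8.29),
    512: (55.8, 32.3, 8.49),
    1024: (90.3, 35.8, 8.89),
}


def improvement(baseline, proposed):
    """Percentage energy reduction of `proposed` relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


if __name__ == "__main__":
    for m, (z_scan, optimal, low_power) in ENERGY.items():
        print(f"M={m:5d}  vs Z-scan: {improvement(z_scan, low_power):5.1f}%  "
              f"vs Optimal Z-scan: {improvement(optimal, low_power):5.1f}%")
```

The largest gains occur at M=1024, which is where the "up to 90%" and "75%" figures come from.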

Energy Modelling
  Sequential Access Memory [Moon-CICC’02]
    Configured as a circular buffer
    Address Sequencing logic and decoders replaced with
    row sequencer to get low power and high speed
    Banked implementation used for big memory
  Energy Modelling [Coumeri-TVLSI’00]
    Empirical Equations for modelling energy of on-chip
    SRAM memory
    Model parameters are Size, Bit Width, Access Mode
    Individual equations for different memory components
    To model SAM, the Row Decoder, Column Decoder and Buffers
    are not considered

Conclusion
    A methodology to arrive at a Low-Power
    DWT architecture proposed
    Co-Optimization of Memory Architecture
    and Access pattern done
    Up to 90% energy saving achieved
    The derived architecture depends on the
    target memory technology
         Would lead to different architectures for ASIC
         and FPGA implementations


References:
   [Chiu-SIPS’03]: Mu-Yu Chiu et al. (2003). Optimal data
   transfer and buffering schemes for JPEG2000 encoder.
   IEEE Workshop on SIPS, Aug. 2003, pp. 177-182
   [Moon-CICC’02]: Joong-Seok Moon et al. (2002). Low-power
   sequential access memory design. Custom
   Integrated Circuits Conference, 2002, pp. 111-114
   [Coumeri-TVLSI’00]: Coumeri, S.L. et al. (2000).
   Memory modelling for System Synthesis. IEEE Trans.
   VLSI Systems, June 2000, pp. 327-334




Thank You


                 Questions!




Backup Slides
Discrete Wavelet Transform
        2D wavelet transform:
             1st:1D wavelet transform to all rows
             2nd:1D wavelet transform to all columns
        Each Row/Column can be computed independently
        Store 4 values at line computation suspension
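
The row-then-column structure can be sketched with a lifting-based 1-D transform. This is an illustration under stated assumptions, not the architecture from the slides: it uses the commonly quoted lifting coefficients of the JPEG2000 irreversible 9/7 filter (to be checked against the standard before reuse), omits the final scaling step, handles borders by symmetric extension, and requires even lengths. All function names are illustrative.

```python
# Commonly quoted 9/7 lifting coefficients (assumption: verify against the standard)
A = -1.586134342
B = -0.052980118
C = 0.882911075
D = 0.443506852


def lift_97(x):
    """One level of 9/7 lifting on an even-length sequence (no final scaling).

    Returns (low, high): the even-indexed and odd-indexed outputs.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0
    even, odd = list(x[0::2]), list(x[1::2])
    h = n // 2
    for coeff_odd, coeff_even in ((A, B), (C, D)):
        # Update odd samples from their two even neighbours
        for i in range(h):
            right = even[i + 1] if i + 1 < h else even[i]  # symmetric extension
            odd[i] += coeff_odd * (even[i] + right)
        # Update even samples from their two odd neighbours
        for i in range(h):
            left = odd[i - 1] if i > 0 else odd[i]  # symmetric extension
            even[i] += coeff_even * (left + odd[i])
    return even, odd


def dwt2d_one_level(img):
    """Apply the 1-D transform to all rows, then to all columns."""
    rows = []
    for row in img:
        low, high = lift_97(row)
        rows.append(low + high)
    out_cols = []
    for col in zip(*rows):
        low, high = lift_97(col)
        out_cols.append(low + high)
    return [list(r) for r in zip(*out_cols)]


if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]
    for row in dwt2d_one_level(flat):
        print(["%.4f" % v for v in row])
```

Each output sample passes through four lifting steps, one per constant. A line-based architecture that suspends a row or column mid-computation has to carry the in-flight state of those steps, which is consistent with the four stored values mentioned above.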
    [Figure: lifting data flow from inputs X(i) to Y(2i+1) and Y(2i), then to Z(2i+1) and Z(2i); colored arrows show multiplication by constants a, b, c, d defined in the JPEG2000 standard]
Buffer Structure
    The buffers are always full
    They are accessed like a circular FIFO
    General Memory Row Decoder not required
         use a counter
         use a shift register loaded with a 1 initially
    Every Write Signal
         Increments the counter
         Shifts the Register
    Store all the 4 intermediate values in one Column
         No need for the Column Decoder
    This would be similar to Sequential Access Memory
    (SAM) [Moon-CICC’02]
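
The access behaviour described above can be modelled in software as a ring of words selected by a one-hot register. This is a behavioural sketch only, not a circuit description: the class name `SequentialAccessBuffer` and its methods are hypothetical, and the one-hot list stands in for the shift register (a counter modulo the depth would behave the same way).

```python
class SequentialAccessBuffer:
    """Always-full circular buffer; each word holds the 4 intermediate values."""

    def __init__(self, depth, fill=(0, 0, 0, 0)):
        self.words = [tuple(fill) for _ in range(depth)]
        # Shift register loaded with a single 1: selects the current word
        self.select = [1] + [0] * (depth - 1)

    def read(self):
        """Read at resumption: return the currently selected word."""
        return self.words[self.select.index(1)]

    def write(self, values):
        """Write at suspension: store 4 values, then shift the select register."""
        assert len(values) == 4
        self.words[self.select.index(1)] = tuple(values)
        self.select = [self.select[-1]] + self.select[:-1]  # rotate by one


if __name__ == "__main__":
    buf = SequentialAccessBuffer(depth=3)
    for k in range(3):
        buf.write((k, k, k, k))
    print(buf.read())  # selection has wrapped around to the first word
```

Because the selection advances only on a write and wraps around, no address needs to be decoded, which is the point of replacing the row decoder with a sequencer.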

10 August 2006          10th IEEE VLSI Design And Test    21
                               Symposium, 2006

More Related Content

What's hot

Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)Jonathan Skelton
 
Ch4 lecture slides Chenming Hu Device for IC
Ch4 lecture slides Chenming Hu Device for ICCh4 lecture slides Chenming Hu Device for IC
Ch4 lecture slides Chenming Hu Device for ICChenming Hu
 
Ultrasound Modular Architecture
Ultrasound Modular ArchitectureUltrasound Modular Architecture
Ultrasound Modular ArchitectureJose Miguel Moreno
 
Ch5 lecture slides Chenming Hu Device for IC
Ch5 lecture slides Chenming Hu Device for ICCh5 lecture slides Chenming Hu Device for IC
Ch5 lecture slides Chenming Hu Device for ICChenming Hu
 
Orthogonal Faster than Nyquist Transmission for SIMO Wireless Systems
Orthogonal Faster than Nyquist Transmission for SIMO Wireless SystemsOrthogonal Faster than Nyquist Transmission for SIMO Wireless Systems
Orthogonal Faster than Nyquist Transmission for SIMO Wireless SystemsT. E. BOGALE
 
Ch3 lecture slides Chenming Hu Device for IC
Ch3 lecture slides Chenming Hu Device for ICCh3 lecture slides Chenming Hu Device for IC
Ch3 lecture slides Chenming Hu Device for ICChenming Hu
 
Ch7 lecture slides Chenming Hu Device for IC
Ch7 lecture slides Chenming Hu Device for ICCh7 lecture slides Chenming Hu Device for IC
Ch7 lecture slides Chenming Hu Device for ICChenming Hu
 
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...Takahiro Katagiri
 
Battle field3 ssao
Battle field3 ssaoBattle field3 ssao
Battle field3 ssaoMinGeun Park
 
Vortex Dissipation Due to Airfoil-Vortex Interaction
Vortex Dissipation Due to Airfoil-Vortex InteractionVortex Dissipation Due to Airfoil-Vortex Interaction
Vortex Dissipation Due to Airfoil-Vortex InteractionMasahiro Kanazaki
 
Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...
Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...
Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...Masahiro Kanazaki
 
Multiband Transceivers - [Chapter 1]
Multiband Transceivers - [Chapter 1] Multiband Transceivers - [Chapter 1]
Multiband Transceivers - [Chapter 1] Simen Li
 
An Enhanced Inherited Crossover GA for the Reliability Constrained UC Problem
An Enhanced Inherited Crossover GA for the Reliability Constrained UC ProblemAn Enhanced Inherited Crossover GA for the Reliability Constrained UC Problem
An Enhanced Inherited Crossover GA for the Reliability Constrained UC ProblemIDES Editor
 
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering WorkflowTakahiro Harada
 
Lect2 up270 (100328)
Lect2 up270 (100328)Lect2 up270 (100328)
Lect2 up270 (100328)aicdesign
 

What's hot (19)

Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)
 
Ch4 lecture slides Chenming Hu Device for IC
Ch4 lecture slides Chenming Hu Device for ICCh4 lecture slides Chenming Hu Device for IC
Ch4 lecture slides Chenming Hu Device for IC
 
Ultrasound Modular Architecture
Ultrasound Modular ArchitectureUltrasound Modular Architecture
Ultrasound Modular Architecture
 
Ch5 lecture slides Chenming Hu Device for IC
Ch5 lecture slides Chenming Hu Device for ICCh5 lecture slides Chenming Hu Device for IC
Ch5 lecture slides Chenming Hu Device for IC
 
Orthogonal Faster than Nyquist Transmission for SIMO Wireless Systems
Orthogonal Faster than Nyquist Transmission for SIMO Wireless SystemsOrthogonal Faster than Nyquist Transmission for SIMO Wireless Systems
Orthogonal Faster than Nyquist Transmission for SIMO Wireless Systems
 
Ch3 lecture slides Chenming Hu Device for IC
Ch3 lecture slides Chenming Hu Device for ICCh3 lecture slides Chenming Hu Device for IC
Ch3 lecture slides Chenming Hu Device for IC
 
Ch7 lecture slides Chenming Hu Device for IC
Ch7 lecture slides Chenming Hu Device for ICCh7 lecture slides Chenming Hu Device for IC
Ch7 lecture slides Chenming Hu Device for IC
 
SAR_ADC__Resumo
SAR_ADC__ResumoSAR_ADC__Resumo
SAR_ADC__Resumo
 
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
 
Battle field3 ssao
Battle field3 ssaoBattle field3 ssao
Battle field3 ssao
 
Vortex Dissipation Due to Airfoil-Vortex Interaction
Vortex Dissipation Due to Airfoil-Vortex InteractionVortex Dissipation Due to Airfoil-Vortex Interaction
Vortex Dissipation Due to Airfoil-Vortex Interaction
 
Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...
Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...
Numerical Simulation: Flight Dynamic Stability Analysis Using Unstructured Ba...
 
Multiband Transceivers - [Chapter 1]
Multiband Transceivers - [Chapter 1] Multiband Transceivers - [Chapter 1]
Multiband Transceivers - [Chapter 1]
 
An Enhanced Inherited Crossover GA for the Reliability Constrained UC Problem
An Enhanced Inherited Crossover GA for the Reliability Constrained UC ProblemAn Enhanced Inherited Crossover GA for the Reliability Constrained UC Problem
An Enhanced Inherited Crossover GA for the Reliability Constrained UC Problem
 
US7522774
US7522774US7522774
US7522774
 
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
 
1147 smith[1]
1147 smith[1]1147 smith[1]
1147 smith[1]
 
Lect2 up270 (100328)
Lect2 up270 (100328)Lect2 up270 (100328)
Lect2 up270 (100328)
 
D1150740001
D1150740001D1150740001
D1150740001
 

Viewers also liked

Low Power Architecture for JPEG2000
Low Power Architecture for JPEG2000Low Power Architecture for JPEG2000
Low Power Architecture for JPEG2000Rahul Jain
 
Passive Low Energy Architecture Conference Paper 2009
Passive Low Energy Architecture Conference Paper 2009Passive Low Energy Architecture Conference Paper 2009
Passive Low Energy Architecture Conference Paper 2009Farah Naz
 
Satellite Image Resolution Enhancement Technique Using DWT and IWT
Satellite Image Resolution Enhancement Technique Using DWT and IWTSatellite Image Resolution Enhancement Technique Using DWT and IWT
Satellite Image Resolution Enhancement Technique Using DWT and IWTEditor IJCATR
 
Energy Efficient Design Education Through Architectural Design Studio Projects
Energy Efficient Design Education Through Architectural Design Studio ProjectsEnergy Efficient Design Education Through Architectural Design Studio Projects
Energy Efficient Design Education Through Architectural Design Studio ProjectsKhaled Ali
 
Satellite image contrast enhancement using discrete wavelet transform
Satellite image contrast enhancement using discrete wavelet transformSatellite image contrast enhancement using discrete wavelet transform
Satellite image contrast enhancement using discrete wavelet transformHarishwar Reddy
 
Advanced architecture theory and criticism lecture 01
Advanced architecture theory and criticism lecture 01Advanced architecture theory and criticism lecture 01
Advanced architecture theory and criticism lecture 01Khaled Ali
 
Climate Responsive Architecture
Climate Responsive ArchitectureClimate Responsive Architecture
Climate Responsive ArchitectureDeepthi Deepu
 

Viewers also liked (10)

Low Power Architecture for JPEG2000
Low Power Architecture for JPEG2000Low Power Architecture for JPEG2000
Low Power Architecture for JPEG2000
 
Passive Low Energy Architecture Conference Paper 2009
Passive Low Energy Architecture Conference Paper 2009Passive Low Energy Architecture Conference Paper 2009
Passive Low Energy Architecture Conference Paper 2009
 
Low Energy Architecture: An Overview
Low Energy Architecture: An OverviewLow Energy Architecture: An Overview
Low Energy Architecture: An Overview
 
Satellite Image Resolution Enhancement Technique Using DWT and IWT
Satellite Image Resolution Enhancement Technique Using DWT and IWTSatellite Image Resolution Enhancement Technique Using DWT and IWT
Satellite Image Resolution Enhancement Technique Using DWT and IWT
 
Energy Efficient Architecture-Sustainable Habitat
Energy Efficient Architecture-Sustainable HabitatEnergy Efficient Architecture-Sustainable Habitat
Energy Efficient Architecture-Sustainable Habitat
 
Energy Efficient Design Education Through Architectural Design Studio Projects
Energy Efficient Design Education Through Architectural Design Studio ProjectsEnergy Efficient Design Education Through Architectural Design Studio Projects
Energy Efficient Design Education Through Architectural Design Studio Projects
 
Satellite image contrast enhancement using discrete wavelet transform
Satellite image contrast enhancement using discrete wavelet transformSatellite image contrast enhancement using discrete wavelet transform
Satellite image contrast enhancement using discrete wavelet transform
 
Advanced architecture theory and criticism lecture 01
Advanced architecture theory and criticism lecture 01Advanced architecture theory and criticism lecture 01
Advanced architecture theory and criticism lecture 01
 
Energy Efficient and sustainable Buildings
Energy Efficient  and sustainable BuildingsEnergy Efficient  and sustainable Buildings
Energy Efficient and sustainable Buildings
 
Climate Responsive Architecture
Climate Responsive ArchitectureClimate Responsive Architecture
Climate Responsive Architecture
 

Similar to LOW POWER Z-SCAN ARCHITECTURE FOR 2-D DWT

Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...
Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...
Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...IDES Editor
 
Minimum Cost Fault Tolerant Adder Circuits in Reversible Logic Synthesis
Minimum Cost Fault Tolerant Adder Circuits in Reversible Logic SynthesisMinimum Cost Fault Tolerant Adder Circuits in Reversible Logic Synthesis
Minimum Cost Fault Tolerant Adder Circuits in Reversible Logic SynthesisSajib Mitra
 
Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}
Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}
Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}guest3f9c6b
 
Position sensorless vector control of pmsm for electrical household applicances
Position sensorless vector control of pmsm for electrical household applicancesPosition sensorless vector control of pmsm for electrical household applicances
Position sensorless vector control of pmsm for electrical household applicanceswarluck88
 
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...IJERA Editor
 
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...IJERA Editor
 
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open CloudCoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open CloudAta Turk
 
Design of 5.1 GHz ultra-low power and wide tuning range hybrid oscillator
Design of 5.1 GHz ultra-low power and wide tuning range  hybrid oscillatorDesign of 5.1 GHz ultra-low power and wide tuning range  hybrid oscillator
Design of 5.1 GHz ultra-low power and wide tuning range hybrid oscillatorIJECEIAES
 
ALEA:Fine-grain Energy Profiling with Basic Block sampling
ALEA:Fine-grain Energy Profiling with Basic Block samplingALEA:Fine-grain Energy Profiling with Basic Block sampling
ALEA:Fine-grain Energy Profiling with Basic Block samplingLev Mukhanov
 
Low Power Clock Distribution Schemes in VLSI Design
Low Power Clock Distribution Schemes in VLSI DesignLow Power Clock Distribution Schemes in VLSI Design
Low Power Clock Distribution Schemes in VLSI DesignIJERA Editor
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...VLSICS Design
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...VLSICS Design
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...VLSICS Design
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...VLSICS Design
 

Similar to LOW POWER Z-SCAN ARCHITECTURE FOR 2-D DWT (20)

Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...
Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...
Optimal Capacitor Placement in a Radial Distribution System using Shuffled Fr...
 
Minimum Cost Fault Tolerant Adder Circuits in Reversible Logic Synthesis
Minimum Cost Fault Tolerant Adder Circuits in Reversible Logic SynthesisMinimum Cost Fault Tolerant Adder Circuits in Reversible Logic Synthesis
Minimum Cost Fault Tolerant Adder Circuits in Reversible Logic Synthesis
 
Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}
Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}
Linear and digital ic applications Jntu Model Paper{Www.Studentyogi.Com}
 
2nd Semester M Tech: VLSI Design and Embedded System (June-2016) Question Papers
2nd Semester M Tech: VLSI Design and Embedded System (June-2016) Question Papers2nd Semester M Tech: VLSI Design and Embedded System (June-2016) Question Papers
2nd Semester M Tech: VLSI Design and Embedded System (June-2016) Question Papers
 
Position sensorless vector control of pmsm for electrical household applicances
Position sensorless vector control of pmsm for electrical household applicancesPosition sensorless vector control of pmsm for electrical household applicances
Position sensorless vector control of pmsm for electrical household applicances
 
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
 
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
A Low Phase Noise CMOS Quadrature Voltage Control Oscillator Using Clock Gate...
 
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open CloudCoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
 
Design of 5.1 GHz ultra-low power and wide tuning range hybrid oscillator
Design of 5.1 GHz ultra-low power and wide tuning range  hybrid oscillatorDesign of 5.1 GHz ultra-low power and wide tuning range  hybrid oscillator
Design of 5.1 GHz ultra-low power and wide tuning range hybrid oscillator
 
Ijaerv10n9spl 473
Ijaerv10n9spl 473Ijaerv10n9spl 473
Ijaerv10n9spl 473
 
ijaerv10n9spl_473
ijaerv10n9spl_473ijaerv10n9spl_473
ijaerv10n9spl_473
 
M Tech New Syllabus(2012)
M Tech New Syllabus(2012)M Tech New Syllabus(2012)
M Tech New Syllabus(2012)
 
ALEA:Fine-grain Energy Profiling with Basic Block sampling
ALEA:Fine-grain Energy Profiling with Basic Block samplingALEA:Fine-grain Energy Profiling with Basic Block sampling
ALEA:Fine-grain Energy Profiling with Basic Block sampling
 
6th Semeste Electronics and Communication Engineering (June-2016) Question Pa...
6th Semeste Electronics and Communication Engineering (June-2016) Question Pa...6th Semeste Electronics and Communication Engineering (June-2016) Question Pa...
6th Semeste Electronics and Communication Engineering (June-2016) Question Pa...
 
Low Power Clock Distribution Schemes in VLSI Design
Low Power Clock Distribution Schemes in VLSI DesignLow Power Clock Distribution Schemes in VLSI Design
Low Power Clock Distribution Schemes in VLSI Design
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
 
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
SIMULTANEOUS OPTIMIZATION OF STANDBY AND ACTIVE ENERGY FOR SUB-THRESHOLD CIRC...
 
7th Semeste Electronics and Communication Engineering (June-2016) Question Pa...
7th Semeste Electronics and Communication Engineering (June-2016) Question Pa...7th Semeste Electronics and Communication Engineering (June-2016) Question Pa...
7th Semeste Electronics and Communication Engineering (June-2016) Question Pa...
 


LOW POWER Z-SCAN ARCHITECTURE FOR 2-D DWT

  • 1. A POWER EFFICIENT ARCHITECTURE FOR 2-D DISCRETE WAVELET TRANSFORM
      Rahul Jain, CoWare India
      Preeti Ranjan Panda, IIT-Delhi
  • 2. Agenda
      Memory Power Optimization
      Existing Z-Scan based Schemes
      Low Power Z-Scan (Proposed Architecture)
      Results
      Conclusion
      10 August 2006, 10th IEEE VLSI Design And Test Symposium, 2006
  • 3. Memory Power Optimization
      Importance of Optimizing Memory System Energy
          Many emerging applications like JPEG2000 are data intensive
          Memory system can contribute up to 90% energy
      Concurrently Optimizing Memory Architecture and Accesses
          Algorithm Level
              Reduce memory requirement
              Improve regularity of accesses
          Build optimized memory architecture
              Memory Partitioning
              Custom Circuits
  • 4. Z-Scan based Schemes [Chiu-SIPS’03]
      Suspending a DWT line computation
          Store 4 intermediate values
      Z-Scan
          Column Processing starts early
          On-Chip Buffer Required = 4*M (M = image tile height)
      Optimal Z-Scan
          EBCOT Code-Block size (CW*CH) considered
          On-Chip Buffer Required = 4*M + 4*2*CW
          Usually CW = CH = 64 (values used in experiments)
      [Figure: scan-order diagrams with region dimensions 2*CW and 2*CH]
  • 5. Low-Power Z-Scan (1)
      Generalize the Z-Scan
      Compute r elements in a row
      For Z-Scan, r = 2
      For Optimal Z-Scan, r = 2*CW
      On-Chip Buffer Required = 4*M + 4*r
      [Figure: scan-order diagram with row segments of width r and height 2*CH]
  • 6. Low-Power Z-Scan (2)
      r is an integral sub-multiple of 2*CW
          This takes the Code-Block size into account
      2 separate buffers used
          Row Buffer (RB) = 4*M
          Column Buffer (CB) = 4*r
      How to decide the value of r?
          Size of CB ∝ r
          RB sleep time ∝ r
      [Figure: RB stays in low-power mode between RB accesses; CB holds r locations]
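The buffer sizing on slides 5 and 6 can be sketched in a few lines of Python. This is only an illustration of the formulas as stated (RB = 4*M, CB = 4*r); the function names are hypothetical, and reading "sub-multiple of 2*CW" as "divisor of 2*CW" is an assumption.

```python
def buffer_sizes(M, r):
    """Return (row_buffer, column_buffer, total) in stored values.

    Uses the slide formulas RB = 4*M and CB = 4*r.
    """
    row_buffer = 4 * M
    column_buffer = 4 * r
    return row_buffer, column_buffer, row_buffer + column_buffer


def candidate_r(CW):
    """List the values of r that divide 2*CW evenly (assumed reading of
    'sub-multiple of 2*CW'), starting from the plain Z-Scan value r = 2."""
    limit = 2 * CW
    return [r for r in range(2, limit + 1) if limit % r == 0]


if __name__ == "__main__":
    # CW = 64 is the code-block width quoted in the slides.
    for r in candidate_r(64):
        print(r, buffer_sizes(512, r))
```

With CW = 64 the candidates are the powers of two from 2 to 128, which matches the r axis used in the later power plots.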
  • 7. Memory Power Analysis (1)
      Let us assume that each element is computed in unit time
      (Energy and Power can be used interchangeably)
      For a memory of size 2^n, let
          Pa(2^n): memory access power
          Ps(2^n): sleep mode / data retention mode power
          Pw(2^n): wakeup power for each state transition from sleep mode to active mode
      Let Ps(2^n) = s * Pa(2^n) and Pw(2^n) = w * Pa(2^n)
          s = 0.1, w = 0.33 (assumed for experiments)
      Buffer Accesses
          Read at Resumption
          Write at Suspension
  • 8. Memory Power Analysis (2)
      Row Buffer Power
          2 accesses per r elements
          RB in sleep mode for r-2 element computations
          Wakeup RB once per row
      Power per r-element computation:
          Prow_buffer(r, M) = 2*Pa(M) + (r-2)*Ps(M) + Pw(M)
      [Figure: timeline showing RB in low-power mode from the point where row computation suspends until the wakeup before row computation resumes]
  • 9. Memory Power Analysis (3)
      Column Buffer Power
          1 access per element
      Power consumption per element computation:
          Pcol_buffer(r) = Pa(r)
      Power per 2-D DWT element computation:
          Prow_buffer(r, M)/r + Pcol_buffer(r)
      [Figure: timeline marking where column computation resumes and suspends]
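The unbanked model of slides 7 to 9 translates directly into code. The sketch below follows the slide formulas term by term, but the access-power function `placeholder_pa` is purely hypothetical: the slides take Pa from an empirical SRAM energy model that is not reproduced here, so the numbers this prints are illustrative only.

```python
def row_buffer_power(r, M, pa, s=0.1, w=0.33):
    """Prow_buffer(r, M) = 2*Pa(M) + (r-2)*Ps(M) + Pw(M),
    with Ps = s*Pa and Pw = w*Pa as assumed in the slides."""
    access = pa(M)
    return 2 * access + (r - 2) * (s * access) + w * access


def power_per_element(r, M, pa, s=0.1, w=0.33):
    """Prow_buffer(r, M)/r + Pcol_buffer(r), where Pcol_buffer(r) = Pa(r)."""
    return row_buffer_power(r, M, pa, s, w) / r + pa(r)


def placeholder_pa(size):
    # HYPOTHETICAL access-power curve (grows with the square root of size).
    # It stands in for the empirical memory model used in the talk.
    return size ** 0.5


if __name__ == "__main__":
    candidates = (2, 4, 8, 16, 32, 64, 128)
    for M in (32, 64, 128, 256, 512):
        best = min(candidates, key=lambda r: power_per_element(r, M, placeholder_pa))
        print(f"M={M}: lowest modelled power at r={best}")
```

The trade-off is visible in the two terms: a larger r amortises the row-buffer accesses and wakeup over more elements, while the column-buffer access cost Pa(r) grows with r.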
  • 10. Variation of Power with r
      [Figure: Energy (J), axis from 0.00E+00 to 6.00E-10, versus value of r (2, 4, 8, 16, 32, 64, 128), one curve each for M = 32, 64, 128, 256 and 512; annotations mark r=16 and r=32]
  • 11. Power Implications of Banking (1)
      Banked Buffer
          Increases the average idleness of each buffer
          Lower Access Power
          Predictable state changes, no timing overheads
      Let there be b RB banks and c CB banks
      Average RB power per element:
          Prow = [Power of bank in use * M/b + Sleep Power * (M - M/b)] / M
               = [{Prow_buffer(r, M/b) / r} * M/b + Ps(M/b) * (M - M/b)] / M
      Each bank is woken up once per M*r elements
      Additional Row Buffer Wakeups per Element = b/(M*r)
  • 12. Power Implications of Banking (2)
      Average column-buffer power per element:
          Pcol = [Pcol_buffer(r/c) * r/c + Ps(r/c) * (r - r/c)] / r
      Number of Column Buffer Wakeups per Element = c/r
      Additional Wakeup Power:
          Pwakeups = [Pw(M/b) * b/(M*r)] + [Pw(r/c) * c/r]
      MUX power considered
      Total Power per Element: Prow + Pcol + Pwakeups + Pmux
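The banked expressions on slides 11 and 12 can be combined into one function. This is a minimal sketch under stated assumptions: `pa` is a caller-supplied access-power function (hypothetical here, since the talk's empirical memory model is not given), `p_mux` is left as a plain parameter because the slides do not state how MUX power is computed, and the wakeup counts are read as b/(M*r) and c/r.

```python
def banked_power_per_element(r, M, b, c, pa, s=0.1, w=0.33, p_mux=0.0):
    """Total power per element: Prow + Pcol + Pwakeups + Pmux.

    b = number of row-buffer banks, c = number of column-buffer banks.
    Ps = s*Pa and Pw = w*Pa, as assumed in the slides.
    """
    def ps(size):
        return s * pa(size)

    def pw(size):
        return w * pa(size)

    # Prow_buffer(r, M/b) = 2*Pa + (r-2)*Ps + Pw for the bank in use.
    prow_buffer = 2 * pa(M / b) + (r - 2) * ps(M / b) + pw(M / b)
    # Active bank for M/b of the time, remaining banks asleep.
    p_row = ((prow_buffer / r) * (M / b) + ps(M / b) * (M - M / b)) / M
    # Pcol_buffer(r/c) = Pa(r/c) for the bank in use.
    p_col = (pa(r / c) * (r / c) + ps(r / c) * (r - r / c)) / r
    # Extra wakeups: b per M*r elements for RB banks, c per r elements for CB banks.
    p_wakeups = pw(M / b) * b / (M * r) + pw(r / c) * c / r
    return p_row + p_col + p_wakeups + p_mux


if __name__ == "__main__":
    # HYPOTHETICAL square-root access-power curve, for illustration only.
    sqrt_pa = lambda size: size ** 0.5
    print(banked_power_per_element(r=64, M=512, b=8, c=4, pa=sqrt_pa))
```

Sweeping r, b and c over small power-of-two ranges with a real memory model is how one would look for a minimum such as the r=64, c=4, b=8 point reported for M=512; the placeholder curve above is not expected to reproduce it.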
  • 13. r vs Power (Banked Case, M=512)
      [Figure: power versus r for the banked case, M = 512]
      Min Power with r=64, c=4, b=8
  • 14. Energy Consumption Comparison

         M    Z-scan       Optimal Z-scan   Low-Power Z-scan    r    c    b    % imp
              (10^-11 J)   (10^-11 J)       (10^-11 J)
        32    23.4         29.1             8.08               32    4    4    72.2
        64    25.5         29.3             8.13               64    4    4    72.3
       128    29.9         29.7             8.18               64    4    8    72.5
       256    38.5         30.6             8.29               64    4    8    72.9
       512    55.8         32.3             8.49               64    4    8    73.7
      1024    90.3         35.8             8.89               64    4    8    75.2

      Up to 90% and 75% improvement over Z-Scan and Optimal Z-Scan respectively
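As a quick arithmetic check of the table, the relative saving can be recomputed from the three energy columns. The sketch below only uses the values printed on the slide; the helper name `improvement` is hypothetical. Recomputing shows that the "% imp" column matches the saving relative to Optimal Z-scan, while the "up to 90%" figure corresponds to plain Z-scan at M = 1024.

```python
# (M, Z-scan, Optimal Z-scan, Low-Power Z-scan), energies in units of 1e-11 J,
# copied from the comparison table.
ROWS = [
    (32, 23.4, 29.1, 8.08),
    (64, 25.5, 29.3, 8.13),
    (128, 29.9, 29.7, 8.18),
    (256, 38.5, 30.6, 8.29),
    (512, 55.8, 32.3, 8.49),
    (1024, 90.3, 35.8, 8.89),
]


def improvement(baseline, proposed):
    """Percentage energy saving of `proposed` relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


if __name__ == "__main__":
    for M, z_scan, optimal, low_power in ROWS:
        print(
            f"M={M:5d}  vs Z-scan: {improvement(z_scan, low_power):5.1f}%  "
            f"vs Optimal Z-scan: {improvement(optimal, low_power):5.1f}%"
        )
```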
  • 15. Energy Modelling
      Sequential Access Memory [Moon-CICC’02]
          Configured as a circular buffer
          Address sequencing logic and decoders replaced with a row sequencer to get low power and high speed
          Banked implementation used for big memory
      Energy Modelling [Coumeri-TVLSI’00]
          Empirical equations for modelling energy of on-chip SRAM memory
          Model parameters are Size, Bit Width, Access Mode
          Individual equations for different memory components
          To model SAM, the Row Decoder, Column Decoder and Buffers are not considered
  • 16. Conclusion
      A methodology to arrive at a Low-Power DWT architecture is proposed
      Co-optimization of memory architecture and access pattern is performed
      Up to 90% energy saving achieved
      The derived architecture depends on the target memory technology
          This would lead to different architectures for ASIC and FPGA implementations
  • 17. References:
      [Chiu-SIPS’03]: Mu-Yu Chiu et al. (2003). Optimal data transfer and buffering schemes for JPEG2000 encode. IEEE Workshop on SIPS, Aug. 2003, pp. 177-182
      [Moon-CICC’02]: Joong-Seok Moon et al. (2002). Low-power sequential access memory design. Custom Integrated Circuits Conference, 2002, pp. 111-114
      [Coumeri-TVLSI’00]: Coumeri, S.L. et al. (2000). Memory modelling for System Synthesis. IEEE Trans. VLSI Systems, June 2000, pp. 327-334
  • 18. Thank You
      Questions!
  • 20. Discrete Wavelet Transform
      2D wavelet transform:
          1st: 1D wavelet transform to all rows
          2nd: 1D wavelet transform to all columns
      Each Row/Column can be computed independently
      Store 4 values at line computation suspension
      [Figure: lifting data-flow graph from inputs X(i), i = 0..8, through intermediate values Y(2i+1) and Y(2i) to outputs Z(2i+1) and Z(2i); colored arrows show multiplication by constants a, b, c, d defined in the JPEG2000 standard]
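The two-stage X to Y to Z structure on this slide resembles the lifting form of the 9/7 wavelet filter. The sketch below is one possible reading, not the authors' datapath: it assumes the slide's a, b, c, d are the commonly quoted 9/7 lifting constants, it mirrors samples at the line boundaries, and it omits the final scaling step. The function name `lift_97` is hypothetical.

```python
# Commonly quoted 9/7 lifting constants, ASSUMED to be the slide's a, b, c, d.
A, B, C, D = -1.586134342, -0.052980118, 0.882911075, 0.443506852


def lift_97(x):
    """One level of 1-D lifting on an even-length sequence.

    Returns (low_pass, high_pass), i.e. Z(2i) and Z(2i+1) in the slide's
    notation. The final scaling of the two sub-bands is left out.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length input with at least 2 samples")
    half = n // 2
    even, odd = list(x[0::2]), list(x[1::2])

    def predict(odd_vals, even_vals, k):
        # odd[i] += k * (left even neighbour + right even neighbour),
        # mirroring at the right edge.
        return [odd_vals[i] + k * (even_vals[i] + even_vals[min(i + 1, half - 1)])
                for i in range(half)]

    def update(even_vals, odd_vals, k):
        # even[i] += k * (left odd neighbour + right odd neighbour),
        # mirroring at the left edge.
        return [even_vals[i] + k * (odd_vals[max(i - 1, 0)] + odd_vals[i])
                for i in range(half)]

    y_odd = predict(odd, even, A)       # Y(2i+1)
    y_even = update(even, y_odd, B)     # Y(2i)
    z_odd = predict(y_odd, y_even, C)   # Z(2i+1)
    z_even = update(y_even, z_odd, D)   # Z(2i)
    return z_even, z_odd


if __name__ == "__main__":
    low, high = lift_97([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
    print(low)
    print(high)
```

Each output depends on a handful of neighbouring partial results, which is consistent with the slide's point that only four values need to be stored when a line computation is suspended. A 2-D transform would apply the same routine to every row and then to every column.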
  • 21. Buffer Structure
      The buffers are always full
      They are accessed like a circular FIFO
      General Memory Row Decoder not required
          Use a counter
          Use a shift register loaded with a 1 initially
      Every Write Signal
          Increments the counter
          Shifts the register
      Store all 4 intermediate values in one column
          No need for the Column Decoder
      This would be similar to Sequential Access Memory (SAM) [Moon-CICC’02]
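A behavioural model can make the decoder-free access scheme concrete. The sketch below is a software illustration only, with a hypothetical class name and interface: a one-hot list plays the role of the shift register loaded with a single 1, each entry holds the 4 intermediate values, and every write advances the selected row by one position with wrap-around.

```python
class SequentialAccessBuffer:
    """Circular buffer whose active row is selected by a one-hot shift register."""

    def __init__(self, depth, width=4):
        self.rows = [[0] * width for _ in range(depth)]
        # Shift register loaded with a single 1: row 0 is selected first.
        self.select = [1] + [0] * (depth - 1)

    def _active(self):
        return self.select.index(1)

    def read(self):
        """Read the selected row (done when a line computation resumes)."""
        return list(self.rows[self._active()])

    def write(self, values):
        """Overwrite the selected row (done at suspension), then shift the 1."""
        row = self._active()
        if len(values) != len(self.rows[row]):
            raise ValueError("wrong number of intermediate values")
        self.rows[row] = list(values)
        # Rotate the one-hot pattern by one position, wrapping at the end.
        self.select = [self.select[-1]] + self.select[:-1]


if __name__ == "__main__":
    buf = SequentialAccessBuffer(depth=3)
    for base in (1, 5, 9):
        buf.write([base, base + 1, base + 2, base + 3])
    print(buf.read())  # the pointer has wrapped back to the first row
```

Because reads and writes always hit the currently selected row, no address is ever supplied, which is why neither a row decoder nor a column decoder is needed in this scheme.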