Discrete Wavelet Transform (DWT)
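$$
\begin{bmatrix} L \\ H \end{bmatrix}
=
\begin{bmatrix} G_1^{(o)} & G_1^{(e)} \\ G_0^{(o)} & G_0^{(e)} \end{bmatrix}
\begin{bmatrix} L \\ H \end{bmatrix}
$$

$$
\begin{bmatrix} L \\ H \end{bmatrix}
=
\prod_k
\begin{bmatrix} 1 & U^{(k)} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ P^{(k)} & 1 \end{bmatrix}
\begin{bmatrix} L \\ H \end{bmatrix}
$$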
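The predict/update factorization on this slide can be illustrated with a minimal 1-D sketch. It assumes that L holds the even samples and H the odd samples, uses periodic boundary handling, and plugs in the widely published CDF 9/7 lifting constants; the final scaling step is left out. The function names are hypothetical and not taken from the slides.

```python
# Illustrative sketch: 1-D lifting with periodic boundaries (an assumption).
# STEPS holds the commonly published CDF 9/7 lifting constants as
# (P, U) pairs; the final scaling step is omitted for brevity.
STEPS = [(-1.586134342, -0.05298011854),   # pair k = 0
         (0.8829110762, 0.4435068522)]     # pair k = 1


def predict(L, H, p):
    """Matrix [1 0; P 1]: H[i] += p * (L[i] + L[i+1])."""
    n = len(L)
    return [H[i] + p * (L[i] + L[(i + 1) % n]) for i in range(n)]


def update(L, H, u):
    """Matrix [1 U; 0 1]: L[i] += u * (H[i-1] + H[i])."""
    return [L[i] + u * (H[i - 1] + H[i]) for i in range(len(L))]


def forward(x):
    """Split into even (L) and odd (H) samples, then apply each pair."""
    L, H = list(x[0::2]), list(x[1::2])
    for p, u in STEPS:
        H = predict(L, H, p)
        L = update(L, H, u)
    return L, H


def inverse(L, H):
    """Undo the pairs in reverse order with negated weights."""
    for p, u in reversed(STEPS):
        L = update(L, H, -u)
        H = predict(L, H, -p)
    x = [0.0] * (2 * len(L))
    x[0::2], x[1::2] = L, H
    return x
```

Because every lifting step only adds a function of one channel to the other, the inverse is obtained by running the same steps backwards with the opposite sign, which is why `inverse(*forward(x))` reproduces `x` up to rounding.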
2-D DWT
$$
x = \begin{bmatrix} LL & HL & LH & HH \end{bmatrix}^T,
\qquad
y = \begin{bmatrix} LL & HL & LH & HH \end{bmatrix}^T
$$
Existing Approach: Separable Convolution
$$
y = N^V N^H x
$$
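A minimal sketch of the separable convolution approach: a 1-D filter bank is applied to every row (the horizontal pass) and then to every column (the vertical pass). For brevity it assumes the short CDF 5/3 filter pair rather than the CDF 9/7 wavelet used in the results, periodic boundaries, and even image dimensions; all function names are hypothetical.

```python
# Illustrative sketch: separable 2-D DWT by convolution and downsampling.
# Assumptions: CDF 5/3 filters, periodic boundaries, even image sizes.
LOW = [-0.125, 0.25, 0.75, 0.25, -0.125]   # centred on even samples
HIGH = [-0.5, 1.0, -0.5]                   # centred on odd samples


def analyze_1d(x):
    """Filter and downsample one row; returns low half then high half."""
    n = len(x)
    low = [sum(c * x[(2 * i + k - 2) % n] for k, c in enumerate(LOW))
           for i in range(n // 2)]
    high = [sum(c * x[(2 * i + k) % n] for k, c in enumerate(HIGH))
            for i in range(n // 2)]
    return low + high


def transpose(m):
    return [list(col) for col in zip(*m)]


def horizontal(img):
    """Horizontal pass: transform every row."""
    return [analyze_1d(row) for row in img]


def vertical(img):
    """Vertical pass: transform every column."""
    return transpose(horizontal(transpose(img)))


def dwt2_separable(img):
    """Vertical pass applied after the horizontal pass."""
    return vertical(horizontal(img))
```

In this layout the top-left quadrant of the result holds LL, the top-right HL, the bottom-left LH, and the bottom-right HH. The two passes commute, so the order in which they run does not change the result.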
Existing Approach: Separable Lifting
$$
T^H_P, \quad S^H_U, \quad T^V_P, \quad S^V_U
$$

$$
y = S^V_U \, S^H_U \, T^V_P \, T^H_P \, x
$$
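The four-step composition on this slide (horizontal predict, vertical predict, horizontal update, vertical update) can be sketched on the four polyphase components of the image. The sketch assumes a single predict/update pair with CDF 5/3 weights, periodic boundaries, and even image dimensions; the helper names are hypothetical.

```python
# Illustrative sketch: separable 2-D lifting on the 4-tuple (LL, HL, LH, HH).
# Assumptions: one (P, U) pair with CDF 5/3 weights, periodic boundaries.
P, U = -0.5, 0.25


def pair_sum(m, dr, dc):
    """m[r][c] + m[r+dr][c+dc] with periodic wrap-around."""
    h, w = len(m), len(m[0])
    return [[m[r][c] + m[(r + dr) % h][(c + dc) % w] for c in range(w)]
            for r in range(h)]


def add(dst, k, src):
    """Element-wise dst + k * src."""
    return [[d + k * s for d, s in zip(drow, srow)]
            for drow, srow in zip(dst, src)]


def split(img):
    """Polyphase split: (even,even), (even,odd), (odd,even), (odd,odd)."""
    return ([row[0::2] for row in img[0::2]], [row[1::2] for row in img[0::2]],
            [row[0::2] for row in img[1::2]], [row[1::2] for row in img[1::2]])


def t_h(q, p):   # horizontal predict
    a, b, c, d = q
    return a, add(b, p, pair_sum(a, 0, 1)), c, add(d, p, pair_sum(c, 0, 1))


def t_v(q, p):   # vertical predict
    a, b, c, d = q
    return a, b, add(c, p, pair_sum(a, 1, 0)), add(d, p, pair_sum(b, 1, 0))


def s_h(q, u):   # horizontal update
    a, b, c, d = q
    return add(a, u, pair_sum(b, 0, -1)), b, add(c, u, pair_sum(d, 0, -1)), d


def s_v(q, u):   # vertical update
    a, b, c, d = q
    return add(a, u, pair_sum(c, -1, 0)), add(b, u, pair_sum(d, -1, 0)), c, d


def dwt2_separable_lifting(img):
    """Apply T^H, T^V, S^H, S^V in that order; returns (LL, HL, LH, HH)."""
    q = split(img)
    for step, k in ((t_h, P), (t_v, P), (s_h, U), (s_v, U)):
        q = step(q, k)
    return q
```

The vertical predict and the horizontal update act along different axes and therefore commute, so this interleaved order gives the same result as running a full horizontal transform followed by a full vertical one.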
Our Approach: Non-Separable Lifting
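$$
T^H_{P_0}, \quad T^V_{P_0}, \quad S^H_{U_0}, \quad S^V_{U_0}, \quad T_{P_1}, \quad S_{U_1}
$$

$$
y = S^V_{U_0} \, S^H_{U_0} \, S_{U_1} \, T^V_{P_0} \, T^H_{P_0} \, T_{P_1} \, x
$$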
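The slide does not define the subscripts 0 and 1, so the following sketch rests on an assumption: each 1-D predict filter is split as P = P0 + P1, where P0 is the tap on the co-located sample (no neighbour access) and P1 is the tap on the neighbouring sample. Under that reading, one non-separable step followed by two neighbour-free steps reproduces the separable predict pair. Only the predict half is shown; the weight is the CDF 5/3 value, boundaries are periodic, and all names are hypothetical.

```python
# Illustrative sketch of the predict half only, under the ASSUMPTION that
# the 1-D predict P * (1 + z) is split into P0 = P (co-located tap) and
# P1 = P * z (neighbour tap). Periodic boundaries, CDF 5/3 weight.
P = -0.5


def shift(m, dr, dc):
    """out[r][c] = m[r+dr][c+dc] with periodic wrap-around."""
    h, w = len(m), len(m[0])
    return [[m[(r + dr) % h][(c + dc) % w] for c in range(w)]
            for r in range(h)]


def comb(*terms):
    """Element-wise weighted sum of (weight, matrix) pairs."""
    h, w = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(k * m[r][c] for k, m in terms) for c in range(w)]
            for r in range(h)]


def split(img):
    """Polyphase split into the 4-tuple (a, b, c, d)."""
    return ([row[0::2] for row in img[0::2]], [row[1::2] for row in img[0::2]],
            [row[0::2] for row in img[1::2]], [row[1::2] for row in img[1::2]])


def predict_separable(q):
    """Reference: horizontal predict followed by vertical predict."""
    a, b, c, d = q
    b = comb((1, b), (P, a), (P, shift(a, 0, 1)))   # horizontal
    d = comb((1, d), (P, c), (P, shift(c, 0, 1)))
    c = comb((1, c), (P, a), (P, shift(a, 1, 0)))   # vertical
    d = comb((1, d), (P, b), (P, shift(b, 1, 0)))
    return a, b, c, d


def predict_nonseparable(q):
    """Neighbour taps first (one 2-D step), then the co-located taps."""
    a, b, c, d = q
    # P1 part: every neighbour access happens here, reading only inputs.
    b1 = comb((1, b), (P, shift(a, 0, 1)))
    c1 = comb((1, c), (P, shift(a, 1, 0)))
    d1 = comb((1, d), (P, shift(c, 0, 1)), (P, shift(b, 1, 0)),
              (P * P, shift(a, 1, 1)))
    # P0 parts: horizontal then vertical, no neighbour access.
    b2 = comb((1, b1), (P, a))
    d2 = comb((1, d1), (P, c1))
    c3 = comb((1, c1), (P, a))
    d3 = comb((1, d2), (P, b2))
    return a, b2, c3, d3
```

The point of the rearrangement in this sketch is that all reads of neighbouring 4-tuples are gathered into a single step, while the remaining steps touch only the values a thread already holds.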
Architectures
| | pixel shader | OpenCL/CUDA | CPU |
|---|---|---|---|
| input/output | off chip | off chip | off chip |
| intermediate results | off chip | on chip | on chip |
| intermediate results in registers | no | yes | no |
| on chip memory | no | 32–96 KiB | 3–35 MiB |
| concurrent threads | thousands | thousands | 2–112 |
| 4-tuples / thread | 1 | 1–4 | thousands |
| view | global | block-based | block-based |
| block size | global | 64² | 512² |
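A small arithmetic sketch relating the block sizes in the table (64 × 64 for OpenCL/CUDA, 512 × 512 for the CPU) to the on-chip memory figures. It assumes 4-byte samples and one block held in place, neither of which is stated on the slide; the function names are hypothetical.

```python
# Illustrative arithmetic only. Assumption: 4-byte (float32) samples.
KIB = 1024
MIB = 1024 * KIB


def block_bytes(side, bytes_per_sample=4):
    """Memory footprint of one square block of samples."""
    return side * side * bytes_per_sample


def tuples_per_block(side):
    """Number of 2x2 polyphase 4-tuples in one square block."""
    return (side // 2) ** 2


print(block_bytes(64) // KIB, "KiB per 64x64 block")      # 16 KiB
print(block_bytes(512) // MIB, "MiB per 512x512 block")   # 1 MiB
print(tuples_per_block(64), "4-tuples per 64x64 block")   # 1024
```

Under these assumptions a 64 × 64 block fits within the 32–96 KiB of on-chip memory listed for OpenCL/CUDA, and a 512 × 512 block fits within the 3–35 MiB listed for the CPU.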
Results
CDF 9/7 Wavelet
[Figure: two plots of throughput in GB/s against image size (100 kpel to 100 Mpel), one for OpenCL (AMD 6970) and one for Pixel Shader (NVIDIA Titan X). Series: separable lifting, separable polyconvolution, separable convolution, non-separable lifting, non-separable polyconvolution, non-separable convolution.]
Future Work
CPU
blade055 2 sockets × 28 cores = 56 CPUs
Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
cache: 32k/256k/35M
vidte 2 sockets × 12 cores = 24 CPUs
Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
cache: 32k/256k/12M
UV2000 14 sockets × 8 cores = 112 cores
Intel(R) Xeon(R) CPU E5-4627 v2 @ 3.30GHz
cache: 32k/256k/16M

Parallel Implementation of the 2-D Discrete Wavelet Transform
