Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structures
1. Large Scale Parallel FDTD Simulation of Full
3D Photonic Crystal Structures
J. S. Ayubi-Moak1, R. Akis1, G. Speyer2, D. C. Stanzione3,
P. Sotirelis4 and S. M. Goodnick1
1Department of Electrical, Computer and Energy Engineering, Arizona State University
2High Performance Computing Initiative, Arizona State University
3Texas Advanced Computing Center (TACC), University of Texas at Austin
4AFRL, Wright-Patterson AFB
3. Background: 3
Photonic Crystals
• Photonic crystals are periodic optical nanostructures that affect the motion
of photons in much the same way that the periodicity of a semiconductor
crystal affects the motion of electrons.
4. Background: 4
Photonic Crystals (examples)
1D
2D
3D
5. Motivation: 5
Photonic Crystals (optical circuits)
• Photonic crystals (photonic band-gap materials) are promising
for true integrated optics.
• Waveguides with sharp bends are possible, making compact
integrated photonic circuits (IPCs) achievable.
(Figure: splitter, cavity, bend, and resonator structures.)
http://photonics.tfp.uni-karlsruhe.de/research.html
http://pages.ief.u-psud.fr
6. Motivation: 6
3D Photonic Crystals
Integrated Opto-Electronic Chip
J. S. Rodgers, Proceedings of SPIE, vol. 5732, Quantum Sensing and
Nanophotonic Devices II, pp. 511-519, March 2005.
M. Qi et al., Nature, vol. 429, p. 538, 2004.
7. Motivation: 7
Modeling 3D Photonic Crystals
• Fully 3D PC structures can be modeled using the
finite-difference time-domain (FDTD) technique.
• A typical PC geometry requires many grid cells
(~10^7 cells) to resolve even a limited number of
periods.
• Such memory-intensive computations are virtually
impossible on a single-processor computer.
• Simulating 3D PC structures requires parallel HPC
architectures and optimized domain decompositions.
We have developed a 3D FDTD
simulator with the desired
capabilities
10. Numerical Methods: 10
Stability
• The simulation timestep (dt) is bounded by the grid-cell dimensions.
• The maximum stable timestep is determined by the
“Courant-Friedrichs-Lewy
(CFL) criterion”
R. Courant et al., IBM Journal, 11, 215 (1967).
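For a uniform 3D Yee grid, the CFL bound referred to above takes the standard form (a standard FDTD result; the slide's own equation did not survive extraction):

```latex
\Delta t \;\le\; \frac{1}{c\,\sqrt{\dfrac{1}{(\Delta x)^2} + \dfrac{1}{(\Delta y)^2} + \dfrac{1}{(\Delta z)^2}}}
```

where c is the maximum wave speed anywhere in the grid and Δx, Δy, Δz are the cell dimensions.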
11. Numerical Methods: 11
Absorbing Boundary Conditions (ABCs)
• FDTD requires a special form of boundary condition.
• The simulation domain must be effectively truncated at its boundary.
• The boundary must:
✓ allow outward wave propagation
✓ allow wave attenuation
✓ minimize reflections back into simulation grid
(Figure: field snapshots with no ABCs vs. with ABCs applied.)
12. Numerical Methods: 12
Absorbing Boundary Conditions (ABCs)
• The Convolutional Perfectly Matched Layer (CPML)
absorbing boundary condition is implemented.
• The PML conductivities are scaled along each PML region,
starting from zero at the inner domain-PML interface and
increasing to a maximum value at the outer boundary.
• Highly effective at absorbing:
✓ low-frequency evanescent waves
✓ waveforms in elongated structures
Kuzuoglu et al., IEEE Microwave Guided Wave Lett., 6 (1996).
Roden et al., Microwave Opt. Technol. Lett., 27(5) (2000).
(Figure: regions where the CPML parameters (σ, α, κ) are defined
along the x-, y-, and z-directions, including the overlapping CPML
regions at edges and corners.)
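The polynomial grading of the CPML parameters, partially visible in the scanned excerpt on this slide, typically takes the standard form (standard CPML practice; the exact exponent m and scaling factor used by this simulator are not stated on the slides):

```latex
\sigma_{pe}(\rho) = \sigma_{\max}\left(\frac{\rho}{d}\right)^{m},
\qquad
\kappa_{e}(\rho) = 1 + (\kappa_{\max}-1)\left(\frac{\rho}{d}\right)^{m},
\qquad
\sigma_{\max} = \sigma_{\mathrm{factor}}\,\frac{m+1}{150\pi\sqrt{\varepsilon_r}\,\Delta x}
```

where ρ is the distance from the inner domain-PML interface and d is the thickness of the PML region.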
14. ASU 3D FDTD-PCS Simulator: 14
Features
• Written in ANSI C using the Message Passing Interface (MPI).
• 3D domain decomposition implemented.
• Redundant computations minimized at boundaries.
• Scalability preserved via dynamic allocation.
• Parallel I/O routines for data output; no interprocessor
communication required for output.
• User-defined “unit cells” and defects.
• Simple input file format (GUI in development).
(Figure: 3D domain decomposition; super “unit cell” with defect.)
15. ASU 3D FDTD-PCS Simulator: 15
Using the simulator
• Simple input file.
• PC unit cells/defect cells are used to build arbitrary PC
structures.
• The defined structure can be rotated in any direction.
• A range of frequencies (or pulse widths) of the excitation
source can be defined.
• User-defined output data.
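The slides describe but do not show the input file itself. Purely as a hypothetical illustration of the kind of simple key-value format described (every keyword below is invented here, not the simulator's actual syntax), such a file might look like:

```
# hypothetical FDTD-PCS input file -- illustrative only
grid_size     250 500 250     # cells in x, y, z
unit_cell     unit_cell.txt
defect_cell   defect_cell.txt
rotation_deg  0 30 0          # rotate structure about x, y, z
source_type   sinusoidal
output        Ez_at_observation_point
```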
16. ASU 3D FDTD-PCS Simulator: 16
Graphical User Interface (GUI)
• Currently in development.
• Written in C++ using the Qt 4.6.2 libraries.
• Window-based input and job script creation.
• Quickly create, save, and modify simulation
parameters.
• Designed to run in an interactive session on AFRL
systems.
18. Simulations: 18
Demonstration of 3D Photonic Band Gap
Simulated PC Structure
(Figure: simulated PC structure on a substrate, with a sinusoidal
source and an observation point where E-field values are captured
at each timestep.)
(250 x 500 x 250 grid) ~ 3 x 10^7 cells
19. Simulations: 19
Demonstration of 3D Photonic Band Gap
• The time-averaged field at the observation point as a
function of frequency indicates the presence of a photonic
band gap.
• Snapshots of the Ez-field profiles validate the observed
optical band gap.
• 10,000 time steps were run for 51 frequencies on 500
processors; this calculation took less than 12 hours of
real time.
20. Simulations: 20
Rotation of Structure
• Rotating the structure by 30 degrees still yields an
optical band gap.
• |Ez| is lower beyond the gap because waves are deflected
away from the observation point.
(Figure: example of a larger-grid simulation; each frame is a
snapshot of |Ez| after 10,000 time steps, with the air regions
labeled.)
(1000 x 1000 x 1000 grid) = 10^9 cells
21. Simulations: 21
Using defects to create waveguides
(Figure: cross-sectional and top-down views of the waveguide
structure.)
22. Simulations: 22
Using defects to create waveguides
• Simulation on HAWK using 200 cores took 9.5 hours to run.
• (900 x 1000 x 360 grid) ~ 3 x 10^8 cells
(Figure: unit cell and defect cell, top view, with air regions
and a sinusoidal excitation source.)
23. Simulations: 23
Modeling Defects
• Defects are introduced by modifying the “defect_cell”
input file.
24. Scalability: 24
3D MIT PC structure on AFRL/ARL systems
(Figure: two strong-scaling plots. Speedup (log scale) and wall
time (s) vs. number of processors (cores), from 100 to ~1100 cores,
with an “ideal” scaling line; series: FALCON (600x600x280 grid),
HAROLD (2000x2000x400 grid), FALCON (2000x2000x400 grid).)
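The speedup curves on this slide follow the usual strong-scaling definition (standard definition; the normalization point is not stated on the slide, so it is most likely relative to the smallest processor count measured):

```latex
S(N_p) \;=\; \frac{T_{\mathrm{wall}}(N_{\mathrm{ref}})}{T_{\mathrm{wall}}(N_p)},
\qquad
S_{\mathrm{ideal}}(N_p) \;\propto\; N_p
```

so deviations from the “ideal” line indicate communication and load-imbalance overheads growing with the core count.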