My Ph.D. Research

A Problem Solving Environment for Parallel Design

Published in: Technology

  1. Major Research: Mechanical Engineering; Mechanical Analysis; Design Optimization; Graphics User Interface; Parallel Computation; Computer Sciences
  2. Problem Solving Environment: Computer Application; Parallel / Distributed; Graphics User Interface; Object-Oriented (OOMPI); Engineering Simulation; Portable Protocol (MPI); Shape Optimization; Network Computing; Finite Element Analysis; IPC; TCP / IP; Automatic Differentiation; System-Level Compiler Technology
  3. MPI (Message-Passing Interface): Portable Parallel Programming
       MPI applications (parallel / distributed software) are portable across the MPI library on each platform. The underlying parallel extensions are NOT compatible, so software written directly for them is NOT portable.
       • Shared Memory Machines: Sun Solaris multi-processors, SGI Challenge, Windows NT, OS/2 SMP. Shared access to main memory (read / write).
       • Distributed Memory Machines: nCUBE, Intel iPSC, Intel Paragon, CM-5. Message passing (send / recv).
       • Workstation Cluster: Sun, SGI, PC Linux, DEC, IBM RS-6000. Message passing (sockets, telnet, rlogin).
       Operating systems shown in the figure: UNIX, Windows NT, OS/2.
  4. Multithreaded Programming
       • Graphics User Interface: X-Window Event + Communication Message (IPC, TCP/IP)
       • Client / Server for Parallel Computation: 1. Client: Computing + Receiving Results 2. Server: Sending Results + Computing
       • Job Scheduling - Task Management: Synchronization + Load Balancing (Parallel Computing without Task Manager, Parallel Computing with Task Manager)
       • JAVA Threads: 1. Synchronization (Image Loading) 2. Concurrency (Slide Show, Clock)
  5. Shallow Water Equations Model (SWE)
  6. Performance of Parallel Computation - ParaGraph
  7. Performance of Parallel Computation - Upshot
  8. An Abstract View of the EPPOD System for the Design of Physical Parts
       [Flow diagram. Recoverable block labels: Interactive Input; XXoX CSG Language; PATRAN Command Language; Model Description; Parallel Mesh Generation; Domain Decomposition; Structural Analysis; Sensitivity Analysis; Parallel Shape Optimization; Adaptive Approach (new mesh density); Mesh Refiner; Postprocessing; Local Optimization; internal iteration; external iteration; Global Optimization.]
  9. The Tools of a Problem Solving Environment: Geometry Specification; Mesh Generation; Mesh Splitting; Structural Analysis; Shape Optimization
       [Stage labels in the figure: geometry specification; mesh & splitting; simulation; optimization.]
  10. The Flow Diagram of the Implemented EPPOD System
       [Flowchart. Recoverable stages: Start; FEM Pre-Processor (Mesh Generation, Domain Decomposition); ADS Initialization; ADS Computation; FEM Processor (Displacement Analysis, Stress Analysis, Object Analysis); FEM Post-Processor (FEM mesh Display, FEM contour Display, Deformation Display); Stop. Decision points: If Restart; If Interrupt; If Error; If Pause; If Converge. Data passed between stages: boundary mesh; FEM mesh; FEA result. Three blocks are marked "(X-Windows Server)".]
  11. EPPOD - Electronic Prototyping for Physical Object Design: Interactive System with FEM Pre-Processor Server, FEM Processor Server, and FEM Post-Processor Server
  12. Geometry Definition
       XXoX: CSG (Constructive Solid Geometry) Language
         cylinder1 = cylinder(0,0,-0.52,1.56,1.04)
         cylinder2 = cylinder(0,0,-0.52,1.04,1.04)
         cylinder3 = cylinder(0,0,0,0.5,5.4)
         box1 = box(-0.26,-0.52,1.04,0.52,1.04,6.6)
         rod = rotate(cylinder1-cylinder2,0,0,0,0,1,0,-90)
         rod = rod | box1 | translate(scale(rod,1,0.5,0.5),0,0,8.16)
         rod = rod | translate(rotate(cylinder3,0,0,0,0,1,0,-90),2.6,0,8.16)
       CSG Tree: UNION UNION TRANSLATE TRANSLATE ROTATE UNION SCALE Cylinder - 3 Box - 1 ROTATE DIFFER Cylinder - 1 Cylinder - 2
  13. XXoX - An Interactive X-Window based Solid Modeling System
       [Screenshot. Window areas: Menu; Information; Operation; Viewing; Drawing; Message; Command.]
  14. Interfaces between XXoX and Foreign Software: Rendering #1; Rendering #2; Rendering #3; VRML; PATRAN
  15. Interfaces between XXoX and Foreign Software: Rendering #1; Rendering #2; Rendering #3; XPoly; PATRAN
  16. Virtual Reality Modeling Language (VRML) Interface
  17. Parallel Mesh Generation and Decomposition Methodology
       1. An adaptive mesh algorithm is invoked to generate an initial "coarse" mesh.
       2. A scheme to split the initial mesh into equal-sized subdomains is applied.
       3. A linking routine to form the new subdomain boundaries is called.
       4. The mesh algorithm of step 1 is applied to generate a finer mesh in parallel.
       5. An optimal mesh splitting scheme to minimize the bisection width is applied.
  18. Windows of the GUI for Mesh Generation and Splitting
       • Triangular element mesh generation & element-wise domain decomposition
       • Triangular element mesh generation & node-wise domain decomposition
       • Quadrilateral element mesh generation & element-wise domain decomposition
  19. Adaptive Approach: 15 refine points in one adaptive step
       [Two deformed-shape panels. Meshes: 136 nodes, 222 elements; 79 nodes, 117 elements. Stress: <-7789.59, 6769.68>; <-16446.6, 8286.24>. Error: 0.164/2.10E-5; 1.427/8.68E-5.]
  20. Performance of Parallel Mesh Generation - Engine Rod. Speedup - parallel: T1 / Tp. Speedup - sequential: T1s / Tp. Utilization Count: states of idle, overhead, and busy as a function of time. Utilization Summary: overall cumulative percentage of time in idle, overhead, and busy states.
  21. Performance of Parallel Mesh Generation - Torque Arm. Speedup - parallel: T1 / Tp. Speedup - sequential: T1s / Tp. Utilization Count: states of idle, overhead, and busy as a function of time. Utilization Summary: overall cumulative percentage of time in idle, overhead, and busy states.
  22. Performance of Parallel Mesh Generation - Engine Axis. Speedup - parallel: T1 / Tp. Speedup - sequential: T1s / Tp. Utilization Count: states of idle, overhead, and busy as a function of time. Utilization Summary: overall cumulative percentage of time in idle, overhead, and busy states.
  23. Performance of Parallel Mesh Generation (Paragon) - Ex1d. Speedup: T1 / Tp & T1s / Tp
  24. Performance of Parallel Mesh Generation (Paragon) - Ex1d. Speedup: T1 / Tp & T1s / Tp
  25. Performance of Parallel Mesh Generation (nCUBE) - Ex3a. Speedup: T1 / Tp & T1s / Tp
  26. Performance of Parallel Mesh Generation (nCUBE) - Ex3a. Speedup: T1 / Tp & T1s / Tp
  27. Mesh Decomposition and Sparse Matrix - Engine Rod
       • DFS - basic: Communication 17 / 10; Bandwidth 4 / 24; Connectivity 51; IBV 28
       • BFS - strip-wise: Communication 15 / 8; Bandwidth 9 / 16; Connectivity 30; IBV 22
       • BFS - domain-wise: Communication 8 / 4; Bandwidth 7 / 15; Connectivity 16; IBV 15
       • Eigenvector Spectral: Communication 7 / 5; Bandwidth 7 / 22; Connectivity 14; IBV 12
       • Cartesian - local optimum: Communication 7 / 4; Bandwidth 6 / 15; Connectivity 14; IBV 14
       • Polar - recursive bisection: Communication 16 / 11; Bandwidth 2 / 6; Connectivity 48; IBV 29
       • Polar - local optimum: Communication 8 / 5; Bandwidth 5 / 11; Connectivity 16; IBV 15
       • Inertia - first eigenvector: Communication 12 / 6; Bandwidth 7 / 17; Connectivity 36; IBV 19
  28. Mesh Decomposition and Sparse Matrix - Engine Cap
       • DFS - basic: Communication 46 / 14; Bandwidth 5 / 41; Connectivity 322; IBV 222
       • BFS - strip-wise: Communication 77 / 42; Bandwidth 19 / 45; Connectivity 154; IBV 363
       • BFS - domain-wise: Communication 25 / 9; Bandwidth 9 / 24; Connectivity 150; IBV 151
       • Eigenvector Spectral: Communication 19 / 8; Bandwidth 11 / 44; Connectivity 95; IBV 125
       • Cartesian - local optimum: Communication 25 / 9; Bandwidth 7 / 17; Connectivity 120; IBV 152
       • Polar - recursive bisection: Communication 39 / 16; Bandwidth 3 / 9; Connectivity 273; IBV 188
       • Polar - local optimum: Communication 20 / 8; Bandwidth 8 / 22; Connectivity 80; IBV 124
       • Inertia - first eigenvector: Communication 57 / 30; Bandwidth 13 / 42; Connectivity 255; IBV 287
  29. Mesh Decomposition and Sparse Matrix - Engine Axis
       • DFS - basic: Communication 286 / 143; Bandwidth 39 / 818; Connectivity 858; IBV 412
       • BFS - strip-wise: Communication 62 / 43; Bandwidth 55 / 101; Connectivity 124; IBV 80
       • BFS - domain-wise: Communication 18 / 10; Bandwidth 34 / 73; Connectivity 36; IBV 36
       • Eigenvector Spectral: Communication 20 / 10; Bandwidth 111 / 861; Connectivity 40; IBV 36
       • Cartesian - local optimum: Communication 67 / 48; Bandwidth 47 / 163; Connectivity 134; IBV 85
       • Polar - recursive bisection: Communication 131 / 104; Bandwidth 13 / 57; Connectivity 387; IBV 251
       • Polar - local optimum: Communication 51 / 27; Bandwidth 22 / 135; Connectivity 102; IBV 100
       • Inertia - first eigenvector: Communication 65 / 49; Bandwidth 54 / 219; Connectivity 130; IBV 114
  30. Performance of Mesh Decomposition
       Methods: CLO = Cartesian Local Optimum; PLO = Polar Local Optimum; CLE = Cartesian Longest Expansion; MRSB = Multilevel Recursive Spectral Bisection; Greedy = Domain-Wise BFS; RCM = Strip-Wise BFS; Inert = Inertia - First Eigenvector
       Metrics compared: Computation Time; Maximum Interface Length; Subdomain Connectivity; Interpartitioning Boundary Vertices (IBV); Bandwidth
  31. Numerical Performance for the Traditional Shape Optimization
       [Design snapshots: Iter# 0 (Initial Design); Iter# 1 (eval# 17); Iter# 2 (eval# 33); Iter# 3 (eval# 49); Iter# 4 (eval# 65); Iter# 5,6 (eval# 81,97); Iter# 7-10 (eval# 100-122).]
  32. Numerical Performance for the Model Coordination Method
       [Design snapshots: Iter# 0 (Initial Design); Local Iter# 1 (eval# 5/17); Local Iter# 2 (eval# 10/34); Local Iter# 3 (eval# 15/51); Global Iter# 1; Global Iter# 2; Global Iter# 3.]
  33. Parallel Electronic Prototyping - Engine Rod
       Shape Optimization: Z = Area → min.
         displacement ≤ 0.0005
         0.1 ≤ Xi-j ≤ 1.0, i = 1, 2; j = 1, 2
         0.5 ≤ X3-j ≤ 1.0, j = 1, 2
         0.5 ≤ X4-j ≤ 1.0, j = 1, …, 7
       Design variables labeled in the figure: X1-1, X1-2, X2-1, X2-2, X3-1, X3-2, X4-1 to X4-7
       [Two models: 81 nodes, 117 elements; 49 nodes, 63 elements. Panels for each: Deformed shape; Displacement; Stress.]
  34. Parallel Electronic Prototyping - Torque Arm
       Shape Optimization: Z = Area → min.
         displacement ≤ 0.0015
         stress ≤ 10000
         0.1 ≤ Xi-j ≤ 1.0, i = 1, 2; j = 1, 2
         0.5 ≤ Xi-j ≤ 1.0, i = 3, 4; j = 1, 2
         0.5 ≤ X5-j ≤ 1.0, j = 1, …, 5
         0.2 ≤ Xi-j ≤ 1.2, i = 6, 7; j = 1, 2
         0.35 ≤ X8-j ≤ 0.6, j = 1, 2
         0.35 ≤ X9-j ≤ 1.0, j = 1, …, 5
         0.2 ≤ R1 = X10 ≤ 0.45
         0.15 ≤ R2 = X11 ≤ 0.275
       [Two models: 120 nodes, 148 elements; 110 nodes, 152 elements. Panels for each: Deformed shape; Displacement; Stress.]
  35. Performance of Parallel Shape Optimization - Engine Rod. Speedup - parallel: T1 / Tp. Utilization Summary: overall cumulative percentage of time in idle, overhead, and busy states. Utilization Count: states of idle, overhead, and busy as a function of time. Shown for proc#=4 and proc#=16.
  36. Performance of Parallel Shape Optimization - Torque Arm. Speedup - parallel: T1 / Tp. Utilization Summary: overall cumulative percentage of time in idle, overhead, and busy states. Utilization Count: states of idle, overhead, and busy as a function of time. Shown for proc#=4 and proc#=16.
  37. Finite Element Analysis Post-Processor - Xcontour
       [Panels: Contour Lines; Deformed Shape; Result of Structural Stress; Numbering of FEM Mesh.]
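
The performance slides report two speedup ratios (T1 / Tp and T1s / Tp) and a utilization summary given as the cumulative percentage of time spent idle, in overhead, and busy. The sketch below shows how such figures could be computed from measured timings. It is a minimal illustration, not the tooling used in the slides: the function names and all timing values are hypothetical, and reading T1 as the parallel code on one processor and T1s as a separate sequential code is an assumption based on the slide labels.

```python
from typing import Dict


def speedup(t_reference: float, t_parallel: float) -> float:
    """Return t_reference / t_parallel.

    With t_reference = T1 this is the "parallel" speedup T1 / Tp;
    with t_reference = T1s it is the "sequential" speedup T1s / Tp.
    """
    if t_reference <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_reference / t_parallel


def utilization_summary(seconds_by_state: Dict[str, float]) -> Dict[str, float]:
    """Convert time spent per state into cumulative percentages of the total."""
    total = sum(seconds_by_state.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {state: 100.0 * t / total for state, t in seconds_by_state.items()}


if __name__ == "__main__":
    # Hypothetical timings in seconds; not taken from the slides.
    t1, t1s, tp = 40.0, 32.0, 8.0
    print("T1 / Tp  =", speedup(t1, tp))
    print("T1s / Tp =", speedup(t1s, tp))
    print(utilization_summary({"busy": 6.0, "overhead": 1.0, "idle": 1.0}))
```

Comparing against T1s is usually the stricter measure, because a one-processor run of parallel code still carries its parallel bookkeeping.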

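The CSG statements on the Geometry Definition slide build a solid by combining primitives with difference (`-`), union (`|`), and rotate, scale, and translate operations. The sketch below mirrors that expression syntax in Python to show how such statements could produce a tree of operations. It is an illustration only, not the XXoX implementation: the `Node` class and helper functions are hypothetical, and because the slides do not state what the numeric arguments mean, they are stored verbatim without geometric interpretation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """One node of a CSG tree: an operation, its raw arguments, its children."""
    op: str
    args: Tuple[float, ...] = ()
    children: Tuple["Node", ...] = ()

    def __or__(self, other: "Node") -> "Node":   # a | b  -> UNION
        return Node("UNION", (), (self, other))

    def __sub__(self, other: "Node") -> "Node":  # a - b  -> DIFFER
        return Node("DIFFER", (), (self, other))


def cylinder(*args: float) -> Node:
    return Node("cylinder", args)


def box(*args: float) -> Node:
    return Node("box", args)


def rotate(child: Node, *args: float) -> Node:
    return Node("ROTATE", args, (child,))


def scale(child: Node, *args: float) -> Node:
    return Node("SCALE", args, (child,))


def translate(child: Node, *args: float) -> Node:
    return Node("TRANSLATE", args, (child,))


def count_ops(node: Node) -> Counter:
    """Count operations by walking the tree (a reused subtree is counted each time)."""
    counts = Counter({node.op: 1})
    for child in node.children:
        counts += count_ops(child)
    return counts


def render(node: Node, depth: int = 0) -> str:
    """Return an indented text view of the tree."""
    lines = ["  " * depth + node.op]
    for child in node.children:
        lines.append(render(child, depth + 1))
    return "\n".join(lines)


# The statements from the slide, with argument values copied as-is.
cylinder1 = cylinder(0, 0, -0.52, 1.56, 1.04)
cylinder2 = cylinder(0, 0, -0.52, 1.04, 1.04)
cylinder3 = cylinder(0, 0, 0, 0.5, 5.4)
box1 = box(-0.26, -0.52, 1.04, 0.52, 1.04, 6.6)
rod = rotate(cylinder1 - cylinder2, 0, 0, 0, 0, 1, 0, -90)
rod = rod | box1 | translate(scale(rod, 1, 0.5, 0.5), 0, 0, 8.16)
rod = rod | translate(rotate(cylinder3, 0, 0, 0, 0, 1, 0, -90), 2.6, 0, 8.16)

if __name__ == "__main__":
    print(render(rod))
```

Because `rod` is reused inside `scale(...)`, this walk visits the first rotate-and-difference subtree twice; a modeler that shares subtrees would store it once.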