SlideShare a Scribd company logo
1 of 70
Download to read offline
Faster ray packets
     through
  vertex culling
                 SBR07
Faster Ray Packets - Triangle Intersection through Vertex Culling
                                                                                                                                                                             Faster ray packets
                                                                                                                                                                                  through
                                                                   Alexander Reshetov
                                                                       Intel Corporation

ABSTRACT                                                                           early if either one of the five conditions is false. For implementations on
                                                                                   GPU [14] or Cell [3], branchless algorithms are more efficient.
Acceleration structures are used in ray tracing to sharply reduce number           For most scenes, the great majority of ray- triangle intersection tests can be
of ray-triangle intersection tests at the expense of traversing such               eliminated by creating special acceleration structures such as kd-trees,
structures. Bigger structures eliminate more tests, but their traversal            grids, Bounding Volume Hierarchies (BVH), which exploit spatial
becomes less efficient, especially for ray packets, for which number of            coherency of a scene ([2], [7], [22], [23]). During rendering, the




                                                                                                                                                                               vertex culling
inactive rays increases at the lower levels of the acceleration structures.        acceleration structure is traversed and all triangles in visited leaf nodes are
For dynamic scenes, building or updating acceleration structures is one            tested for intersections. For structures which utilize spatial subdivision
of the major performance impediments.                                              (kd-trees, grids), it is possible to create an instance of this type for which
We propose a new way to reduce the total number of tests by creating a             total number of tests will be just slightly more than the number of ray-
special transient frustum every time a leaf is traversed by a packet of rays.      segments (one test per ray), but this may not result in the best
This frustum contains intersections of active rays with a leaf node and            performance. The optimal size of the acceleration structure is dependent
eliminates over 90% of all potential tests. It allows a tenfold reduction in       on how fast it could be traversed compared with the average speed of the
size of acceleration structure whilst still achieving a better performance.        used ray-triangle intersection test. For hierarchical acceleration structures,
!quot;#$%&'()!!quot;#$%&'&(quot;)'*++!quot;#$*,-(&./+0'.++&0'!()1/+0'. 2 &0(')1*,++()&,03,!&(quot;)!
                                                        + +                        the higher levels of the hierarchy are usually the most effective in reducing
                                                                                   the number of potential tests. Going down in the spatial hierarchy, nodes
1      INTRODUCTION                                                                are becoming smaller and smaller and some of the rays in the packet may
Finding the intersection of a ray and a triangle is equivalent to solving          miss them. This negatively affects the utilization of SIMD units [15].
the linear system of three equations                                               In recent years, focus of ray tracing research has shifted to dynamic
!quot;#quot;$quot;%quot;4quot;&'quot;#quot;(quot;)&*quot;!quot;#'+quot;#quot;,quot;)&-quot;!quot;#'+                                   (1)     scenes ([8], [12], [17], [19], [22], [23]), for which it is necessary to
                                                                                   optimize for the total execution time (build/update time plus rendering
with five additional requirements
                                                                                   time). Frequently, to improve the build time, only axis-aligned bounding
$quot;quot;%quot;quot;"quot;%quot;quot;"!.%quot;                                                         (2)     boxes (AABB) of triangles are used even for kd-trees ([8], [17]). For
$quot;quot;%quot;quot;'(quot;quot;quot;$quot;quot;%quot;quot;)(quot;quot;quot;'*)quot;quot;%quot;quot;+




                                                                                                                                                                            Alexander Reshetov
                                                                           (3)     this reason, it is important to analyze performance or ray-triangle
The left part of the system (1) defines a ray with the origin !quot; and the           intersection tests in the situation when intersection could not be
direction %; the right part ! a point inside the triangle with vertices &'/quot;&*/quot;   eliminated by testing a ray against the AABB of a triangle. This
andquot;&-. The unknown variables are: $ ! distance to the intersection point          approach was chosen by Kensler and Shirley [10], in which different 3D
quot;#$%& '()& #*+,-& $#./.0, and (/quot; , ! Barycentric coordinates of the point         versions of intersection algorithms were analyzed and the best one was
inside the triangle. It is required that the found intersection point be           found via genetic optimization of the fitness function.
closer '$& '()& #*+,-&origin than the previously found one (2) and within          Even when a ray does intersect the AABB, the probability of it
'()&'#.*0/1),-&2$304*#.)-&(3).                                                     intersecting the triangle is only about 20%. It is important to swiftly
One way to solve the system (1) is to use 5#*%)#,-&rule, which allows              reject the remaining 80%. Amongst the approaches analyzed in the
finding numerical values directly for all three variables. This will result        literature [4] are: use of interval arithmetic; testing the intersection of a
in the implementations similar to Möller-Trumbore [13]. On the other               frustum containing rays (typically represented as 4 corner rays) with a
hand, the system (1) also could be solved using Gaussian elimination. It           triangle [5]; and culling of the AABB of the individual triangle against
is computationally efficient only if the distance $ is calculated first            the frustum. In these approaches per-packet data structures are computed
(which, of course, could also be computed directly by using the well-              before the traversal and then used to eliminate unnecessary tests. The
known expression for the distance to a plane along a ray). By                      culling techniques could also be used for the individual rays as well, as
substituting the found value of $ in any two of the remaining equations,           first proposed by Snyder and Barr in 1987 paper [18], in which a ray was
( and , variables could be found. This substitution geometrically                  first tested against the box formed by the intersection of the $2:);',-&
corresponds to projecting the triangle and the intersection point to the           bounding box and a visited cell.




                                                                                                                                                                            IEEE/EG Symposium
plane defined by the coordinates of the two chosen equations and leads             Few years ago, ray-tracing researchers were mostly interested in
to algorithms comparable with one given by Badouel [1]. The S.S.E.                 rendering static scenes for which persistent data structures were created,
implementation for groups of four rays, proposed by Wald [21], is based            optimized for all possible camera positions. Eventually, focus was shifted
on the equivalent 2D projection. Example of the S.S.E. based 3D                    to dynamic scenes, for which data structures are created (or updated)
approach (formulated in terms of Plücker coordinates) could be found in            every frame. There are also initial studies in how to create kd-trees lazily,
6)0'(.0,-& 7(898 [2]. For vector implementations, conditions (2-3) are             deeply subdividing only those areas of space that are visited during
usually converted to masks used to selectively update the $, (, and ,              rendering of a particular frame [9]. In a sense, we pursue this trend to the
values. It makes sense to use packets bigger than the intrinsic SIMD               extreme, when transient data structures are created for every packet and




                                                                                                                                                                                      on
width of the targeted architecture as it amortizes per-triangle                    every visited leaf node. This allows very tight structures, optimized for a
computations and saves bandwidth. It also allows to make decisions                 given packet and a cell. Surprisingly, these fleeting structures could be
summarily for the whole packet without processing individual rays, for             created and handled very efficiently, building on top of the clipping
example using frustum or interval arithmetic ([5], [7], [11], [22], [23]).         algorithms, characteristic for the modern packet traversal techniques.
There is no one universal way of solving the system (1) which will be
suitable for all situations and architectures. In particular, CPU
implementations could use conditions (2-3) for exiting from the test

quot;*+,-.) ,.quot;/,0'quot;&1"(2quot;3%45-03quot;.16%+7




                                                                                                                                                                           Interactive Ray Tracing
'!!,$&,5+&quot;+6777879+:.#$quot;3(%#+quot;)+6)&,0'!&(;,+<'.+=0'!()1+>??@+                     8-9:&quot;7 ;D+=A,+ &0')3(,)&+ H0%3&%#+ (3+!0,'&,5+,;,0.+ &(#,+ '+0'.+ $'!I,&+ ;(3(&3+ '+
A&&$B88CCCD%)(2%*#D5,80&?@8<=?@DA&#*+                                              *,'H+ )quot;5,+ J3quot;*(585'3A,5+ *(),3+ !quot;00,3$quot;)5+ &quot;+ '!&(;,8()'!&(;,+ 0'.3KD+ L)*.+
E*#/+9,0#')./+:,$&,#F,0+G?2G>/+>??@7                                               '!&(;,+ 0'.3+ '0,+ %3,5+ &quot;+ !quot;#$%&,+ &A,+ H0%3&%#D+ M,H&B+ 1,),0'*+ 0'.3D+ <(1A&B+
                                                                                   $0(#'0.+0'.3+J!quot;##quot;)+quot;0(1()KD+




                                                                                                                                                                                     2007
                                                                                                                                                                                           SBR07
Demo


       SBR07
Key contribution


               SBR07
1/10           90 %
Spatial Data
                Culling Rate
Struture Size
                        SBR07
Ray-triangle
intersection cost

                SBR07
1 Cycle
(on average)


               SBR07
It’s
extremely
   fast!
            SBR07
Algorithm


            SBR07
What VC do?
(VC = Vertex Culling)


                        SBR07
Every time
a packet enters
 a leaf node,
 do following
                  SBR07
Create packet

      Precompute coeff

for each triangles
     triangle-packet Cull
                if failed

     triangle-ray intersect
                              SBR07
Create packet

      Precompute coeff

for each triangles
     triangle-packet Cull
                if failed

     triangle-ray intersect
                              SBR07
Create packet
from active rays


       Active ray

       Inactive ray   SBR07
Create packet

      Precompute coeff

for each triangles
     triangle-packet Cull
                if failed

     triangle-ray intersect
                              SBR07
Think about

triangle-packet
      test



          SBR07
Axis aligned




         SBR07
Frustum plane




         SBR07
v
o
    (v - o) . n < 0
n     = outside


                SBR07
v     any of dot(v, n) < 0
    = outside of the frustum




                      SBR07
all of dot(v, n) >= 0
    = inside of the frustum



v




                      SBR07
v0
            all of (v .o) < 0
v1
          = triangle is outside
               of the plane
         v2
     o


     n                   SBR07
v0
             Finally, we got

v1           any of (v . n) > 0
          = frustum and triangle
           (possibly) intersects
     v2



                           SBR07
1 vertex, 4 plane inside/
outside test computation
   can be reduced to

    d = [Vz, -Vy, -Vz, Vy]
+ [Vx, Vx, Vx, Vxx] q1 + q0

                        SBR07
d = [Vz, -Vy, -Vz, Vy]
+ [Vx, Vx, Vx, Vxx] q1 + q0

         q1, q0 =
      SIMD variable.
      Pre-computable
when packet enters leaf node.
                          SBR07
Culling cost
per vertex.

               SBR07
1 SIMD mul
     +
2 SIMD add

             SBR07
Very efficient


                SBR07
This is the core of
         VC

                 SBR07
Create packet

      Precompute coeff

for each triangles
     triangle-packet Cull
                if failed

     triangle-ray intersect
                              SBR07
Just apply efficient
   vertex-frustum
 culling for each
  triangle vertex
                 SBR07
Culling cost per triangle
      per packet =
        3(vtx) *
      3(mul,add) +
        compare
                       SBR07
Create packet

      Precompute coeff

for each triangles
     triangle-packet Cull
                if failed

     triangle-ray intersect
                              SBR07
SBR07
First do
corner rays and
    triangle
       test
              SBR07
Such a case can not
 be culled by VC.


              SBR07
SBR07
all(u) < 0 or
         all(v) < 0 or
v   u
         all(u+v) > 1
               =
    All rays in the packet
      does not intersect
           triangle.SBR07
If all failed,
        do
ray-triangle test
for each rays in
  the packet.   SBR07
Thats all!


             SBR07
Benefit of VC


               SBR07
1/10           90 %
Spatial Data
                Culling Rate
Struture Size
                        SBR07
less traversal   much traversal
  much VCs         less VCs
                           SBR07
Right has 12x more
  nodes than left,
but right is just 30%
  faster than left.     SBR07
spatial data structure
Coarse                      Fine
Sequential access   Random access
    more isects       less isects
  less traversals   more traversals
   less memory      much memory
                          BVH
                          BIH
                         kd-tree
                               SBR07
spatial data structure
Coarse                      Fine
Sequential access    Random access
    more isects        less isects
  less traversals    more traversals
   less memory       much memory


                VC            SBR07
The strongest
acceleration
  structure
on the earth?
                SBR07
Implementation


                 SBR07
Here is my
implementation of
    VC + BVH
               SBR07
C++, for VC   C, for gcc



              My impl.




 Original          SBR07
Performance
     &
  Profiling
              SBR07
512x512
 10k tris
SAH-BVH     SBR07
Compile flag
       -O3
  -ffast-math
-mfpmath=sse
     -msse2     SBR07
4.7M rays/sec

 @ 2.16 GHz
 Core2 Mac
                SBR07
= 460 Cycles/ray

    Hmm...
               SBR07
Statistics
nPackets                           :   8192
nTraversedLeafs(per packet)        :   43066 (5.257080)
nTraversedInnerNodes(per packet)   :   109922 (13.418213)
nPacketBBoxTests(per packet)       :   228036 (27.836426)
  nBVHAnyHits                      :   131968 (57.871564 %)
  nBVHAllMisses                    :   61462 (26.952762 %)
  nBVHLastResorts                  :   34606 (15.175674 %)
    nBVHLastResortRayBBoxTests     :   322288
    (per last resort tests)        :   9.313067
nTestedTriangles(per packet)       :   314650 (38.409424)
  nCulledByBeams(rate)             :   223618 (71.068807 %)
  nCulledByUV(rate)                :   18536 (5.890990 %)
nActuallyTestedTrianglesPerRay     :   2.060669

                                                      SBR07
Core2 Mac




            SBR07
BVH


  VC




SBR07
Redundant calcs...

              SBR07
Non SIMDized
  codes...
          SBR07
Yeah, I have to
  optimize BVH
   code before
optimizing VC :-)
                SBR07
Implementation
    period

                 SBR07
Vertex    packet
Culling    BVH



6 Days    2 Days
               SBR07
Very easy to implement!

MLRTA requires a half
  year in my case.

                     SBR07
TODO


       SBR07
Much more
optimization.

                SBR07
Try to use VC
 technique for
incoherent rays
                  SBR07
Here is my
           VC + BVH
         implementation

http://lucille.svn.sourceforge.net/svnroot/lucille/
                 angelina/eleonore/

                                               SBR07
?


    SBR07
Thank you!


             SBR07

More Related Content

Similar to SBR07 Faster Ray Packets - Triangle Intersection through Vertex Culling

International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
Revisiting Projection Methods over Automatic Oct-tree Meshes
Revisiting Projection Methods over Automatic Oct-tree MeshesRevisiting Projection Methods over Automatic Oct-tree Meshes
Revisiting Projection Methods over Automatic Oct-tree MeshesAbhishek Jain
 
M ESH S IMPLIFICATION V IA A V OLUME C OST M EASURE
M ESH S IMPLIFICATION V IA A V OLUME C OST M EASUREM ESH S IMPLIFICATION V IA A V OLUME C OST M EASURE
M ESH S IMPLIFICATION V IA A V OLUME C OST M EASUREijcga
 
Testing the global grid of master events for waveform cross correlation with ...
Testing the global grid of master events for waveform cross correlation with ...Testing the global grid of master events for waveform cross correlation with ...
Testing the global grid of master events for waveform cross correlation with ...Ivan Kitov
 
Rabid Euclidean direction search algorithm for various adaptive array geometries
Rabid Euclidean direction search algorithm for various adaptive array geometriesRabid Euclidean direction search algorithm for various adaptive array geometries
Rabid Euclidean direction search algorithm for various adaptive array geometriesjournalBEEI
 
01 4941 10014-1-sm ediqbal
01 4941 10014-1-sm ediqbal01 4941 10014-1-sm ediqbal
01 4941 10014-1-sm ediqbalIAESIJEECS
 
Investigation of repeated blasts at Aitik mine using waveform cross correlation
Investigation of repeated blasts at Aitik mine using waveform cross correlationInvestigation of repeated blasts at Aitik mine using waveform cross correlation
Investigation of repeated blasts at Aitik mine using waveform cross correlationIvan Kitov
 
Comparative study of results obtained by analysis of structures using ANSYS, ...
Comparative study of results obtained by analysis of structures using ANSYS, ...Comparative study of results obtained by analysis of structures using ANSYS, ...
Comparative study of results obtained by analysis of structures using ANSYS, ...IOSR Journals
 
Effect of Residual Modes on Dynamically Condensed Spacecraft Structure
Effect of Residual Modes on Dynamically Condensed Spacecraft StructureEffect of Residual Modes on Dynamically Condensed Spacecraft Structure
Effect of Residual Modes on Dynamically Condensed Spacecraft StructureIRJET Journal
 
An enhanced fletcher-reeves-like conjugate gradient methods for image restora...
An enhanced fletcher-reeves-like conjugate gradient methods for image restora...An enhanced fletcher-reeves-like conjugate gradient methods for image restora...
An enhanced fletcher-reeves-like conjugate gradient methods for image restora...IJECEIAES
 
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...Applied Research and Photonics, Inc.
 
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...Applied Research and Photonics, Inc.
 
Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...
Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...
Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...CSCJournals
 
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...jantjournal
 
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...jantjournal
 
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...jantjournal
 

Similar to SBR07 Faster Ray Packets - Triangle Intersection through Vertex Culling (20)

International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Revisiting Projection Methods over Automatic Oct-tree Meshes
Revisiting Projection Methods over Automatic Oct-tree MeshesRevisiting Projection Methods over Automatic Oct-tree Meshes
Revisiting Projection Methods over Automatic Oct-tree Meshes
 
Gdistance vignette
Gdistance vignetteGdistance vignette
Gdistance vignette
 
M ESH S IMPLIFICATION V IA A V OLUME C OST M EASURE
M ESH S IMPLIFICATION V IA A V OLUME C OST M EASUREM ESH S IMPLIFICATION V IA A V OLUME C OST M EASURE
M ESH S IMPLIFICATION V IA A V OLUME C OST M EASURE
 
Testing the global grid of master events for waveform cross correlation with ...
Testing the global grid of master events for waveform cross correlation with ...Testing the global grid of master events for waveform cross correlation with ...
Testing the global grid of master events for waveform cross correlation with ...
 
Rabid Euclidean direction search algorithm for various adaptive array geometries
Rabid Euclidean direction search algorithm for various adaptive array geometriesRabid Euclidean direction search algorithm for various adaptive array geometries
Rabid Euclidean direction search algorithm for various adaptive array geometries
 
01 4941 10014-1-sm ediqbal
01 4941 10014-1-sm ediqbal01 4941 10014-1-sm ediqbal
01 4941 10014-1-sm ediqbal
 
Investigation of repeated blasts at Aitik mine using waveform cross correlation
Investigation of repeated blasts at Aitik mine using waveform cross correlationInvestigation of repeated blasts at Aitik mine using waveform cross correlation
Investigation of repeated blasts at Aitik mine using waveform cross correlation
 
Structure prediction
Structure predictionStructure prediction
Structure prediction
 
Final Beam Paper
Final Beam PaperFinal Beam Paper
Final Beam Paper
 
synopsis_divyesh
synopsis_divyeshsynopsis_divyesh
synopsis_divyesh
 
Comparative study of results obtained by analysis of structures using ANSYS, ...
Comparative study of results obtained by analysis of structures using ANSYS, ...Comparative study of results obtained by analysis of structures using ANSYS, ...
Comparative study of results obtained by analysis of structures using ANSYS, ...
 
Effect of Residual Modes on Dynamically Condensed Spacecraft Structure
Effect of Residual Modes on Dynamically Condensed Spacecraft StructureEffect of Residual Modes on Dynamically Condensed Spacecraft Structure
Effect of Residual Modes on Dynamically Condensed Spacecraft Structure
 
An enhanced fletcher-reeves-like conjugate gradient methods for image restora...
An enhanced fletcher-reeves-like conjugate gradient methods for image restora...An enhanced fletcher-reeves-like conjugate gradient methods for image restora...
An enhanced fletcher-reeves-like conjugate gradient methods for image restora...
 
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
 
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
Nanoscale metrology of line patterns on semiconductor by continuous wave tera...
 
Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...
Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...
Segmentation Based Multilevel Wide Band Compression for SAR Images Using Coif...
 
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
 
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
 
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
ARRAY FACTOR OPTIMIZATION OF AN ACTIVE PLANAR PHASED ARRAY USING EVOLUTIONARY...
 

More from Syoyo Fujita

More from Syoyo Fujita (7)

Herd
HerdHerd
Herd
 
Docker infiniband
Docker infinibandDocker infiniband
Docker infiniband
 
DIY InfiniBand networking
DIY InfiniBand networkingDIY InfiniBand networking
DIY InfiniBand networking
 
MicroServer + InfiniBand + ZFS
MicroServer + InfiniBand + ZFSMicroServer + InfiniBand + ZFS
MicroServer + InfiniBand + ZFS
 
AObench with Emscripten
AObench with EmscriptenAObench with Emscripten
AObench with Emscripten
 
First Order
First OrderFirst Order
First Order
 
Muda Proposal
Muda ProposalMuda Proposal
Muda Proposal
 

Recently uploaded

Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Servicediscovermytutordmt
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.Aaiza Hassan
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst SummitHolger Mueller
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewasmakika9823
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechNewman George Leech
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Roomdivyansh0kumar0
 
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service PuneVIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service PuneCall girls in Ahmedabad High profile
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation SlidesKeppelCorporation
 
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in managementchhavia330
 
Grateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfGrateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfPaul Menig
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsApsara Of India
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...lizamodels9
 
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCRsoniya singh
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...lizamodels9
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Serviceankitnayak356677
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear RegressionRavindra Nath Shukla
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...anilsa9823
 

Recently uploaded (20)

Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Service
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst Summit
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman Leech
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service PuneVIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Kirti 8617697112 Independent Escort Service Pune
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in management
 
Grateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfGrateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdf
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
 
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear Regression
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
 

SBR07 Faster Ray Packets - Triangle Intersection through Vertex Culling

  • 1. Faster ray packets through vertex culling SBR07
  • 2. Faster Ray Packets - Triangle Intersection through Vertex Culling Faster ray packets through Alexander Reshetov Intel Corporation ABSTRACT early if either one of the five conditions is false. For implementations on GPU [14] or Cell [3], branchless algorithms are more efficient. Acceleration structures are used in ray tracing to sharply reduce number For most scenes, the great majority of ray- triangle intersection tests can be of ray-triangle intersection tests at the expense of traversing such eliminated by creating special acceleration structures such as kd-trees, structures. Bigger structures eliminate more tests, but their traversal grids, Bounding Volume Hierarchies (BVH), which exploit spatial becomes less efficient, especially for ray packets, for which number of coherency of a scene ([2], [7], [22], [23]). During rendering, the vertex culling inactive rays increases at the lower levels of the acceleration structures. acceleration structure is traversed and all triangles in visited leaf nodes are For dynamic scenes, building or updating acceleration structures is one tested for intersections. For structures which utilize spatial subdivision of the major performance impediments. (kd-trees, grids), it is possible to create an instance of this type for which We propose a new way to reduce the total number of tests by creating a total number of tests will be just slightly more than the number of ray- special transient frustum every time a leaf is traversed by a packet of rays. segments (one test per ray), but this may not result in the best This frustum contains intersections of active rays with a leaf node and performance. The optimal size of the acceleration structure is dependent eliminates over 90% of all potential tests. It allows a tenfold reduction in on how fast it could be traversed compared with the average speed of the size of acceleration structure whilst still achieving a better performance. used ray-triangle intersection test. For hierarchical acceleration structures, !quot;#$%&'()!!quot;#$%&'&(quot;)'*++!quot;#$*,-(&./+0'.++&0'!()1/+0'. 2 &0(')1*,++()&,03,!&(quot;)! + + the higher levels of the hierarchy are usually the most effective in reducing the number of potential tests. Going down in the spatial hierarchy, nodes 1 INTRODUCTION are becoming smaller and smaller and some of the rays in the packet may Finding the intersection of a ray and a triangle is equivalent to solving miss them. This negatively affects the utilization of SIMD units [15]. the linear system of three equations In recent years, focus of ray tracing research has shifted to dynamic !quot;#quot;$quot;%quot;4quot;&'quot;#quot;(quot;)&*quot;!quot;#'+quot;#quot;,quot;)&-quot;!quot;#'+ (1) scenes ([8], [12], [17], [19], [22], [23]), for which it is necessary to optimize for the total execution time (build/update time plus rendering with five additional requirements time). Frequently, to improve the build time, only axis-aligned bounding $quot;quot;%quot;quot;&quot;quot;%quot;quot;&quot;!.%quot; (2) boxes (AABB) of triangles are used even for kd-trees ([8], [17]). For $quot;quot;%quot;quot;'(quot;quot;quot;$quot;quot;%quot;quot;)(quot;quot;quot;'*)quot;quot;%quot;quot;+ Alexander Reshetov (3) this reason, it is important to analyze performance or ray-triangle The left part of the system (1) defines a ray with the origin !quot; and the intersection tests in the situation when intersection could not be direction %; the right part ! a point inside the triangle with vertices &'/quot;&*/quot; eliminated by testing a ray against the AABB of a triangle. This andquot;&-. The unknown variables are: $ ! distance to the intersection point approach was chosen by Kensler and Shirley [10], in which different 3D quot;#$%& '()& #*+,-& $#./.0, and (/quot; , ! Barycentric coordinates of the point versions of intersection algorithms were analyzed and the best one was inside the triangle. It is required that the found intersection point be found via genetic optimization of the fitness function. closer '$& '()& #*+,-&origin than the previously found one (2) and within Even when a ray does intersect the AABB, the probability of it '()&'#.*0/1),-&2$304*#.)-&(3). intersecting the triangle is only about 20%. It is important to swiftly One way to solve the system (1) is to use 5#*%)#,-&rule, which allows reject the remaining 80%. Amongst the approaches analyzed in the finding numerical values directly for all three variables. This will result literature [4] are: use of interval arithmetic; testing the intersection of a in the implementations similar to Möller-Trumbore [13]. On the other frustum containing rays (typically represented as 4 corner rays) with a hand, the system (1) also could be solved using Gaussian elimination. It triangle [5]; and culling of the AABB of the individual triangle against is computationally efficient only if the distance $ is calculated first the frustum. In these approaches per-packet data structures are computed (which, of course, could also be computed directly by using the well- before the traversal and then used to eliminate unnecessary tests. The known expression for the distance to a plane along a ray). By culling techniques could also be used for the individual rays as well, as substituting the found value of $ in any two of the remaining equations, first proposed by Snyder and Barr in 1987 paper [18], in which a ray was ( and , variables could be found. This substitution geometrically first tested against the box formed by the intersection of the $2:);',-& corresponds to projecting the triangle and the intersection point to the bounding box and a visited cell. IEEE/EG Symposium plane defined by the coordinates of the two chosen equations and leads Few years ago, ray-tracing researchers were mostly interested in to algorithms comparable with one given by Badouel [1]. The S.S.E. rendering static scenes for which persistent data structures were created, implementation for groups of four rays, proposed by Wald [21], is based optimized for all possible camera positions. Eventually, focus was shifted on the equivalent 2D projection. Example of the S.S.E. based 3D to dynamic scenes, for which data structures are created (or updated) approach (formulated in terms of Plücker coordinates) could be found in every frame. There are also initial studies in how to create kd-trees lazily, 6)0'(.0,-& 7(898 [2]. For vector implementations, conditions (2-3) are deeply subdividing only those areas of space that are visited during usually converted to masks used to selectively update the $, (, and , rendering of a particular frame [9]. In a sense, we pursue this trend to the values. It makes sense to use packets bigger than the intrinsic SIMD extreme, when transient data structures are created for every packet and on width of the targeted architecture as it amortizes per-triangle every visited leaf node. This allows very tight structures, optimized for a computations and saves bandwidth. It also allows to make decisions given packet and a cell. Surprisingly, these fleeting structures could be summarily for the whole packet without processing individual rays, for created and handled very efficiently, building on top of the clipping example using frustum or interval arithmetic ([5], [7], [11], [22], [23]). algorithms, characteristic for the modern packet traversal techniques. There is no one universal way of solving the system (1) which will be suitable for all situations and architectures. In particular, CPU implementations could use conditions (2-3) for exiting from the test quot;*+,-.) ,.quot;/,0'quot;&1&quot;(2quot;3%45-03quot;.16%+7 Interactive Ray Tracing '!!,$&,5+&quot;+6777879+:.#$quot;3(%#+quot;)+6)&,0'!&(;,+<'.+=0'!()1+>??@+ 8-9:&quot;7 ;D+=A,+ &0')3(,)&+ H0%3&%#+ (3+!0,'&,5+,;,0.+ &(#,+ '+0'.+ $'!I,&+ ;(3(&3+ '+ A&&$B88CCCD%)(2%*#D5,80&?@8<=?@DA&#*+ *,'H+ )quot;5,+ J3quot;*(585'3A,5+ *(),3+ !quot;00,3$quot;)5+ &quot;+ '!&(;,8()'!&(;,+ 0'.3KD+ L)*.+ E*#/+9,0#')./+:,$&,#F,0+G?2G>/+>??@7 '!&(;,+ 0'.3+ '0,+ %3,5+ &quot;+ !quot;#$%&,+ &A,+ H0%3&%#D+ M,H&B+ 1,),0'*+ 0'.3D+ <(1A&B+ $0(#'0.+0'.3+J!quot;##quot;)+quot;0(1()KD+ 2007 SBR07
  • 3. Demo SBR07
  • 5. 1/10 90 % Spatial Data Culling Rate Struture Size SBR07
  • 8. It’s extremely fast! SBR07
  • 9. Algorithm SBR07
  • 10. What VC do? (VC = Vertex Culling) SBR07
  • 11. Every time a packet enters a leaf node, do following SBR07
  • 12. Create packet Precompute coeff for each triangles triangle-packet Cull if failed triangle-ray intersect SBR07
  • 13. Create packet Precompute coeff for each triangles triangle-packet Cull if failed triangle-ray intersect SBR07
  • 14. Create packet from active rays Active ray Inactive ray SBR07
  • 15. Create packet Precompute coeff for each triangles triangle-packet Cull if failed triangle-ray intersect SBR07
  • 17. Axis aligned SBR07
  • 19. v o (v - o) . n < 0 n = outside SBR07
  • 20. v any of dot(v, n) < 0 = outside of the frustum SBR07
  • 21. all of dot(v, n) >= 0 = inside of the frustum v SBR07
  • 22. v0 all of (v .o) < 0 v1 = triangle is outside of the plane v2 o n SBR07
  • 23. v0 Finally, we got v1 any of (v . n) > 0 = frustum and triangle (possibly) intersects v2 SBR07
  • 24. 1 vertex, 4 plane inside/ outside test computation can be reduced to d = [Vz, -Vy, -Vz, Vy] + [Vx, Vx, Vx, Vxx] q1 + q0 SBR07
  • 25. d = [Vz, -Vy, -Vz, Vy] + [Vx, Vx, Vx, Vxx] q1 + q0 q1, q0 = SIMD variable. Pre-computable when packet enters leaf node. SBR07
  • 27. 1 SIMD mul + 2 SIMD add SBR07
  • 29. This is the core of VC SBR07
  • 30. Create packet Precompute coeff for each triangles triangle-packet Cull if failed triangle-ray intersect SBR07
  • 31. Just apply efficient vertex-frustum culling for each triangle vertex SBR07
  • 32. Culling cost per triangle per packet = 3(vtx) * 3(mul,add) + compare SBR07
  • 33. Create packet Precompute coeff for each triangles triangle-packet Cull if failed triangle-ray intersect SBR07
  • 34. SBR07
  • 35. First do corner rays and triangle test SBR07
  • 36. Such a case can not be culled by VC. SBR07
  • 37. SBR07
  • 38. all(u) < 0 or all(v) < 0 or v u all(u+v) > 1 = All rays in the packet does not intersect triangle.SBR07
  • 39. If all failed, do ray-triangle test for each rays in the packet. SBR07
  • 40. Thats all! SBR07
  • 42. 1/10 90 % Spatial Data Culling Rate Struture Size SBR07
  • 43. less traversal much traversal much VCs less VCs SBR07
  • 44. Right has 12x more nodes than left, but right is just 30% faster than left. SBR07
  • 45. spatial data structure Coarse Fine Sequential access Random access more isects less isects less traversals more traversals less memory much memory BVH BIH kd-tree SBR07
  • 46. spatial data structure Coarse Fine Sequential access Random access more isects less isects less traversals more traversals less memory much memory VC SBR07
  • 47. The strongest acceleration structure on the earth? SBR07
  • 49. Here is my implementation of VC + BVH SBR07
  • 50. C++, for VC C, for gcc My impl. Original SBR07
  • 51. Performance & Profiling SBR07
  • 53. Compile flag -O3 -ffast-math -mfpmath=sse -msse2 SBR07
  • 54. 4.7M rays/sec @ 2.16 GHz Core2 Mac SBR07
  • 55. = 460 Cycles/ray Hmm... SBR07
  • 56. Statistics nPackets : 8192 nTraversedLeafs(per packet) : 43066 (5.257080) nTraversedInnerNodes(per packet) : 109922 (13.418213) nPacketBBoxTests(per packet) : 228036 (27.836426) nBVHAnyHits : 131968 (57.871564 %) nBVHAllMisses : 61462 (26.952762 %) nBVHLastResorts : 34606 (15.175674 %) nBVHLastResortRayBBoxTests : 322288 (per last resort tests) : 9.313067 nTestedTriangles(per packet) : 314650 (38.409424) nCulledByBeams(rate) : 223618 (71.068807 %) nCulledByUV(rate) : 18536 (5.890990 %) nActuallyTestedTrianglesPerRay : 2.060669 SBR07
  • 57. Core2 Mac SBR07
  • 60. Non SIMDized codes... SBR07
  • 61. Yeah, I have to optimize BVH code before optimizing VC :-) SBR07
  • 62. Implementation period SBR07
  • 63. Vertex packet Culling BVH 6 Days 2 Days SBR07
  • 64. Very easy to implement! MLRTA requires a half year in my case. SBR07
  • 65. TODO SBR07
  • 67. Try to use VC technique for incoherent rays SBR07
  • 68. Here is my VC + BVH implementation http://lucille.svn.sourceforge.net/svnroot/lucille/ angelina/eleonore/ SBR07
  • 69. ? SBR07
  • 70. Thank you! SBR07