Seminar by Fabio Marton, 4-10-2012
 


Visualizing very large models in real time has become a common problem. Large models are now widespread in film, video games, CAD design, medical imaging, seismic analysis, terrain data, and so on, and visualizing them is problematic. This seminar presents the techniques that can currently overcome these limitations and make real-time visualization of large 3D models possible.


    • Massive Model Rendering. Fabio Marton, CRS4 Visual Computing. www.crs4.it/vic/
    • F. Marton – CRS4/Visual Computing, October 2012. Goal: interactive inspection of massive models on PC platforms. Massive datasets rendered on a commodity PC.
    • Application domains / data sources
      – Many important application domains
        • Local terrain models: 2.5D, flat, dense regular sampling
        • Planetary terrain models: 2.5D, spherical, dense regular sampling
        • Laser scanned models: 3D, moderately simple topology, low depth complexity, dense
        • CAD models: 3D, complex topology, high depth complexity, structured, "ugly" mesh
        • Natural objects / simulation results: 3D, complex topology, high depth complexity, unstructured, high-frequency details
      – Today's models exceed
        • O(10^8–10^10) samples
        • O(10^9–10^11) bytes
      – Varying
        • Dimensionality
        • Topology
        • Sampling distribution
    • The (minimal) challenge: real-time rendering of massive static models
      – Explore very large models at interactive rates
        • Update screen at "interactive rates" as viewpoint changes
      – Pipeline: storage (giga/terabytes) → I/O → projection + visibility + shading, driven by view parameters → I/O → screen (megapixels per frame at 10/100 fps)
      – Limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…)
    • A real-time data filtering problem!
      – Models of unbounded complexity on limited computers
        • Need for output-sensitive techniques (O(N), not O(K))
          – We assume less data on screen (N) than in model (K → ∞)
        • Need for memory-efficient techniques (maximize cache hits!)
        • Need for parallel techniques (maximize CPU/GPU core usage)
      – Pipeline: storage, O(K = unbounded) bytes (triangles, points, …) → I/O → small working set → projection + visibility + shading, driven by view parameters, at 10-100 Hz → screen, O(N = 1M-100M) pixels
      – Limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…)
    • Output-sensitive techniques
      – At preprocessing time: build multiresolution (MR) hierarchy, from coarse to fine
        • Data prefiltering!
        • Visibility + simplification
        • Not output sensitive
      – At run-time: selective view-dependent refinement from out-of-core data
        • Must be output sensitive
        • Access to prefiltered data under real-time constraints
        • Visibility + LOD
      – Diagram labels: front; nodes marked occluded / out-of-view, inaccurate, accurate
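The run-time side of this slide (refine only where the view needs it, stop on invisible or accurate-enough nodes) can be illustrated with a small traversal. This is a minimal sketch, not the authors' implementation: it assumes each hierarchy node stores a bounding sphere and a model-space error, and the names `Node`, `projected_error`, `select_front`, and `pixels_per_radian` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    center: tuple          # bounding-sphere center (x, y, z), assumed per node
    radius: float          # bounding-sphere radius
    error: float           # model-space error of the prefiltered data
    children: list = field(default_factory=list)


def projected_error(node, eye, pixels_per_radian):
    """Conservative screen-space error: model error seen from the closest
    point of the bounding sphere (small-angle approximation)."""
    d = math.dist(node.center, eye) - node.radius
    if d <= 0.0:
        return math.inf    # viewer inside the sphere: always refine
    return node.error / d * pixels_per_radian


def select_front(root, eye, pixels_per_radian, tolerance_px,
                 visible=lambda n: True):
    """Return the nodes to render: visible nodes that are accurate enough,
    or leaves. Work is proportional to the visited nodes, not the model."""
    front, stack = [], [root]
    while stack:
        node = stack.pop()
        if not visible(node):
            continue       # occluded / out-of-view: prune the whole subtree
        too_coarse = projected_error(node, eye, pixels_per_radian) > tolerance_px
        if node.children and too_coarse:
            stack.extend(node.children)
        else:
            front.append(node)
    return front
```

The `visible` callback stands in for frustum and occlusion tests; using the same hierarchy for both visibility and detail selection is what lets a single traversal prune and refine at once.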
    • Our contributions: GPU-friendly output-sensitive techniques
      – Chunk-based multiresolution structures
        • Combine space partitioning + level of detail
        • Same structure used for visibility and detail culling
      – Seamless combination of chunks
        • Dependencies ensure consistency at the level of chunks
      – Complex rendering primitives
        • GPU programming features
        • Curvilinear patches, view-dependent voxels, …
      – Chunk-based external memory management
        • Compression/decompression, block transfers, caching
      – Pipeline: partitioning and simplification (off-line) → multiresolution structure (data + dependency) → network / bus → cache → adaptive rendering on GPU (on-line)
    • Our contributions: GPU-friendly output-sensitive techniques (publications)
      – *-BDAM: local and global terrain models. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). EG 2003, IEEE Viz 2003, EG 2005
      – Adaptive TetraPuzzles: dense meshes. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). SIGGRAPH 2004
      – Layered Point Clouds: dense clouds. Gobbetti/Marton (CRS4). SPBG 2004 / Computers & Graphics 2004
      – Far Voxels: general. Gobbetti/Marton (CRS4). SIGGRAPH 2005
      – Blockmaps: hybrid volumetric city model. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Di Benedetto/Scopigno (CNR). EG 2007
      – MOVR: volumetric models. Gobbetti/Marton/Iglesias Guitian (CRS4). CGI 2008
      – Chunked Multi-Triangulations. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). IEEE Viz 2005
      – View-dependent Volumetric Model: in progress
      – Slide annotations: RASTERIZATION / RAYCASTING; MESH-BASED FRAMEWORK / MESH-LESS FRAMEWORK; Specialize / Generalize
    • Real-time adaptive meshes
      – The problem: efficiently create view-dependent meshes
      – Constraints:
        • must approximate the original surface with controlled screen-space error
        • must preserve continuity (conforming meshes)
        • must handle meshes of varying topology
        • must be efficiently rendered
    • Chunked Multi Triangulations: The Multi Triangulation Framework
      – Theoretical basis
        • MT multiresolution framework (Puppo 1996)
      – Our contribution
        • GPU-friendly implementation based on surface chunks with boundary constraints
        • Optimized implicit specializations (TetraPuzzles/V-Partitions)
        • Parallel out-of-core preprocessing and out-of-core run-time
      – Pipeline: partitioning and simplification (off-line) → multiresolution structure (data + dependency) → network / bus → cache → adaptive rendering on GPU (on-line)
      – Reference: Cignoni, Ganovelli, Gobbetti, Marton, Ponchio, and Scopigno. Batched Multi Triangulation. In Proc. IEEE Visualization, pages 207-214, October 2005.
    • Chunked Multi Triangulations: The Multi Triangulation Framework
      – Consider a sequence of local modifications over a given description D
        • Each modification replaces a portion of the domain with a different conforming portion (simplified)
        • f_1: the floor
        • g_1: the new fragment
      – D′ = (D \ f) ∪ g
      – D_{i+1} = D_i ⊕ g_{i+1}
    • Chunked Multi Triangulations: The Multi Triangulation Framework
      – Dependencies between modifications can be arranged in a DAG
        • By adding a sink to the DAG, we can associate each fragment with an arc leaving a node
    • Chunked Multi Triangulations: MT Cuts
      – A cut of the DAG defines a new representation
        • Just paste all the fragments above the cut
        • Collect all the fragment floors of the cut arcs and you get a new conforming mesh
      – D* = D_0 ⊕ g_1 ⊕ g_4 = f_{0∞} ∪ f_{02} ∪ f_{03} ∪ f_{13} ∪ f_{1∞} ∪ f_{4∞}
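The cut extraction on this slide can be made concrete with a tiny DAG. The sketch below is illustrative only: the arc list is a hypothetical DAG chosen so that applying modifications 0, 1 and 4 yields exactly the six fragments in the slide's formula (the figure's actual DAG is not preserved in the transcript), and `SINK`, `ARCS`, `is_consistent`, and `cut_fragments` are made-up names. An arc (i, j) stands for the fragment created by modification i and replaced by modification j.

```python
SINK = "inf"  # artificial sink node, written as the ∞ index on the slide

# Hypothetical DAG: arc (i, j) carries the fragment f_ij.
ARCS = [
    (0, 1), (0, 2), (0, 3), (0, SINK),
    (1, 3), (1, 4), (1, SINK),
    (2, SINK), (3, SINK), (4, SINK),
]


def is_consistent(applied, arcs=ARCS):
    """A set of modifications is valid only if every modification it
    contains has all of its predecessors applied as well."""
    return all(i in applied for i, j in arcs if j in applied)


def cut_fragments(applied, arcs=ARCS):
    """Fragments of the mesh defined by the cut: arcs leaving an applied
    node and entering a node that has not been applied (or the sink)."""
    if not is_consistent(applied, arcs):
        raise ValueError("applied set violates DAG dependencies")
    return {(i, j) for i, j in arcs if i in applied and j not in applied}


# D* = D_0 ⊕ g_1 ⊕ g_4
fragments = cut_fragments({0, 1, 4})
```

The consistency check is what keeps the resulting mesh conforming: a modification can only be pasted once everything it depends on is already in place.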
    • Chunked Multi Triangulations: GPU Friendly MT
      – Chunked MT assumes fragments are triangle patches with proper boundary constraints
        • DAG << original mesh (patches composed of thousands of triangles)
        • Structure memory + traversal overhead amortized over thousands of triangles
        • Per-patch optimizations
    • Chunked Multi Triangulations: GPU Friendly MT
      – Chunked MT assumes regions provide good hierarchical space partitioning
        • Compact
          – Close-to-spherical
        • Used for computing fast projected error upper bounds
        • Used for visibility queries
    • Chunked Multi Triangulations: GPU Friendly MT
      – Construction
        • Start with hires triangle soup
        • Partition model using a hierarchical space partitioning scheme
        • Construct non-leaf cells by bottom-up recombination and simplification of lower-level cells
        • Assign model-space errors to cells
      – Rendering (on-line: cache → adaptive rendering on GPU)
        • Refine conformal hierarchy, render selected precomputed cells
        • Project errors to screen
        • Dual queue
    • Chunked Multi Triangulations: DAG problems
      – Not all MTs are good MTs!
        • The topology of dependencies may lower the adaptivity of the multiresolution structure
          – Cascading dependencies are BAD!!!
        • The geometry of DAG regions may cause problems in view-dependent rendering
          – Compact regions
      – Proposed solutions:
        • SIGGRAPH 2004: efficient constrained technique (TetraPuzzles)
        • IEEE Viz 2005: general construction technique (V-Partition)
        • … see also QVDR, IEEE Viz 2004 and other related work…
    • Adaptive TetraPuzzles: Overview
      – Construction
        • Start with hires triangle soup
        • Partition model using a conformal hierarchy of tetrahedra
        • Construct non-leaf cells by bottom-up recombination and simplification of lower-level cells
      – Rendering (view-dependent mesh refinement)
        • Refine conformal hierarchy, render selected precomputed cells
    • Adaptive TetraPuzzles: Results
      – Michelangelo's St. Matthew
      – Source: Digital Michelangelo Project
      – Data: 374M triangles
      – Intel Xeon 2.4GHz 1GB, GeForce FX 5800U AGP8X
    • Advantages of mesh-based multiresolution models
      – First GPU-bound methods for very large meshes
        • Adaptive conforming meshes
          – Reduced overdraw
        • Extensive optimization
          – Stripification, cache coherence, compression, …
        • State-of-the-art performance
          – GPU bound, >4Mtri/frame at >30 fps on modern GPUs
      – Extremely high quality for large dense models with "well behaved" surface
    • Limitations of mesh-based multiresolution models
      – Visibility and multiresolution solved as separate problems
        • Error measured on boundary surfaces
        • LOD construction based on local surface coarsening/simplification operations
        • LOD construction unaware of visibility (view-independent approximations)
      – Hard to apply to models with high detail, complex topology, and high depth complexity!
    • Overcoming limitations of local mesh refinement techniques
      – Tight integration of visibility and LOD construction
        • Multi-scale modeling of appearance rather than geometry
        • Volume-based rather than surface-based
    • Far Voxels: Handling Huge Complex 3D models
      – General purpose technique that targets many model kinds
      – Underlying ideas
        • Multi-scale modeling of appearance rather than geometry
        • Volume-based rather than surface-based
        • Tight integration of visibility and LOD construction
        • GPU accelerated (programmability + batching)
    • Far Voxels: The Far Voxel Concept
      – Assumption: opaque surfaces, non-participating medium
      – Goal is to represent the appearance of complex far geometry
        • Near geometry can be represented at full resolution
      – Idea is to discretize a model into many small volumes located in the neighborhood of surfaces
        • Approximates how a small subvolume of the model reflects the incoming light
      – Result: view-dependent voxel
    • Far Voxels: The Far Voxel Concept
      – A far voxel returns color attenuation given
        • View direction
        • Light direction
      – Rendered using a customized vertex shader executed on the GPU
        • Shader = f(view direction, light direction)
    • Far Voxels: Construction overview
    • Far Voxels: Construction overview, inner nodes
      – Sample a model subvolume to build a grid of far voxels
      – Voxels are far
        • Project to worst case θ_max
        • Viewed not closer than d_min
      – Raycasting samples the original model and identifies visible voxels
      – Figure: section of the 3D grid of far voxels, with d_min and θ_max marked
    • Far Voxels: Construction overview, object-space occlusion
      – Environment occlusion
      – Cull interior part of the grid of far voxels
      – Culls 40% of the high depth complexity Boeing 777 model
        • Worst case θ_max = 0.5 deg (~10 pixel tolerance for a 1024x1024 viewport using 50 deg FOV)
      – Minimizes artifacts due to leaking of occluded parts of different colors
      – Figure: section of the 3D grid of far voxels, with d_min and θ_max marked and culled interior voxels crossed out
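The "~10 pixel" figure quoted for θ_max = 0.5 deg can be checked with a few lines of arithmetic. This is a sketch under two assumed conventions that the slide does not spell out: a uniform pixels-per-degree estimate, and a pinhole projection evaluated at the image center. The function names are illustrative.

```python
import math


def pixels_linear(theta_deg, viewport_px, fov_deg):
    """Uniform estimate: the viewport spreads fov_deg over viewport_px pixels."""
    return theta_deg * viewport_px / fov_deg


def pixels_at_center(theta_deg, viewport_px, fov_deg):
    """Pinhole estimate at the image center, where pixel density is lowest."""
    focal_px = (viewport_px / 2) / math.tan(math.radians(fov_deg / 2))
    return focal_px * math.tan(math.radians(theta_deg))


linear = pixels_linear(0.5, 1024, 50)      # 0.5 * 1024 / 50 = 10.24 pixels
center = pixels_at_center(0.5, 1024, 50)   # roughly 9.6 pixels
```

Both estimates land close to 10 pixels, consistent with the tolerance stated on the slide.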
    • Far Voxels: Construction overview, far voxel
      – Consider voxel subvolume
      – Samples gathered from unoccluded directions
        • Sample: (BRDF, n) = f(view direction)
      – Compress shading information by fitting samples to a compact analytical representation
    • Far Voxels: Construction overview, far voxel shaders
      – Build all the K different far voxel representations
        • K = flat, smooth, …
        • Principal component analysis
        • Flat proxy: 2 components
        • Smooth proxy: 6 components
        • Others…
      – Evaluate each representation error
        • Compare real values (samples) with the voxel approximations from the sample direction
        • Err(k) = …
      – Choose approximation with lowest error
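The selection step (evaluate every candidate representation against the gathered samples, keep the one with the lowest error) can be sketched as follows. The error formula is not preserved in the transcript, so this sketch assumes a mean squared error purely for illustration; the candidate models, the scalar sample values, and the names `mean_squared_error` and `choose_representation` are hypothetical and do not reproduce the actual flat or smooth proxies.

```python
def mean_squared_error(approx, samples):
    """Assumed error metric: mean squared difference between the candidate's
    prediction and the sampled value, over all sample directions."""
    return sum((approx(d) - v) ** 2 for d, v in samples) / len(samples)


def choose_representation(candidates, samples):
    """Evaluate every candidate and return the name of the best one,
    together with all errors for inspection."""
    errors = {name: mean_squared_error(f, samples)
              for name, f in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors


# Toy data: (view direction, observed shading value) pairs.
samples = [((0.0, 0.0, 1.0), 1.0),
           ((1.0, 0.0, 0.0), 0.0),
           ((0.6, 0.0, 0.8), 0.8)]

# Two made-up candidates: a constant and a cosine lobe around +z.
candidates = {
    "constant": lambda d: 0.6,
    "lobe": lambda d: max(0.0, d[2]),
}

best, errors = choose_representation(candidates, samples)
```

Because the choice is made per voxel at preprocessing time, the renderer only needs to know which shader each voxel was assigned, which fits the per-type batching described for rendering.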
    • Far Voxels: Rendering
      – Hierarchical traversal with coherent culling
        • Stop when out-of-view, occluded (GPU feedback), or accurate enough
      – Leaf node: triangle rendering
        • Draw the precomputed triangle strip
      – Inner node: voxel rendering
        • For each far voxel type
          – Enable its shader
          – Draw all its view-dependent primitives using glDrawArrays
        • Splat voxels as antialiased point primitives
        • Limits
          – Does not consider primitive opacity
          – Rendering quality similar to one-pass point splat methods (no sorting/blending)
      – Figure labels: triangles, far voxels
    • Far Voxels: Results
      – Tested on extremely complex heterogeneous surface models
        • St. Matthew: 373M triangles, 14.5 GB
        • Boeing 777: 350M triangles, 13.7 GB
        • Richtmyer-Meshkov isosurface: 472M triangles, 18.4 GB
        • All at once: 1.2G triangles, 46.6 GB
      – Tested in a number of situations
        • Single processor / cluster construction
        • Workstation viewing, large scale display
    • Far Voxels: Results
      – 1-16 Athlon 2200+ CPUs, 3 x 70GB ATA 133 disks (IDE+NFS)
      – 1-20K triangles/sec
        • Scales well, limited by slow disk I/O for large meshes
        • Slow!! (but similar to recent adaptive tessellation methods)
      – Avg. triangles per leaf: 5K
      – Avg. voxels per inner node: 2.5K
      – Times (16 CPUs): 5h18m, 6h51m, 8h06m
      – Sizes: 10.6 GB, 14.9 GB, 16.1 GB, 41.6 GB
    • Far Voxels: Results
      – Xeon 2.4GHz, 70GB SCSI 320 disk, GeForce 6800GT AGP 8x
      – Window size: from video resolution to stereo projector display
        • St. Matthew, Boeing, Isosurface: 640 x 480
        • All at once: 640 x 480 and stereo 2 x 1024 x 768
      – Pixel tolerance: [Target 1 | Actual ~0.9 | Max ~10]
      – Resident set size limited to ~200 MB
      – Reported figures, in slide order: 45 fps / 51 MPrim/s, 44 fps / 42 MPrim/s, 34 fps / 41 MPrim/s, 20 fps / 40 MPrim/s; a further 20 fps / 42 MPrim/s appears beside the 640 x 480 and 2 x 1024 x 768 labels
    • Far Voxels: Conclusions
      – General purpose technique that targets many model kinds
        • Seamless integration of multiresolution, occlusion culling, and out-of-core data management
        • High performance
        • Scalability
      – Main limitations
        • Slow preprocessing
        • Non-photorealistic rendering quality
      – Intel Xeon 2.4GHz 1GB, GeForce 6800GT AGP8X
    • Our contributions: GPU-friendly output-sensitive techniques (publication overview repeated, with the volumetric entry now reading MOVR – COVRA: volumetric models. Gobbetti/Marton/Iglesias Guitian (CRS4). CGI 2008)
    • Recent Advances in Massive Volume Visualization. www.crs4.it/vic/
    • Introduction: Goal
      – Visualization of massive scalar volumes without size limitations
        • A single-pass raycasting technique working out-of-core on GPU parallel architectures
      – Compress data to facilitate data streaming and 4D visualizations
        • Novel compression architecture and novel compression methods
    • Introduction: Teaser
      – Compression-domain adaptive volume rendering based on sparse representation of voxel blocks. NVIDIA GTX 560
    • MOVR: a single-pass raycasting technique working out-of-core on GPU parallel architectures (The Visual Computer 2008 & 2010)
    • Massive Volumes Visualization: Volume rendering problem
      – Figure labels: pixel, accumulation, early ray termination, empty space skipping, order independent, order dependent
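The terms on this slide (accumulation, early ray termination, empty space skipping, order dependence) come together in front-to-back compositing along a ray. The following is a minimal CPU sketch of that standard scheme, not the GPU raycaster presented in the talk: samples are assumed to be already classified into scalar (color, opacity) pairs, empty space is skipped per sample rather than per region, and `composite_front_to_back` and `opacity_cutoff` are illustrative names.

```python
def composite_front_to_back(samples, opacity_cutoff=0.99):
    """Accumulate (color, opacity) samples ordered from the eye outward.

    Returns the accumulated color, the accumulated opacity, and the number
    of samples visited before the ray was terminated.
    """
    color, alpha, visited = 0.0, 0.0, 0
    for c, a in samples:
        visited += 1
        if a == 0.0:
            continue                      # empty sample: nothing to add
        weight = (1.0 - alpha) * a        # remaining transparency times opacity
        color += weight * c
        alpha += weight
        if alpha >= opacity_cutoff:
            break                         # early ray termination
    return color, alpha, visited


# An empty sample, then a fully opaque one: the third is never visited.
result = composite_front_to_back([(1.0, 0.0), (0.5, 1.0), (0.9, 0.5)])
```

The front-to-back order is what makes early termination possible: once the accumulated opacity saturates, everything further along the ray is hidden and can be skipped.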
    • Massive Volumes Visualization: Volume rendering problem
      – Current interactive solutions are based on GPU architectures
        • Massive parallelism
        • Huge memory bandwidth
      – E.g. GeForce GTX 580
        • Has 192.4 GB/s of bandwidth
        • Has 1581.1 GFLOPs [hardwareinsight.com]
    • Massive Volumes Visualization: Related work, moderately sized volumes
      – Current high quality solutions based on GPUs implementing…
        • Slice-based methods
        • Ray casting techniques
      – The full volume must fit in GPU memory
      – Figure credits: [Li et al., 2003], [Krüger et al., 2003]
    • Massive Volumes Visualization: Contribution to the state of the art
      – Multiresolution out-of-core volume renderer
        • Preprocessing
          – Build multiresolution octree of volume bricks
        • Rendering
          – Adaptive CPU loading of the data from a local/remote repository cooperates with a separate render thread fully executed on the GPU
          – Stackless traversal of an adaptive working set
          – Exploitation of the visibility feedback
      – E. Gobbetti, F. Marton, and J. A. Iglesias Guitián. A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets. The Visual Computer, 24, 2008.
      – J. A. Iglesias Guitián, E. Gobbetti, and F. Marton. View-dependent exploration of massive volumetric models on large-scale light field displays. The Visual Computer, 26, 2010.
    • Massive Volumes Visualization: Contribution to the state of the art
      – Use CPU for…
        • Creation & loading
        • Octree refinement
        • Encoding the current cut using a spatial index
      – Use GPU for…
        • Stackless octree traversal
          – Using neighbour pointers
        • Rendering
          – Flexible ray traversal / compositing strategies
          – Improved visibility feedback
      – Figures: architecture overview; neighbour pointer navigation
    • Massive Volumes Visualization: Method overview
      – Flowchart, creation and maintenance (CPU): offline preprocessing, storage (volume database), adaptive loader, octree refinement, current working set
      – Flowchart, rendering (GPU): per octree node, "has enough accuracy?" (yes / no), prepare to render, render, visibility feedback
    • Massive Volumes Visualization: Visibility feedback
      – Working set reduction
        • Opaque: 1731 -> 1035 bricks
        • Transparent: 1984 -> 1789 bricks
      – Rendered on window size 1024x576
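To put the brick counts above in relative terms, the reduction can be computed directly from the four numbers on the slide. The helper name `reduction_percent` is illustrative.

```python
def reduction_percent(before, after):
    """Relative shrinkage of the working set, in percent."""
    return 100.0 * (before - after) / before


opaque = reduction_percent(1731, 1035)        # opaque transfer function
transparent = reduction_percent(1984, 1789)   # transparent transfer function
print(f"opaque: {opaque:.1f}%  transparent: {transparent:.1f}%")
# prints "opaque: 40.2%  transparent: 9.8%"
```

The gap between the two cases is expected: with opaque data most bricks behind the first surface are never reached by any ray, while a transparent transfer function leaves far fewer bricks fully hidden.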
    • Massive Volumes Visualization: Results (2/2)
      – Interactive exploration of a 16-bit 2GB CT volume on a consumer NVidia 8800 GTS graphics board with 640MB (2008)
    • Compression-Domain Volume Rendering
      – 60 time steps of the 432^3 supernova dataset
    • Volume Compression: Introduction
      – Limited bandwidth and memory =>
        • LOD (MOVR)
        • Compression
      – Compression is fully exploited if data is maintained in compressed form through the entire pipeline
        • Compression-domain volume renderers + deferred filtering
      – Highly asymmetric encoding/decoding schemes
        • We can afford slow offline compression and precomputation
        • Fast real-time data decoding, interpolation and shading
        • Spatially independent random access to data
    • State of the art
      – CPU decompression
        • Does not limit bandwidth and memory
        • [Ning & Hesselink, 92] and many others...
        • [Gobbetti et al. 08, Iglesias et al. 10]
      – Hardware based
        • E.g. S3TC [Brown], NVidia VTC [Craighead]
        • Full random access
        • Limited compression
      – GPU decompression
        • Full working set GPU decompression
          – Tensor Approximation [Suter et al. 2010]
          – Does not limit memory
          – Limits bandwidth
        • Partial working set
          – Limits both memory and bandwidth
    • Tensor Approximation (CRS4 & UZH 2010) • Multiresolution • Brick based • Extracts dominant data features • Real-time GPU reconstruction – Full working set • Bandwidth optimization • Memory consumption • Reference: S. Suter, J. A. Iglesias Guitián, F. Marton, M. Agus, A. Elsener, C. Zollikofer, M. Gopi, E. Gobbetti, and R. Pajarola. Interactive Multiscale Tensor Reconstruction for Multiresolution Volume Visualization. IEEE Transactions on Visualization and Computer Graphics, vol. 17, pp. 2135–2143, 2011
    • Volume Compression: Contribution to the state of the art • COVRA: Compression-domain Output-sensitive Volume Rendering Architecture – Novel architecture with parameterized cache behaviour – Supports and extends state-of-the-art compression methods • ☺ Efficient multisampling (HQ shading) • ☺ No perspective limitations • ☺ Fully adaptive multiresolution approach • ☺ Multipass working set decompression • ☺ High compression ratios and signal quality • Reference: J. A. Iglesias Guitián, F. Marton and E. Gobbetti. COVRA: a Compression Domain Output-Sensitive Volume Rendering Architecture based on sparse representation of voxel blocks. In: Proceedings of EuroVis 2012
    • Volume Compression: COVRA overview • Main concepts: – The preprocessor builds a multiresolution octree of compressed nodes – Data travels in compressed format until the last stage – Fully adaptive rendering – Highly integrated decompression / rendering supporting high-quality filtering and shading
    • Run-time COVRA: Subtree management • Three rendering steps: 1. Adaptive refinement of the multiresolution octree on the CPU 2. Partitioning of the octree into a set of subtrees • Uses the size of the GPU decompressed cache as the constraint • Front-to-back order decided in real time during the octree traversal 3. Subtree decompression, raycasting and compositing • Decompress to a temporary buffer or to the available GPU cache • Raycast the decompressed octree nodes • Composite with the previous results (framebuffer)
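Steps 2 and 3 can be illustrated with a deliberately simplified sketch. It assumes the refined octree has already been flattened into a front-to-back list of node sizes, groups consecutive nodes into batches that fit a cache budget, and accumulates samples with the standard front-to-back "over" operator. The function names, the flat list in place of real subtrees, and the single grey-level colour channel are all assumptions made for illustration, not the COVRA implementation.

```python
from typing import List, Sequence, Tuple


def partition_front_to_back(node_sizes: Sequence[int],
                            cache_capacity: int) -> List[List[int]]:
    """Group node indices (assumed sorted front to back) into consecutive
    batches whose total size fits in the decompression cache."""
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(node_sizes):
        if size > cache_capacity:
            raise ValueError("a single node does not fit in the cache")
        if used + size > cache_capacity:
            # Cache budget exhausted: close this batch and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        batches.append(current)
    return batches


def composite_over(accumulated: Tuple[float, float],
                   sample: Tuple[float, float]) -> Tuple[float, float]:
    """Front-to-back 'over' operator on (colour, alpha) pairs."""
    acc_colour, acc_alpha = accumulated
    colour, alpha = sample
    acc_colour += (1.0 - acc_alpha) * alpha * colour
    acc_alpha += (1.0 - acc_alpha) * alpha
    return acc_colour, acc_alpha


# Four nodes of size 3 with a cache that holds 6 units.
batches = partition_front_to_back([3, 3, 3, 3], cache_capacity=6)

# Results of later batches are composited behind what is already accumulated.
result = (0.0, 0.0)
for sample in [(1.0, 0.5), (0.0, 0.5)]:
    result = composite_over(result, sample)
```

Because the batches are visited front to back, each pass only needs the colour and opacity accumulated so far, which is why a decompressed batch can be discarded once it has been raycast.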
    • Volume Compression: Sparse coding of volume blocks • Each multiresolution octree node is decomposed into blocks • Each block, a small cube a few voxels on a side, is compressed • Each block is represented by a sparse linear combination of a few dictionary elements – Data-specific representation – Compression is achieved by storing indices and magnitudes [Figure labels: single octree node containing overlapping information; compressed block]
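Decoding a block stored as indices and magnitudes amounts to a small weighted sum of dictionary columns. The sketch below assumes a dictionary stored as a matrix with one atom per column; the block size (4x4x4), the number of atoms (256) and the function name `decode_block` are hypothetical choices for the example.

```python
import numpy as np


def decode_block(dictionary: np.ndarray, indices, magnitudes, block_shape):
    """Rebuild a voxel block from the indices and magnitudes of its atoms."""
    indices = np.asarray(indices, dtype=int)
    magnitudes = np.asarray(magnitudes, dtype=float)
    # Weighted sum of the selected dictionary columns.
    flat = dictionary[:, indices] @ magnitudes
    return flat.reshape(block_shape)


rng = np.random.default_rng(0)
dictionary = rng.normal(size=(64, 256))  # 64 voxels per block, 256 atoms
block = decode_block(dictionary, [3, 17], [2.0, -1.0], (4, 4, 4))
```

In this example a block of 64 voxels is described by two index/magnitude pairs, which is where the compression comes from.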
    • Volume Compression: Sparse coding of volume blocks • Generalization of vector quantization – Combines vectors instead of choosing single ones – Overcomes limitations due to dictionary sizes • Generalization of data-specific bases – The dictionary is an overcomplete basis – Sparse projection • Encoding in two steps – Training: find a data-specific dictionary – Sparse coding: find the best representation of each block as a linear combination of dictionary elements under a sparsity constraint • We employ ORMP via Cholesky decomposition
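The sparse coding step can be illustrated with plain orthogonal matching pursuit (OMP). Note that this is a swapped-in, simpler relative of the ORMP-via-Cholesky solver named on the slide: it re-solves a small least-squares problem at every iteration instead of updating a Cholesky factor. The sketch assumes dictionary atoms (columns) of unit norm; the function name `omp` is illustrative.

```python
import numpy as np


def omp(dictionary: np.ndarray, signal: np.ndarray, sparsity: int):
    """Greedy sparse coding: pick at most `sparsity` atoms for `signal`."""
    residual = signal.astype(float)
    support = []
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        atom = int(np.argmax(correlations))
        if correlations[atom] <= 1e-12:
            break  # residual is already (numerically) zero
        support.append(atom)
        # Re-fit all selected atoms jointly by least squares.
        coeffs, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ coeffs
    return np.array(support, dtype=int), coeffs


# Toy example with an orthonormal dictionary, so recovery is exact.
dictionary = np.eye(8)
signal = 3.0 * dictionary[:, 2] - 2.0 * dictionary[:, 5]
support, coeffs = omp(dictionary, signal, sparsity=2)
```

The returned `support` and `coeffs` are exactly the "indices and magnitudes" stored per block.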
    • Volume Compression: Finding an optimal dictionary • We employ the K-SVD algorithm for dictionary training – An algorithm for designing overcomplete dictionaries for sparse representations [Aharon et al. 06] • Running K-SVD directly on massive volumes would be infeasible, therefore … – … we apply the concept of coreset [Agarwal et al. 05] to smartly subsample and reweight the original training set [Feldman & Langberg 11, Feigin et al. 11]
    • Volume Compression: Dictionary learning (K-SVD) • K-SVD can be seen as a generalization of K-Means • Basic steps: – Sparse coding of the signals in X, producing Γ – Update of the dictionary atoms given the sparse representations • Optimize one atom at a time, keeping the rest fixed • The size of the error matrix E is proportional to the number of training signals – As in [Rubinstein et al. 08], we replace the SVD computation with a simpler numerical approximation
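The atom update step can be sketched as follows. For clarity this version uses an exact SVD of the restricted error matrix, whereas the slide states that the authors replace the SVD with a simpler numerical approximation; the function name `update_atom` and the in-place update convention are assumptions of this example.

```python
import numpy as np


def update_atom(D: np.ndarray, Gamma: np.ndarray, X: np.ndarray, k: int) -> None:
    """Update atom k of D and its coefficients in Gamma, in place.

    D: (n, K) dictionary, Gamma: (K, N) float sparse codes, X: (n, N) signals.
    """
    users = np.nonzero(Gamma[k, :])[0]  # signals that currently use atom k
    if users.size == 0:
        return
    Gamma[k, users] = 0.0
    # Error of those signals when atom k is removed from their representation.
    E = X[:, users] - D @ Gamma[:, users]
    # The best rank-1 approximation of E gives the new atom and coefficients.
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]
    Gamma[k, users] = s[0] * Vt[0, :]
```

Restricting E to the signals that already use the atom keeps the sparsity pattern of Γ unchanged, and it is this matrix whose size grows with the number of training signals.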
    • Volume Compression: Coreset construction • Calculations on massive input volumes are still infeasible, but we can … – … reduce the amount of data used for training – … use importance sampling • We associate an importance with each of the original blocks, namely the standard deviation of the entries in the block – We pick C elements, each with probability proportional to its importance – More important blocks should end up in our coreset
    • Volume Compression: Coreset construction • Non-uniform sampling introduces a severe bias – Scale each selected block by a weight determined by its associated sampling probability – Applying K-SVD to the scaled coefficients will converge to a dictionary associated with the original problem • Coreset scalability [figure]
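The sampling and reweighting described on the two coreset slides can be sketched as below. The exact weight formula did not survive in this transcript, so the code assumes the common importance-sampling choice of 1/sqrt(C·p) for a block picked with probability p, which makes the weighted squared error an unbiased estimate of the full one. Sampling with replacement and the function name `build_coreset` are likewise assumptions.

```python
import numpy as np


def build_coreset(blocks: np.ndarray, coreset_size: int, rng: np.random.Generator):
    """Importance-sample rows of `blocks` and rescale them to undo the bias.

    blocks: (num_blocks, voxels_per_block) array of flattened blocks.
    """
    importance = blocks.std(axis=1)  # per-block standard deviation
    total = importance.sum()
    if total == 0.0:
        # Degenerate case (all blocks constant): fall back to uniform sampling.
        probabilities = np.full(len(blocks), 1.0 / len(blocks))
    else:
        probabilities = importance / total
    picked = rng.choice(len(blocks), size=coreset_size, replace=True, p=probabilities)
    # ASSUMED weight: inverse square root of the expected pick count.
    weights = 1.0 / np.sqrt(coreset_size * probabilities[picked])
    return blocks[picked] * weights[:, None], picked, weights
```

Constant blocks have zero standard deviation and are therefore never selected, which matches the intent that more important blocks should end up in the coreset.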
    • Volume Compression: COVRA results • PSNR vs. bits per sample [plot]
    • Volume Compression: COVRA results • Comparison against state-of-the-art GPU-based decompression methods
    • Volume Compression: COVRA results [figure]
    • Volume Compression: COVRA results • Gradient mapped to RGB color
    • Volume Compression: COVRA video • Compression-domain adaptive volume rendering based on sparse representation of voxel blocks, on an NVIDIA GTX 560 (2012)
    • Summary and Conclusions: Summary • Improved the scalability of state-of-the-art volume rendering techniques – MOVR: a novel single-pass GPU ray casting framework that supports flexible ray traversal and incorporates visibility feedback for interactive exploration of large volumes without size limitations • Improved compression and streaming of large and time-varying volumes – COVRA: a novel compression-domain architecture supporting state-of-the-art compression methods, random access to compressed data and HQ shading – A novel compression method for massive volumes based on sparse coding (K-SVD) and coreset training sets
    • Our contributions: GPU-friendly output-sensitive techniques • *-BDAM – Local and global terrain models: Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR), EG 2003, IEEE Viz 2003, EG 2005 • Adaptive Tetrapuzzles – Dense meshes: Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR), SIGGRAPH 2004 • Chunked Multi-Triangulations: Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR), IEEE Viz 2005 • Layered Point Clouds – Dense clouds: Gobbetti/Marton (CRS4), SPBG 2004 / Computers & Graphics 2004 • Far Voxels – General: Gobbetti/Marton (CRS4), SIGGRAPH 2005 • View-dep. Blockmaps – Hybrid volumetric city model: Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Di Benedetto/Scopigno (CNR), EG 2007 • MOVR – COVRA – Volumetric models: Gobbetti/Marton/Iglesias Guitian (CRS4), CGI 2008 [Diagram labels relating the techniques: Specialize; Generalize; Volumetric Model; In progress]
    • A real-time data filtering problem! • Models of unbounded complexity on limited computers – Need for output-sensitive techniques (O(N), not O(K)) • We assume less data on screen (N) than in the model (K → ∞) – Need for memory-efficient techniques (maximize cache hits!) – Need for parallel techniques (maximize CPU/GPU core usage) [Diagram: storage, O(K = unbounded) bytes (triangles, points, …) -> limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…), I/O driven by the view parameters -> small working set -> projection + visibility + shading -> screen, O(N = 1M-100M) pixels at 10-100 Hz]
    • THANK YOU! Questions and Answers • Next session: Technologies for improving real-time immersive exploration of massive (volumetric) models, presented by Marco Agus