Presentation accompanying the paper "Realizing OpenGL: Two Implementations of One Architecture" at the 1997 SIGGRAPH/Eurographics Workshop on Graphics Hardware, held in Los Angeles, California, on August 3-4, 1997.
Realizing OpenGL
Realizing OpenGL: Two Implementations of One Architecture
Mark J. Kilgard
Silicon Graphics, Inc.

OpenGL as Visualization Architecture
o Standard dataflow for hardware amenable graphics & imaging operations.
o OpenGL can serve as a clarifying architecture for the design of interactive graphics hardware.
o Key technical question: Is OpenGL an adaptable architecture?

SiliconGraphics Computer Systems

Approach
o Briefly examine OpenGL’s architectural philosophy and approach.
o How do real implementations manifest OpenGL and realize its adaptability?
  + SGI’s InfiniteReality, a "hardware intensive" manifestation of OpenGL.
  + SGI’s O2 digital media workstation.
o Same architecture, different goals.

OpenGL as Graphics & Imaging Dataflow
[Diagram: OpenGL dataflow. Geometry path: Unpack Vertexes, Vertex Operations, Point, Line, and Polygon Rasterization. Imaging path: Unpack/Pack Pixels, Pixel Operations, Image Rasterization. Shared blocks: Command Tokens, Display Lists, Texture Memory, Fragment Operations, Framebuffer, Feedback/Selection.]

OpenGL Dataflow Observations
o Parallel "geometry" and "imaging" paths.
o Common fragment processing for geometry & imaging paths.
o Texture memory "unites" geometry and imaging paths.
o Display lists cache write-once, execute-many OpenGL command sequences.
o Readback of images possible.

Architectural Observations about OpenGL
o Window system & OS independent.
o Client / server style interface.
o Limited functionality; limits itself to 3D graphics & imaging operations.
o Data format rich.
o Configurable, not programmable.
o Orthogonal functionality.

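The "write-once, execute-many" behaviour of display lists can be illustrated with a small sketch. This is a toy model in Python, not OpenGL itself: the DisplayList class and its record/end/call methods are hypothetical names standing in for the roles that glNewList, glEndList, and glCallList play in the real API.

```python
class DisplayList:
    """Toy stand-in for a display list: commands are recorded once, then replayed."""

    def __init__(self):
        self._commands = []
        self._ended = False

    def record(self, func, *args):
        # Recording is only legal while the list is still open (write-once).
        if self._ended:
            raise RuntimeError("display list is closed; it cannot be modified")
        self._commands.append((func, args))

    def end(self):
        # Close the list; from here on it can only be executed.
        self._ended = True

    def call(self):
        # Replay every recorded command in order (execute-many).
        if not self._ended:
            raise RuntimeError("display list must be ended before it is called")
        for func, args in self._commands:
            func(*args)


def demo():
    """Record three commands once and replay them three times."""
    log = []
    dl = DisplayList()
    dl.record(log.append, "begin triangle")
    dl.record(log.append, "vertex data")
    dl.record(log.append, "end")
    dl.end()
    for _ in range(3):
        dl.call()
    return log
```

Because the command sequence is fixed once the list is closed, an implementation is free to keep it wherever replay is cheapest, which is what makes hardware-resident display lists possible.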
OpenGL Extensibility
o Standard & vendor specific extensions.
o Examples:
  3D texture mapping (volume rendering)
  convolution & histogram (imaging)
  improved texture filtering
  multisample antialiasing
  scores more implemented
o Mechanism for innovation.

Example of Extensibility: Imaging
[Diagram: OpenGL’s extended pixel transfer dataflow for RGBA and Index pixels. Each stage is listed with the call that configures it; the added stages are switched with glEnable/glDisable.]
Original OpenGL 1.0 pixel path:
  glPixelTransfer             Scale and Bias (RGBA), Shift and Add (Index)
  glPixelMap                  Pixel Mapping
Extended pixel transfer stages:
  glColorTableEXT             Color Table
  glConvolutionParameterEXT   Convolution
  glPixelTransfer             Post Convolution Scale & Bias
  glColorTableEXT             Post Convolution Color Table
  glColorMatrixSGI            Color Matrix
  glPixelTransfer             Post Color Matrix Scale & Bias
  glColorTableEXT             Post Color Matrix Color Table
  glHistogramEXT, glResetHistogramEXT   Histogram
  glMinmaxEXT, glResetMinmaxEXT         MinMax

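A software sketch may help make the stage order of the extended pixel path concrete. The Python below is an illustration under stated assumptions, not the OpenGL implementation: the function names are hypothetical, pixels are plain RGBA tuples in the range 0 to 1, the explicit clamp is a simplification, and the color table and convolution stages are left out to keep the sketch short.

```python
def scale_bias(pixels, scale, bias):
    """Per-channel scale and bias, the role glPixelTransfer plays for RGBA pixels."""
    return [tuple(c * s + b for c, s, b in zip(p, scale, bias)) for p in pixels]


def clamp01(pixels):
    """Clamp every channel to [0, 1] (a simplification of where clamping occurs)."""
    return [tuple(min(1.0, max(0.0, c)) for c in p) for p in pixels]


def color_matrix(pixels, m):
    """Multiply each RGBA pixel, as a column vector, by the 4x4 matrix m."""
    return [tuple(sum(m[r][k] * p[k] for k in range(4)) for r in range(4))
            for p in pixels]


def histogram(pixels, bins):
    """Count, per channel, how many pixels fall into each of `bins` equal bins."""
    counts = [[0] * bins for _ in range(4)]
    for p in pixels:
        for ch, c in enumerate(p):
            counts[ch][min(bins - 1, int(c * bins))] += 1
    return counts


def minmax(pixels):
    """Per-channel minimum and maximum over all pixels."""
    lo = tuple(min(p[ch] for p in pixels) for ch in range(4))
    hi = tuple(max(p[ch] for p in pixels) for ch in range(4))
    return lo, hi


def transfer(pixels):
    """Run the stages in dataflow order: scale/bias, color matrix, then statistics."""
    swap_red_blue = [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
    out = clamp01(scale_bias(pixels, (2, 2, 2, 1), (0, 0, 0, 0)))
    out = color_matrix(out, swap_red_blue)
    return out, histogram(out, 4), minmax(out)
```

Each function is a fixed operation steered only by its parameters, which mirrors the "configurable, not programmable" character of the pixel path: an extension adds a stage and its parameters rather than arbitrary per-pixel code.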
InfiniteReality
o SGI’s current fastest high-end graphics. First high-end SGI full OpenGL design.
o Designed for "real-time" graphics, i.e. 30 or 60 Hertz constant frame rates.
o Designed application domains:
  Visual simulation
  High-end Imaging (Electronic Light Table)
  Volume rendering & scientific visualization
  Large scale Computer Aided Design
  Film & video editing

InfiniteReality’s OpenGL Approach
o Totally conformant.
o Hardware intensive.
o OpenGL dataflow manifested in hardware.
o World’s best OpenGL performance.
o Geometry & imaging reuse transform hardware.
o Use extensions to innovate beyond core.

InfiniteReality System Overview
o 12 custom ASICs.
o 3, 4, or 6 boards; 3 board types (same as RealityEngine predecessor).

InfiniteReality full ASIC-level view
[Diagram: full ASIC-level view of one pipe, drawn as three parts: Transform Manager board (2 or 4 TEs); a single Raster Manager board set (1, 2, or 4 RMs per pipe); Display Generator board (option for 8 channels). Labels recoverable from the figure include Host Interface Processor, Transform Engine Distributor, Transform Engines, Back End FIFO, Transform/Rasterization Crossbar, Readback BUS, Pixel Generator, Texture Managers, Texture Memory, Image Memory Processors, Image Memory, Video BUS, Video Requestors, Video Output Channels #0 and #1, RAMDACs, and Analog Video Output.]

InfiniteReality: Transform Manager Board
[Diagram: Transform Manager board. Host Interface Processor feeds the Transform Engine Distributor, which feeds four Transform Engines; their output passes through the Back End FIFO to the Transform/Rasterization Crossbar. A Readback BUS returns data to the board.]

Transform Manager Board
o Hardware resident OpenGL display lists.
o Input & output DMA for large transfers (pull & push models).
o Front-end ASIC has RISC core & 16 MB.
o 2 or 4 Transform Engines (3 SIMD cores, 540 Mflops) running MIMD.
o Feed into 4 MB FIFO to crossbar.

Transform Manager Board Functionality
o OpenGL command stream decode.
o OpenGL state management.
o 3D primitive transformation, clipping, lighting, evaluators, texture coordinate generation (in Transform Engine).
o Convolution, histogram, LUTs, color matrix, etc. for pixel path (in Transform Engine).

InfiniteReality: Raster Manager Board
[Diagram: Raster Manager board. Input arrives from the Transform/Rasterization Crossbar; a Readback BUS leads back to the Transform Manager and a Video BUS leads to the Display Generator. Labels recoverable from the figure include Pixel Generator, Texture Managers, Texture Memory, Image Memory Processors, and Image Memory.]

Raster Manager Board
o 64 or 16 MB texture; 80 MB frame buffer.
o Rasterize primitives into quad fragments.
o 8 Texture Managers fetch texels.
o 4 Tex Filters filter & distribute fragments.
o 80 (20 chips) Image Memory Processors.
o All hardwired logic; no microcode.

Raster Manager Board Functionality
o Fully OpenGL compliant rasterization.
o Very sophisticated texture filtering.
o No penalty for texturing.
o 4 & 8 sample antialiasing.
o Services video requests from Display Generator.

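The idea behind 4 and 8 sample antialiasing can be sketched in a few lines. This is an illustrative model only: it assumes each pixel stores one color per sample, that a fragment carries a coverage bitmask, and that the displayed color is a plain average of the samples. The slides do not say how InfiniteReality resolves its samples, and the function names are hypothetical.

```python
def apply_fragment(samples, coverage_mask, color):
    """Write `color` into every sample whose bit is set in `coverage_mask`."""
    return [color if (coverage_mask >> i) & 1 else s
            for i, s in enumerate(samples)]


def resolve(samples):
    """Average all samples of a pixel into one displayed color (box filter)."""
    n = len(samples)
    channels = len(samples[0])
    return tuple(sum(s[ch] for s in samples) / n for ch in range(channels))


def edge_pixel(sample_count, covered):
    """Black pixel partly covered by a white fragment touching `covered` samples."""
    black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
    samples = [black] * sample_count
    mask = (1 << covered) - 1  # lowest `covered` bits set
    return resolve(apply_fragment(samples, mask, white))
```

With 4 samples a half-covered edge pixel resolves to mid grey; with 8 samples the same scheme can represent eight coverage steps instead of four, which is why more samples give smoother edges.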
InfiniteReality: Display Generator Board
[Diagram: Display Generator board. Video Requestors pull pixels over the Video BUS from the Raster Manager boards and feed Video Output Channels #0 and #1, each with a RAMDAC producing RGB analog video output signals. A PAL/NTSC Encoder produces Composite or S-Video.]

InfiniteReality manifests OpenGL dataflow
[Diagram: the OpenGL dataflow figure (Command Tokens, Display Lists, Unpack Vertexes, Vertex Operations, Point, Line, and Polygon Rasterization, Unpack Pixels, Pixel Operations, Image Rasterization, Texture Memory, Fragment Operations, Framebuffer, Feedback/Selection) annotated with the InfiniteReality hardware that implements each stage. Recoverable hardware labels include Host Interface Processor, Transform Engine Distributor, Transform Engines & Back End FIFO, Pixel Generator, and Image Memory Processors.]

~60 OpenGL Extensions Supported
o Multisampling.
o 3D texture mapping.
o Texture sharpen & detail.
o Accelerated offscreen rendering.
o Convolution.
o Histogram & Min/max.
o Color matrix.
o Fog extensions.
o Calligraphic light points.
o Shadow mapping.
o New blending modes. Many more . . .

Feature: Virtual Textures
o Want to render HUGE textures.
o Example: Fly anywhere on earth with satellite data supplied textured-terrain.
o Clipmapping allows this.
o Detailed region of interest of huge texture.
o Supports roaming of the region of interest under software control; app pages terrain data from disk.

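Why clipmapping makes huge textures practical can be shown with a little arithmetic. The sketch below assumes square power-of-two textures and a clipmap that keeps, for every mipmap level larger than a chosen clip size, only a clip-size square around the region of interest. The function names and the example sizes are illustrative assumptions, not figures from the slides.

```python
def mipmap_texels(size):
    """Texels in a complete mipmap pyramid for a size x size base level."""
    total = 0
    while size >= 1:
        total += size * size
        size //= 2
    return total


def clipmap_texels(size, clip_size):
    """Texels resident when every level is clipped to at most clip_size x clip_size."""
    total = 0
    while size >= 1:
        side = min(size, clip_size)  # large levels keep only the region of interest
        total += side * side
        size //= 2
    return total


def resident_fraction(size, clip_size):
    """Fraction of the full pyramid that must be resident with clipping."""
    return clipmap_texels(size, clip_size) / mipmap_texels(size)
```

For a hypothetical 32768 x 32768 terrain texture with a 2048 clip size, resident_fraction(32768, 2048) is under 2 percent of the full pyramid. The remaining texels stay on disk until the region of interest roams toward them, which is the paging the application performs under software control.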
Real-time OpenGL Clarified System Goals
o High bandwidth to graphics; DMA & PIO.
o Required support for coarse grain app->cull->draw partitioning on shared memory multiprocessor. IRIS Performer.
o Multiple pipes per system (up to 8!). RealityMonster configuration uses compositing to boost performance.
o 64-bit RAID file system support.
o IRIX React: true real-time OS features.

Same OpenGL Architecture, Different Goals
o SGI’s O2 is an OpenGL/multimedia desktop Unix workstation.
o $5000 to $10,000 system price.
o Very tightly integrated; all vital Internet, graphics, & multimedia capabilities built into system.

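The coarse grain app->cull->draw partitioning mentioned for the real-time system can be pictured as a three-stage software pipeline, with each stage on its own processor working on a different frame. The sketch below only models the schedule; it is not how IRIS Performer is implemented, and the names STAGES and schedule are hypothetical.

```python
STAGES = ("app", "cull", "draw")


def schedule(num_frames):
    """For each frame period, report which frame every stage is working on.

    A stage that has nothing to do in a period (pipeline fill or drain)
    is reported as None.
    """
    periods = []
    for t in range(num_frames + len(STAGES) - 1):
        row = {}
        for depth, stage in enumerate(STAGES):
            frame = t - depth  # each stage lags the previous one by one period
            row[stage] = frame if 0 <= frame < num_frames else None
        periods.append(row)
    return periods
```

In steady state all three stages are busy in every period, so one frame completes per period while each individual frame takes three periods from application to display. That trade of latency for throughput is what lets a shared memory multiprocessor hold a constant frame rate.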
O2 System-level Architecture
[Diagram: O2 system-level architecture. An R5000 or R10000 CPU (with Secondary Cache) and the Imaging & Compression Engine connect over SysAD (64@100MHz) to the Memory & Rendering Engine. The Display Engine and the IO Controller (PCI) attach to the Memory & Rendering Engine over 64@133MHz and 32@133MHz links. The Memory & Rendering Engine reaches SDRAM Main Memory through SDMUX: 144@133MHz into SDMUX, 288@66MHz from SDMUX to SDRAM.]

Innovation in Tight Integration
o SIMD media processor built in.
o Build OpenGL rasterization engine directly into memory controller.
o Unified Memory Architecture. No distinct pools of memory; just one aggregate bus and memory pool saves cost!

Custom Media Processor
o Custom media processor with 8x/16 SIMD vector unit and R3000-like scalar unit.
o Device-dependent OpenGL and digital media libraries use processor.
o Many extended OpenGL pixel path operations accelerated by processor.
o Media processor has bitstream entropy encoder for image compression.

Memory Controller does OpenGL Rendering
o Memory controller does fully compliant OpenGL primitive iteration, texturing, & fragment operations. GTX-RD.
o Graphics memory bandwidth demands serviced by tight coupling with memory.
o Texture, stencil/depth, & color buffers all accessed out of main memory pool. (Contrast with various dedicated memory pools in InfiniteReality approach.)

O2 Memory & Rendering Engine internals
[Diagram: O2 Memory & Rendering Engine internals. Blocks: CPU/ICE Interface (SysAD 64@100MHz), Rendering Engine Host Interface, Rendering Engine with Pixel Pipeline, Display Engine Interface, IO Interface, Memory Request Unit, Memory Transfer Engine, and Memory Interface (Memory Data Bus 128@133MHz). External links of 64@133MHz and 32@133MHz; internal datapaths of 64 and 256 bits; clocks of 100MHz and 66MHz.]

Unified Memory Architecture
o One 2.1 GB/second aggregate bus.
o One pool of memory for displayed frame buffer, texture, media buffers, offscreen rendering regions, ancillary buffers, and system memory.
o Potentially, any byte of main memory can be used for any above purpose.
o Rendering into tiles; textures on tiles.

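The 2.1 GB/second figure is consistent with the "Memory Data Bus 128@133MHz" label in the diagram, as a quick calculation suggests. The sketch assumes one transfer per clock, a clock of exactly 133 million cycles per second, and decimal gigabytes; the function name is illustrative.

```python
def peak_bandwidth_bytes_per_second(width_bits, clock_hz):
    """Peak bus bandwidth, assuming one full-width transfer per clock cycle."""
    return (width_bits // 8) * clock_hz


# Memory Data Bus: 128 bits at 133 MHz (assumed to be exactly 133e6 Hz).
memory_bus = peak_bandwidth_bytes_per_second(128, 133_000_000)

# SysAD bus to the CPU: 64 bits at 100 MHz, for comparison.
sysad_bus = peak_bandwidth_bytes_per_second(64, 100_000_000)

# Express both in decimal gigabytes per second.
memory_bus_gb = memory_bus / 1e9
sysad_bus_gb = sysad_bus / 1e9
```

Under these assumptions the memory bus peaks at about 2.13 GB/second, while the CPU's own SysAD link peaks at 0.8 GB/second. The single aggregate bus therefore has headroom for rendering, display, and media traffic on top of CPU accesses.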
Frame Buffer Tiling on O2
[Diagram: a screen with an 8-bit PseudoColor X Windows background (no depth/stencil) and two OpenGL windows: one double buffered 32-bit RGBA with 24-bit Depth Buffer, one double buffered 32-bit RGBA (no depth/stencil).]
   80 tiles (10x8) for 32-bit RGBA and 8-bit PseudoColor
   40 tiles (5x5+3x4+3) for 32-bit RGBA back buffer
   25 tiles (5x5) for 24-bit depth + 8-bit stencil buffers
= 145 tiles or 9.3 MBs, versus 15.4 MBs for a dedicated double buffered 32-bit frame buffer with 24-bit depth and 8-bit stencil

Digital Media Buffers
o Rendering Engine, video input, and imaging & compression engine all manipulate single image buffer.
o Minimizes copies within system.
o Texture mapping possible directly from DM buffer contents.
o Very efficient, very flexible digital media capabilities co-exist with OpenGL.

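The tile counts on the tiling slide can be checked with a short calculation. Two values below are assumptions that do not appear on the slide: a tile size of 64 KB and a 1280x1024 display, chosen because they are consistent with the 10x8 tile grid. The variable names are illustrative.

```python
TILE_KB = 64            # assumption: one tile holds 64 KB
DISPLAY_WIDTH = 1280    # assumption: display resolution, not stated on the slide
DISPLAY_HEIGHT = 1024

# Tile counts as listed on the slide.
front_tiles = 10 * 8                   # whole screen, front buffer
back_tiles = 5 * 5 + 3 * 4 + 3         # back buffers of the two OpenGL windows
depth_stencil_tiles = 5 * 5            # depth/stencil for the one window that needs it

total_tiles = front_tiles + back_tiles + depth_stencil_tiles
tiled_kb = total_tiles * TILE_KB

# A dedicated frame buffer stores front, back, and depth/stencil for every
# screen pixel at 4 bytes each, whether or not a window uses them.
bytes_per_pixel = 4 + 4 + 4
dedicated_kb = DISPLAY_WIDTH * DISPLAY_HEIGHT * bytes_per_pixel // 1024

saving_fraction = 1 - tiled_kb / dedicated_kb
```

Under these assumptions the tiled layout needs 9280 KB against 15360 KB for the dedicated layout, which lines up with the slide's 9.3 and 15.4 figures if those were obtained by dividing kilobytes by 1000. The saving comes from allocating back and depth/stencil tiles only where a window actually needs them.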
O2 manifests OpenGL dataflow differently!
[Diagram: the OpenGL dataflow figure (Command Tokens, Display Lists, Unpack Vertexes, Vertex Operations, Point, Line, and Polygon Rasterization, Unpack/Pack Pixels, Pixel Operations, Image Rasterization, Texture Memory, Fragment Operations, Framebuffer, Feedback/Selection) annotated with the O2 hardware that implements each stage: R5000 or R10000 CPU, Media Processor, and Rendering Engine.]

OpenGL Clarified Multimedia System Goals
o Build one, fast bus.
o Build one memory pool. Graphics, system, and multimedia resources all share one pool.
o Rendering moves close to memory.
o SIMD OpenGL imaging hardware can get reused for video & image compression with small hardware bitstream encoder addition.
o Off-screen accelerated rendering & video texturing.

Conclusions
o OpenGL is an adaptable interactive graphics hardware architecture.
o Different OpenGL implementations can implement the same architecture, but address very different design goals.
o Also, OpenGL can clarify both graphics & system issues.
o OpenGL architecture marks a maturing of the interactive graphics hardware field.