Computer Vision for PS3 Games

Richard Marks
SCE US R&D

My Background
 School
 Avionics, robotics, control theory
 ARL (Aerospace Robotics Lab)
 PlayStation R&D
 Created computer vision library for PS2
 Video filters; format conversion; logical, arithmetic, and morphological operations; matching; moments
 Highly optimized using SIMD multimedia instructions
 Extensive pre-fetching, pipe balancing
 Provided library source and lots of sample code
 Worked with London Studio to make game prototypes
 Specified EyeToy hardware and wrote initial driver
 Cell experience starting 2004

Computer Vision for PS3 Games
 PS3 Introduction
 Video input to PS3
 PLAYSTATION Eye
 Released games
 PS3 Vision SDK
 Current research topics
 Head tracking
 Color tracking
 Sketch analysis

PS3 Introduction
 Hardware
 1 PPE, 7 SPE (6 available to game)
 256 MB main memory, 256 MB graphics memory
 RSX graphics chip (Nvidia)
 USB 2.0
 Considerations
 RSX, PPE are heavily utilized by games
 Main memory is precious
 SPEs are under-utilized
 SPURS tasks or jobs are encouraged

PS3 live video input
 USB 2.0
 libcamera
 part of PS3 system software
 Implements simple driver model
 Open, Start, Read, Stop, Close
 Set/Get Attribute (e.g. gain, exposure, red/blue/green gains, AGC flag, mirror flag, LED flag, etc.)
 Read copies most recent complete frame from system memory to application memory
 Supports UVC, PS Eye, EyeToy
 Cameras are asynchronous to PS3 display
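
The driver model above can be pictured as a latest-frame mailbox. The following is a minimal sketch in Python, not the libcamera API: the class `LatestFrameCamera`, its method names, and the `deliver` hook that simulates the capture side are all hypothetical. It only illustrates why an asynchronous camera pairs naturally with a Read call that returns the newest complete frame.

```python
import threading

class LatestFrameCamera:
    """Hypothetical stand-in for a camera driver; not the libcamera API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = None      # most recent complete frame (driver side)
        self._opened = False
        self._started = False

    def open(self):
        self._opened = True

    def start(self):
        if not self._opened:
            raise RuntimeError("open() must be called before start()")
        self._started = True

    def deliver(self, frame):
        """Simulated capture side: called whenever a frame completes."""
        if self._started:
            with self._lock:
                self._latest = bytes(frame)   # older frame is overwritten

    def read(self):
        """Copy the most recent complete frame into application memory."""
        if not self._started:
            raise RuntimeError("start() must be called before read()")
        with self._lock:
            return None if self._latest is None else bytearray(self._latest)

    def stop(self):
        self._started = False

    def close(self):
        self._started = False
        self._opened = False
        self._latest = None

cam = LatestFrameCamera()
cam.open()
cam.start()
cam.deliver(b"frame-1")
cam.deliver(b"frame-2")   # arrives before the application gets around to reading
frame = cam.read()        # the application sees only the newest frame
cam.stop()
cam.close()
```

Because the camera is not locked to the display, a game loop built this way never queues stale frames; it may skip or re-read a frame instead.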

PLAYSTATION Eye (beyond EyeToy)
 Uncompressed Video
 No artifacts
 Software demosaicking
 Low CPU overhead
 Increased Sensitivity
 Low-light/No-light operation
 Shortened exposure times
 Lower visual noise
 Faster Frame Rate
 Quicker response
 Smaller tracking search regions
 More temporal information
 Dual Fixed Field of View
 Standard-angle, proven by EyeToy
 Wide-angle for full-body apps
 No focus adjustment needed
 Higher Resolution
 Lower pixelation effects
 Improved definition
 Better statistical behavior
 Improved voice input
 Unencumbered speech recognition
 Audio chat in noisy environment
 Echo location tracking

PS Eye Specification
 Cost similar to EyeToy
 56/75 degree dual-FOV lens
 <1% distortion, fixed focus (0.5m to 10m)
 ¼” CMOS sensor
 6 micron pixels
 640x480 Bayer at 60 frames/sec, 320x240 at 120 frames/sec
 640x480 YUV422 at 30 frames/sec, 320x240 at 60 frames/sec
 10-bit dynamic range, or 8-bit with gamma curve
 rolling shutter (no frame buffer)
 USB2/compression chip
 bulk transfer (low CPU overhead)
 optional JPEG compression
 Omni-directional 4-microphone linear array
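
The video modes listed above can be compared by raw payload rate. This is a back-of-the-envelope sketch under stated assumptions: Bayer data is taken in its 8-bit form (1 byte per pixel), YUV422 is taken as 2 bytes per pixel, and USB protocol overhead is ignored. The function and dictionary names are illustrative only.

```python
def bytes_per_second(width, height, bytes_per_pixel, fps):
    """Raw video payload rate, ignoring USB protocol overhead."""
    return width * height * bytes_per_pixel * fps

BAYER_8BIT = 1   # assumption: one 8-bit sample per pixel
YUV422 = 2       # 8-bit Y per pixel plus U and V shared by each pixel pair

modes = {
    "640x480 Bayer @ 60": bytes_per_second(640, 480, BAYER_8BIT, 60),
    "320x240 Bayer @ 120": bytes_per_second(320, 240, BAYER_8BIT, 120),
    "640x480 YUV422 @ 30": bytes_per_second(640, 480, YUV422, 30),
    "320x240 YUV422 @ 60": bytes_per_second(320, 240, YUV422, 60),
}

for name, rate in modes.items():
    print(f"{name}: {rate / 1e6:.2f} MB/s")
```

Under these assumptions both 640x480 modes come to about 18.4 MB/s and both 320x240 modes to about 9.2 MB/s, which is consistent with the YUV422 modes running at half the Bayer frame rate.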

Typical PS Eye video processing
1. Read 640x480 Bayer pattern, 60 frames/sec
2. Stuck pixel removal (calibrated or uncalibrated)
3. Bayer to RGB (demosaicking)
4. RGB to RGB’ (color correction)
5. RGB’ to YUV (color space conversion)
 (steps 2-5 use <2ms on 1 SPE)
-or-
 Read 640x480 YUV422, 30 frames/sec
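
Steps 3 to 5 above can be sketched in a few lines. This is a deliberately naive illustration, not the SPE implementation: it assumes an RGGB Bayer layout, collapses each 2x2 cell into one RGB pixel (halving resolution, which a production demosaic would not do), uses an identity matrix as a placeholder color correction, and uses full-range BT.601 coefficients for the YUV conversion. Stuck pixel removal is omitted.

```python
def demosaic_rggb_2x2(bayer):
    """Collapse each 2x2 RGGB cell into one RGB pixel (half resolution)."""
    rgb = []
    for y in range(0, len(bayer) - 1, 2):
        row = []
        for x in range(0, len(bayer[y]) - 1, 2):
            r = bayer[y][x]
            g = (bayer[y][x + 1] + bayer[y + 1][x]) / 2.0   # average both greens
            b = bayer[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb

def color_correct(pixel, m):
    """RGB to RGB': apply a 3x3 matrix and clamp to the 0..255 range."""
    r, g, b = pixel
    return tuple(min(255.0, max(0.0, m[i][0] * r + m[i][1] * g + m[i][2] * b))
                 for i in range(3))

def rgb_to_yuv(pixel):
    """Full-range BT.601 conversion (an assumed choice of coefficients)."""
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return (y, u, v)

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # placeholder correction matrix

def process(bayer, ccm=IDENTITY):
    return [[rgb_to_yuv(color_correct(p, ccm)) for p in row]
            for row in demosaic_rggb_2x2(bayer)]

bayer = [[128] * 4 for _ in range(4)]   # flat mid-grey test frame
yuv = process(bayer)                    # 2x2 result, each pixel near (128, 128, 128)
```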

Eye of Judgment
 Augmented reality card game (video)
 Uses modified version of Sony Cybercode
 Green markers provide card detection and homography transform
 2-D barcode provides card identification
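
A homography can be recovered from four point correspondences, such as four detected marker positions. The sketch below is a generic textbook formulation and not the modified Cybercode method: it fixes the bottom-right matrix entry to 1 and solves the resulting 8x8 linear system by Gauss-Jordan elimination. The marker coordinates are invented for illustration.

```python
def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography(src, dst):
    """3x3 matrix mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_h(h, x, y):
    """Project a point through the homography (with perspective divide)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Hypothetical data: unit-square card corners and where they appear in the image.
card = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [(100, 100), (300, 120), (280, 320), (90, 300)]
H = homography(card, image)
center_in_image = apply_h(H, 0.5, 0.5)   # where the card center lands on screen
```

With such a matrix, graphics drawn in card coordinates can be placed over the card as it appears in the camera image.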

London Studio titles
 EyeCreate (free)
 Movie editing with effects
 Aquatopia, Operation Creature Feature, Mesmerize, Tori-Emaki, Towers of Topoq
 Motion detection
 Feature tracking (similar to Lucas-Kanade, video)
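
The slides do not say how these titles detect motion. One common approach, shown here purely as an assumed example, is frame differencing: threshold the absolute difference between consecutive frames and measure how much of a screen region changed. The threshold, the region, and the function names are all illustrative.

```python
def motion_mask(prev, curr, threshold=20):
    """1 where a pixel's brightness changed by more than threshold, else 0."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def motion_fraction(mask, region):
    """Fraction of changed pixels inside region = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    cells = [mask[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(cells) / len(cells)

prev = [[10] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
# Something bright moves into the top-right corner of the image.
curr[0][2] = curr[0][3] = curr[1][2] = curr[1][3] = 200

mask = motion_mask(prev, curr)
button = (2, 0, 4, 2)   # hypothetical on-screen button region
triggered = motion_fraction(mask, button) > 0.5
```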

Other PS3 apps that use a camera
 PS3 built-in A/V chat
 Burnout Paradise
 Snapshots at significant game moments (similar to roller-coaster photos)
 SingStar
 Make your own music video

PS3 Vision SDK
[Architecture diagram. Block labels: Game, libcamera, sys_audio, libvision (shown twice), vision tasks, vision jobs, libspurs. Processor labels: PPE, SPE.]

PS3 Vision SDK (SPE)
 SPE libvision function library
 Video filters; format conversion; logical, arithmetic, and morphological operations; matching; moments
 Completely internal to SPE (no DMA, etc)
 cellSlice (8, 16, RGBA, Float)
 Note: some tasks cannot break images into slices (e.g. rotate)
 Optimized (SIMD, unrolled, pipelined), no assembly (only intrinsics)
 SPE tasks
 Typically call SPE libvision functions for cellSlices
 Handle DMA, double-buffering
 VisionTaskInfo, cellImage
 SPE jobs
 Call SPE libvision functions
 DMA set up in advance for slices, handled by SPURS
 Some things are hard to break up into jobs
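
Slice-based processing can be illustrated with a 3x3 erosion, one of the morphological operations mentioned above. This is a sketch in Python rather than SPE code, and it does not use the cellSlice API: plain lists stand in for local store, and a list copy stands in for the DMA transfer. The point it shows is that a neighborhood operation needs one extra halo row on each side of a slice to reproduce the whole-image result.

```python
def slices(height, slice_rows):
    """Yield (top, bottom) row ranges that cover the image."""
    for top in range(0, height, slice_rows):
        yield top, min(top + slice_rows, height)

def erode3_rows(img, top, bottom):
    """3x3 minimum filter for rows [top, bottom); image borders clamp."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(top, bottom):
        row = []
        for x in range(w):
            row.append(min(img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                           for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
        out.append(row)
    return out

def erode3_sliced(img, slice_rows):
    out = []
    for top, bottom in slices(len(img), slice_rows):
        # Stand-in for the DMA step: fetch the slice plus one halo row per side.
        lo, hi = max(top - 1, 0), min(bottom + 1, len(img))
        local = [row[:] for row in img[lo:hi]]
        out.extend(erode3_rows(local, top - lo, bottom - lo))
    return out

img = [[(7 * x + 13 * y) % 31 for x in range(5)] for y in range(6)]
whole = erode3_rows(img, 0, len(img))
sliced = erode3_sliced(img, 2)   # same result, two rows at a time
```

An operation such as rotation does not fit this pattern, because an output slice can depend on input rows anywhere in the image.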

PS3 Vision SDK (PPE)
 PPE libvision
 PPE versions of all SPE libvision functions using #include <spu2vmx.h>
 cellImage
 cellImage8, cellImage16, cellImageRGBA, cellImageFloat
 VisionTaskSet
 Task synchronization
 Run-time reloadable SPE code
 Easy SPE task execution using similar calling model to PPE

PS3 Vision SDK
 Sample code
 Mostly use 1 SPE
 RSX shader renders YUV directly and boosts saturation
 Easy to switch between SPE and PPE execution
 Not yet released to 3rd-party game developers
 Maintenance concerns
 Unhappy with design

Head Tracking
 Face detector from Sony Corporation
 Detects multiple faces at various scales, rotations
 Robust, but not smooth
 Face tracking
 Based on template (patch) matching
 Correlation of images filtered with signum of Laplacian of Gaussian (sLoG)
 16x16 patches for 320x240 video
 Multiple templates allow different rotations and scales
 Uses motion patch tracking if template match fails
 Face detector directs face tracker search area

Color tracking
 Segmentation every frame
 Not really tracking at all
 Chrominance smoothing, thresholding
 Repeated windowed centroid/area calculation to reject noise
 X, Y from centroid, Z from area (for sphere)
 Second moments provide principal axes
 In bad lighting, principal moment better for Z than area
 Video or demo
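
The repeated windowed centroid/area calculation and the Z-from-area idea can be sketched as below. This is one plausible reading of the slide, not the shipped algorithm: the mask is assumed to come from chrominance thresholding, the window is re-centered on each pass with a size derived from the measured area, and depth uses a pinhole model in which a sphere of radius R at distance Z projects to a disc of radius f·R/Z pixels. The parameter names, margin, focal length, and sphere radius are hypothetical.

```python
import math

def centroid_area(mask, window):
    """Centroid and pixel count of set pixels inside (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = window
    area = sx = sy = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            if mask[y][x]:
                area += 1
                sx += x
                sy += y
    return None if area == 0 else (sx / area, sy / area, area)

def windowed_centroid(mask, iterations=3, margin=1.5):
    """Shrink the window around the blob so distant noise drops out."""
    h, w = len(mask), len(mask[0])
    window = (0, 0, w, h)
    result = None
    for _ in range(iterations):
        measured = centroid_area(mask, window)
        if measured is None:
            break
        result = measured
        cx, cy, area = measured
        half = margin * math.sqrt(area / math.pi) + 1   # about one blob radius plus slack
        window = (max(int(cx - half), 0), max(int(cy - half), 0),
                  min(int(cx + half) + 1, w), min(int(cy + half) + 1, h))
    return result

def depth_from_area(area, focal_px, sphere_radius):
    """Pinhole model: area = pi * (f * R / Z)^2, solved for Z."""
    return focal_px * sphere_radius * math.sqrt(math.pi / area)

mask = [[0] * 20 for _ in range(20)]
for y in range(8, 13):
    for x in range(8, 13):
        mask[y][x] = 1          # the tracked object
mask[0][0] = 1                  # an isolated noise pixel

cx, cy, area = windowed_centroid(mask)
z = depth_from_area(area, focal_px=500.0, sphere_radius=0.02)
```

In this example the first pass is biased toward the noise pixel, and the second pass, restricted to the window, recovers the centroid and area of the object alone.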

Sketch Analysis
 Image processing
 Sketches: edge detection
 Objects: background subtraction
 Segmentation (region finding)
 Find closed contours by walking around edges
 Vectorization (regions to polygons)
 Adjustable
 Texture lifted from original image
 Machination (polygon to game object)
 Physical objects
 Game-specific objects (e.g. tank, lunar lander, etc.)
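
The slides do not name the vectorization algorithm. As a stand-in, the sketch below uses Ramer-Douglas-Peucker simplification, where a single tolerance plays the role of the "adjustable" setting: larger values give coarser polygons. The contour data is invented and mimics a hand-drawn square traced as a closed loop.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: keep only points that deviate beyond tolerance."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], a, b)
        if d > dmax:
            idx, dmax = i, d
    if dmax <= tolerance:
        return [a, b]
    left = simplify(points[:idx + 1], tolerance)
    right = simplify(points[idx:], tolerance)
    return left[:-1] + right    # avoid repeating the split point

# Wobbly square contour; first and last points coincide to close the loop.
contour = [(0, 0), (5, 0.2), (10, 0), (10.1, 5), (10, 10),
           (5, 9.9), (0, 10), (-0.1, 5), (0, 0)]
polygon = simplify(contour, tolerance=0.5)
```

The resulting vertex list is the kind of polygon that could then be handed to a physics engine or mapped onto a game-specific object.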

Sketch Analysis Key Factors
 Known camera (PS Eye)
 Known camera position (Eye of Judgment stand)
 Known surface (white paper)
 Good lighting situation (paper faces up at lights, camera looking down)
 High contrast naturally provided by user
 Shadows are problematic, but addressable
 video

Questions?
 Thank you!
 richard_marks@playstation.sony.com

Fundamental Issue: Lighting
 Unknown lighting environment limits robustness
 Variable lighting leads to variable performance
 Users often do not understand “good lighting”, or they cannot easily accomplish it
 Simple auto gain, exposure are insufficient
 Positive methods give too many false negatives

My Background
 School
 Avionics, robotics, dynamics, control theory
 Embedded systems
 ARL (Aerospace Robotics Lab)
 “Visual Sensing for Automatic Control of an Underwater Robot”
 Automatic station-keeping, mosaic creation, and navigation
 Teleos Research (acquired by Autodesk)
 Real-time optical flow and stereo
 PeopleTracker® for Canon video conferencing camera
 Semi-automatic 3d modeling from photos
