Lanman 3dim2009



Shape from Depth Discontinuities under Orthographic Projection

Douglas Lanman, Daniel Cabrini Hauagge, and Gabriel Taubin
Division of Engineering, Brown University
Providence, RI 02912 USA

2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops. 978-1-4244-4441-0/09/$25.00 ©2009 IEEE

Abstract

We present a new method for reconstructing the 3-D surface of an opaque object from the motion of its depth discontinuities, when viewed under orthographic projection as the object undergoes rigid rotation on a turntable. A novel shape completion scheme is introduced to fill in gaps in the recovered surface, which would otherwise be impossible to reconstruct from the motion of the depth discontinuities alone. To verify the basic theory, we construct a large-format orthographic multi-flash camera capable of recovering the depth discontinuities using controlled illumination. We analyze the limitations of multi-flash depth edge detection using orthographic imaging with both point sources and directional illumination. We conclude by considering future applications for the shape completion scheme and the specialized hardware introduced in this work.

1. Introduction

Piecewise smooth surfaces can be used to describe the exterior boundary of solid objects, and have found widespread use in industrial design and manufacturing. These descriptions are composed of smooth surface patches meeting along piecewise smooth boundary curves called feature lines, across which surface normals can be discontinuous. A wide variety of scanning methods, generally referred to as shape-from-X, have been proposed to recover 3-D shapes. Many active methods (including structured lighting [20], laser striping [4], and photometric stereo [28]) and passive methods (such as multi-view stereo [22]) recover points located on smooth surface patches, yet are unable to directly sample feature lines. Instead, post-processing must be used to detect feature lines from sampled point clouds [17]. Unfortunately, reconstructing feature lines in this manner is technically impossible, since the corresponding space curves are not band-limited signals.

Relatively few shape capture methods are tailored for direct detection of feature lines. One notable exception is the shape-from-silhouette algorithm introduced by Cipolla and Giblin [6]. While the visual hull [15] can be estimated from sparse viewpoints, Cipolla and Giblin describe a differential formulation, where an object is slowly rotated on a turntable and a dense set of frames is recorded. The manner in which silhouettes deform between images is used to reconstruct surface points that are tangent to the camera viewing direction and located along silhouette boundaries. More recently, Crispell et al. [7] used this algorithm to recover additional points located on general depth discontinuities, not just silhouettes, using the multi-flash imaging method proposed by Raskar et al. [19]. While locally convex points inside concavities can be estimated from these additional samples, locally concave points at the bottom of concavities cannot be recovered, leaving holes in the recovered surface.

In this paper we present a shape capture method, building on prior systems, that is able to directly detect feature lines. We use an orthographic multi-flash camera to measure the motion of depth discontinuities as an object rotates on a turntable. We describe a new method to detect and fill local concavities, exploiting imaging symmetries unique to orthographic projection. We note that our method is well-suited for industrial inspection and reverse engineering of manufactured parts containing numerous feature lines.

1.1. Contributions

The primary contributions of this paper are as follows.

• We present a new shape completion algorithm that can be used to detect and fill local concavities in the surface recovered from the visual motion of depth discontinuities viewed under orthographic projection.
• We analyze the properties of orthographic multi-flash cameras for depth edge detection, using either near-field point sources or directional illumination.
• We describe a calibration method for orthographic cameras using at least four images of a planar pattern augmented with a single point above its surface.
• We present and analyze the performance of an experimental prototype, which is the first to exploit the unique properties of orthographic multi-flash imaging to reconstruct the 3-D shape of solid surfaces.
2. Related Work

2.1. Epipolar-Plane Image Analysis

The study of image slices over time and the structures observed in such imagery is considered in the prior work on epipolar-plane images (EPI). One of the earliest studies was published by Bolles [5], in which he considers the case of linear motion for parallel cameras. In this case, a single scene point maps to a line in the EPI, with a slope corresponding to the distance of the point from the camera. Lines corresponding to points closer to the camera overlap those for points further away, allowing reconstruction without explicit feature matching [22]. This model is extended by Baker and Bolles [2] to deal with non-parallel cameras. Feldmann et al. [10] describe the properties of EPI curves for a circular motion path due to camera rotation. Their parameterized curves cannot be applied to our system, as they model texture features rather than depth discontinuities. Apostoloff and Fitzgibbon [1] detect T-junctions for general camera motion. In a closely related work, Berent and Dragotti [3] achieve scene segmentation by extracting level sets from a collection of epipolar-plane images.

2.2. Optical Shape Capture

In this work we propose a shape capture method inspired by the work of Crispell et al. [7]. In contrast to photometric stereo [28], in which lights are placed far from a camera, Raskar et al. [19] propose placing light sources close to the center of projection to estimate the set of visible depth discontinuities. Crispell et al. [7] show that such multi-flash cameras can be used to measure the visual motion of depth discontinuities as an object undergoes rigid rotation, allowing surface reconstruction using the differential method of Cipolla and Giblin [6]. Unlike these systems, in which a single perspective camera is used, we study the visual motion of depth discontinuities under orthographic projection.

2.3. Orthographic Imaging and Illumination

Orthographic projection can be achieved using telecentric lenses and has found widespread use in the machine vision community due to its lack of perspective distortion when inspecting machined parts. As demonstrated by Watanabe and Nayar [27], a telecentric lens can be fashioned from a conventional lens by placing an aperture at a specific location (e.g., at a focal point for a thin lens). Orthographic imaging conditions are also typical in remote surveillance, when the camera is sufficiently far from the scene. Similarly, the properties of orthographic illumination (e.g., directional light sources) have been applied for scene understanding [16]. In this paper, we propose a method for calibrating telecentric lenses, and describe the extension of multi-flash imaging to the orthographic case.

2.4. Curve and Surface Completion

The proposed surface reconstruction method cannot recover locally concave points in deep concavities. As a result, a shape completion method is required to fill remaining gaps. Shape completion has been extensively studied in 2-D and 3-D. Numerous 2-D curve completion schemes have been proposed [25, 14, 13, 12]; generally, two position and tangent constraints are specified. As an infinite number of curves satisfy such boundary conditions, additional constraints have been proposed to obtain a unique solution. Ullman [25] proposes a curve of minimum total curvature formed by two circular arcs, tangent at both ends, meeting in the center. Horn [12] further analyzes the curvature energy when the number of circular arcs increases, proving that the internal energy is smaller than that of an Euler spiral or a simple circle. Knuth [14] proposes a set of properties for visually appealing and scalable curves, arguing that cubic splines possess the desired properties. Kimia et al. [13] propose minimizing variation of curvature, yielding completions based on the Euler spiral. In 3-D, diffusion-based mesh inpainting [9, 26] has been proposed, where the gaps are filled inwards from the border while preserving some measure of smoothness. Pauly et al. [18] complete meshes by detecting and replicating repeated patterns. In a closely related work, Crispell et al. [7] fill gaps using the implicit surface defined by an oriented point cloud. Finally, Curless and Levoy [8] propose a volumetric method for fitting a signed distance function to a set of range images.

3. Properties of Depth Discontinuities under Orthographic Projection

The properties of apparent contours under perspective projection have been used to recover 3-D shapes from turntable sequences using silhouettes [6] and depth discontinuities [7]. In this paper we extend these results to the case of orthographic imaging. Specifically, we consider the scenario in which an object undergoes rigid rotation about a fixed axis, such that the rotation axis lies in a plane parallel to the image plane (see Figure 5). In this situation, a point $p$ located on a depth discontinuity can be represented as $p = q + \lambda v$, where $q$ is the orthographic projection of the point onto the image plane along the tangent ray $v$, and $\lambda$ is the scale factor corresponding to scene depth. Cipolla and Giblin [6] show that the 3-D surface can be recovered such that

\[
\lambda = \frac{-n^T \dot{q}}{n^T \dot{v}}, \quad \text{for } n^T \dot{v} \neq 0, \tag{1}
\]

where $n$ is the surface normal at $p$, and $\dot{q}$ and $\dot{v}$ are the derivatives with respect to the turntable angle $\theta$. In this section, we demonstrate that (1) can be used to reconstruct both visible and hidden surfaces, using unique properties of depth discontinuities under orthographic projection.
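As a rough illustration of how Eq. (1) recovers depth, the following sketch applies it to a single 2-D slice containing a circle. Everything here is an assumption made for illustration and is not taken from the paper: the object-fixed frame in which the camera appears to rotate, the image axis a(θ) = (cos θ, sin θ), the viewing direction v(θ) = (−sin θ, cos θ), the image line passing through the rotation axis, and the hypothetical helper names `edge_coordinate`, `recover_point`, and `max_radius_error`. For a circle the surface normal at the tangent point coincides with the image axis, which keeps the example short.

```python
import math

def edge_coordinate(theta, cx, cy, r):
    """Image coordinate u(theta) of one depth edge of a circle (2-D slice).

    Assumed convention: the image axis is a = (cos theta, sin theta) and the
    circle has centre (cx, cy) and radius r in the object frame.
    """
    return cx * math.cos(theta) + cy * math.sin(theta) + r

def recover_point(theta, cx, cy, r, h=1e-5):
    """Recover the contact point p = q + lambda * v using Eq. (1)."""
    a = (math.cos(theta), math.sin(theta))    # image axis; also the normal n here
    v = (-math.sin(theta), math.cos(theta))   # viewing direction (tangent ray)
    u = edge_coordinate(theta, cx, cy, r)
    # Central-difference estimate of du/dtheta, standing in for an EPI slope.
    du = (edge_coordinate(theta + h, cx, cy, r)
          - edge_coordinate(theta - h, cx, cy, r)) / (2.0 * h)
    q = (u * a[0], u * a[1])
    q_dot = (du * a[0] + u * v[0], du * a[1] + u * v[1])  # uses da/dtheta = v
    v_dot = (-a[0], -a[1])                                 # uses dv/dtheta = -a
    n = a
    lam = -(n[0] * q_dot[0] + n[1] * q_dot[1]) / (n[0] * v_dot[0] + n[1] * v_dot[1])
    return (q[0] + lam * v[0], q[1] + lam * v[1])

def max_radius_error(cx=12.0, cy=-5.0, r=8.0, steps=36):
    """Largest deviation of the recovered points from the true circle."""
    worst = 0.0
    for k in range(steps):
        x, y = recover_point(2.0 * math.pi * k / steps, cx, cy, r)
        worst = max(worst, abs(math.hypot(x - cx, y - cy) - r))
    return worst
```

Under these conventions Eq. (1) reduces to λ = du/dθ, so the depth of a contour point is read directly from the slope of its EPI trace; a real pipeline would differentiate a fitted contour model rather than use finite differences.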
Figure 1. Depth discontinuities under orthographic projection. (a) A simple 2-D shape with one concavity is rotated 360 degrees about a fixed axis. (b) The corresponding epipolar-plane image with the set of visible depth edges shown in black. (c) Hidden depth discontinuities are included in the EPI. (d) The first 180 degrees of (b). (e) The second 180 degrees of (b). (f) Panel (e) flipped vertically. (g) "Fishtail" ends are matched by overlapping (d) and (f). Note that the original T-junction in (b) denotes a local concavity.

Figure 2. Different shapes that produce the same depth edges in the epipolar-plane image. Black and gray lines denote the visible and hidden depth discontinuities, respectively. Note the variability in the shape and location of the curve joining the "fishtail" ends.

3.1. Orthographic Projection Model

The orthographic projection $p = [u, v]^T$ of a point $P = [X, Y, Z]^T$, defined in world coordinates, can be modeled as

\[
\tilde{p} = K E \tilde{P}, \tag{2}
\]

where $\tilde{x}$ denotes the homogeneous vector obtained by appending 1 as the last element of $x$, and where the camera intrinsic matrix $K$ and extrinsic parameters $E$ are given by

\[
K = \begin{bmatrix} \alpha & \gamma & 0 & 0 \\ 0 & \beta & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad \text{and} \quad
E = \begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix}.
\]

The elements of $K$ include the image skew factor $\gamma \in \mathbb{R}$ and the scale factors $\{\alpha, \beta\} \in \mathbb{R}^+$. The matrix $E$ represents the rigid transformation that brings the world coordinate system to that of the camera, composed of a rotation $R \in \mathbb{R}^{3 \times 3}$ and translation $T \in \mathbb{R}^3$. The details of estimating the parameters of this model, for a given orthographic imaging system, are described in Section 5.

For this model, consider the case of a cylinder of radius $r$ separated from the rotation axis by a scalar distance $R$ (with its symmetry axis parallel to the rotation axis). The EPI for image row $v$ will consist of two depth edge contours $u_+(\theta)$ and $u_-(\theta)$, given by

\[
u_\pm(\theta) = \alpha R \cos(\theta) \pm \alpha r, \tag{3}
\]

assuming $\gamma = 0$. Locally, the surface boundary can be approximated by an osculating circle; as a result, we expect the EPI to consist of intersecting sinusoidal contours (as evidenced by the EPI shown in Figure 3).

3.2. Basic Concepts

Epipolar-plane images produced using a setup composed of an orthographic camera and a turntable, as described above, exhibit symmetries that are easy to describe. There is a one-to-one correspondence between tangent rays (i.e., depth edges) sampled in images taken from opposite points of view with respect to the rotation axis. Two corresponding tangent rays are supported by the same straight line, and have opposite directions. Thus, the depth discontinuities detected in opposite images are spaced 180 degrees apart in epipolar-plane images; since they are viewed from opposite directions, their projection in the EPI is related by an additional mirror symmetry. Furthermore, points on the visual hull are visible from both directions. As a result, they produce two visible depth discontinuities and two corresponding points in the EPI. Other depth discontinuities are generated by tangent rays visible from only one direction, yielding only one point in the EPI. These effects are illustrated in Figure 1.

3.3. Detecting and Filling Local Concavities

As shown in Figure 1, a set of cusps will be present in the EPI containing both visible and hidden depth discontinuities. These cusps correspond to positions where locally convex visible points on the surface transition into locally concave hidden points. Every concavity in the primal curve maps to a "fishtail" structure in the EPI. In other words, for a simple concavity, a T-junction will be present in the EPI, corresponding to a point of bitangency between the viewing ray and the surface [1]. By first shifting each EPI by 180 degrees and reflecting it about the rotation axis, we can recover the two sides of the concavity (up to the points of inflection on either side). As a result, T-junctions can be used to isolate corresponding points on either side of a concavity. We note that higher-order junctions, while improbable, correspond to points of multiple tangency and can be processed in a similar manner.

We propose reconstructing the shape of the hidden surface within the concavity by first matching the "fishtail" ends, and then connecting them with a curved segment tangent to the two ends. As described in Section 2.4, such "hallucinated" curves have multiple solutions (see Figure 2). In our implementation we chose to use a piecewise cubic Hermite interpolating polynomial [11]; however, this choice depends on the application and the underlying properties of the object being scanned.
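The 180-degree shift and mirror symmetry of Section 3.2 can be checked against the cylinder contours of Eq. (3). The sketch below is illustrative only: the axis column `u0` (added so the mirror has an explicit center) and the helper names `cylinder_edges` and `mirrored_edge` are assumptions, not part of the paper.

```python
import math

def cylinder_edges(theta, alpha, R, r, u0=0.0):
    """Depth edge contours u_plus, u_minus of Eq. (3), offset by the axis column u0."""
    center = u0 + alpha * R * math.cos(theta)
    return center + alpha * r, center - alpha * r

def mirrored_edge(edge, theta, u0=0.0):
    """Shift a contour by 180 degrees and reflect it about the axis column u0."""
    return 2.0 * u0 - edge(theta + math.pi)

def symmetry_residual(alpha=2.0, R=30.0, r=5.0, u0=400.0, steps=24):
    """Largest gap between the mirrored u_plus contour and the u_minus contour."""
    u_plus = lambda t: cylinder_edges(t, alpha, R, r, u0)[0]
    worst = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        u_minus = cylinder_edges(theta, alpha, R, r, u0)[1]
        worst = max(worst, abs(mirrored_edge(u_plus, theta, u0) - u_minus))
    return worst
```

For a cylinder, every edge lies on the visual hull, so the shifted and mirrored copy of one contour lands exactly on the other. For a surface with a concavity, the mirrored copies instead supply the hidden branches that meet the visible ones at the "fishtail" ends.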
4. Orthographic Multi-Flash Photography

To verify the basic theory, we constructed a large-format orthographic multi-flash camera capable of recovering depth discontinuities using controlled illumination. In this section we describe the properties of multi-flash depth edge detection under orthographic projection. We begin by analyzing the illumination and imaging conditions required to detect depth discontinuities. Consider the case when point lights are used with an orthographic camera, with a point source located at $L = [X_L, Y_L, Z_L]^T$, a depth discontinuity at $D = [X_D, Y_D, Z_D]^T$, and a backdrop in the plane $Z = Z_B$. The separation $\Delta X$ of the outside edge of the cast shadow from the depth discontinuity, along the X-axis, is

\[
\Delta X = \frac{-(X_L - X_D)(Z_B - Z_D)}{Z_D - Z_L}.
\]

Unlike the case of point light sources under perspective projection [19], several limitations are imposed by orthographic imaging. First, the light source must be located to the right of the depth edge in order to cast a shadow to the left (i.e., $X_L > X_D$). As a result, the point light sources must be located outside the region of interest; otherwise, shadows will not be visible. Second, the width of the cast shadow increases with the separation of the depth edge from the light source. This leads to a third limitation: disconnected shadows become more likely farther from the light source, which imposes the following constraint on the minimum width $\Delta D$ of a plane fronto-parallel to the camera.

\[
\Delta D > \frac{(X_L - X_D)(Z_B - Z_D)}{Z_B}
\]

A simple solution to overcome these limitations is to use directional lights rather than point sources. In this case, the width of the shadow is invariant to the position of the depth discontinuity. As for perspective projection [19], disconnected shadows can be eliminated by using a small angular variation between directional lights. We observe that directional lights achieve the original design goals of multi-flash imaging; specifically, the effective center of projection is now located at $Z = -\infty$. Since multi-flash edge detection requires point sources close to the center of projection, one naturally concludes that directional lights are the appropriate sources for the case of orthographic imaging.

Figure 3. Orthographic multi-flash depth edge detection. (Top left) Image acquired with a flash located to the right of the scene. Note the cylindrical calibration pattern affixed to the turntable surface. (Top right) Depth edge confidence image estimated using the method of Raskar et al. [19]. (Bottom left) An epipolar-plane image extracted along the dashed red line shown above. (Bottom right) Detected ridges modeled as trigonometric polynomials. All images are rectified so the rotation axis is the center column. Additional detection examples are provided in the supplement.

4.1. Data Capture and Depth Edge Estimation

Our data capture process follows Crispell et al. [7]. An opaque object is placed on a turntable and a sequence of $n_\theta$ orthographic images, each with a resolution of $n_v \times n_u$, is recorded using point light sources located outside the entrance aperture of the telecentric lens (see Section 6.1). The axis of rotation is manually adjusted to lie in a plane parallel to the image plane. Each image is rectified so the projection of the rotation axis is aligned with the central image column. As shown in Figure 3, each multi-flash sequence is decoded to obtain a depth edge confidence image [19], estimating the likelihood of the projection of a depth discontinuity being located in any given pixel $p = [u, v]^T$. A set of $n_v$ epipolar-plane images is extracted by concatenating the confidence values for each image row $v$ over the turntable sequence. Each EPI consists of an $n_\theta \times n_u$ slice of the confidence volume.

4.2. Epipolar-Plane Image Analysis

As described in Section 3, each EPI can be processed in parallel to estimate the depth of each discontinuity, to identify local concavities, and to find smooth curves that fill remaining gaps. To accomplish these tasks, we require a parametric model for EPI ridges (see Figure 3). Cipolla and Giblin [6] use B-spline snakes to track silhouette boundaries over turntable sequences. We note, however, that the ridges observed in depth edge confidence images contain high-frequency features and numerous junctions. In contrast, the ridges in the EPI are generally low in frequency and contain a small number of isolated junctions. As a result, we propose tracking ridges in the EPI domain. Specifically, we use a modified tracking procedure previously proposed for this problem [7]. We refer the reader to that paper for the specific details of their tracking algorithm.
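Returning to the shadow geometry at the start of Section 4, the closed-form offset ΔX can be cross-checked against a direct ray-backdrop intersection, and contrasted with a directional source. This is a minimal sketch under stated assumptions: a 2-D X-Z slice, Z increasing away from the camera so that Z_L < Z_D < Z_B, and hypothetical function names that do not come from the paper. The directional-light expression is derived here from similar triangles rather than quoted from the text.

```python
def shadow_offset_formula(light, edge, z_backdrop):
    """Closed-form shadow offset along X for a point source (Section 4)."""
    x_l, z_l = light
    x_d, z_d = edge
    return -(x_l - x_d) * (z_backdrop - z_d) / (z_d - z_l)

def shadow_offset_by_ray(light, edge, z_backdrop):
    """Same offset, found by extending the ray light -> edge to the backdrop."""
    x_l, z_l = light
    x_d, z_d = edge
    t = (z_backdrop - z_l) / (z_d - z_l)   # ray parameter at the backdrop plane
    x_shadow = x_l + t * (x_d - x_l)
    return x_shadow - x_d                  # orthographic camera: image X equals world X

def shadow_offset_directional(direction, edge, z_backdrop):
    """Offset for a directional light with direction (dx, dz), dz > 0."""
    dx, dz = direction
    _, z_d = edge
    return (z_backdrop - z_d) * dx / dz    # independent of the edge's X position
```

With a light at (50, 0), an edge at (10, 100), and the backdrop at Z = 150, both point-source functions give an offset of −20, a shadow cast to the left. Moving the edge to X = 0 increases the magnitude to 25, whereas the directional offset depends only on the edge depth.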
Rather than modeling EPI ridges with quadratic polynomials [7], we use real trigonometric polynomials of order $N$ ($N = 3$ in our implementation), such that

\[
u_i(\theta) = a_{i0} + \sum_{n=1}^{N} a_{in} \cos(n\theta) + \sum_{n=1}^{N} b_{in} \sin(n\theta), \tag{4}
\]

where $u_i(\theta)$ models ridge $i$. Unlike polynomials, trigonometric polynomials capture the $2\pi$-periodicity of the orthographic turntable sequences. Furthermore, a 1st-order trigonometric polynomial exactly models the EPI curve generated by a cylinder under constant rotation, given by (3). For a piecewise-smooth surface, we expect both the curvature and the distance of the external boundary from the rotation axis to be piecewise-smooth functions. Thus, the surface can be represented as a sequence of osculating circles, each generating a 1st-order trigonometric polynomial for small rotations. In this manner, we expect a low-order trigonometric polynomial to closely approximate EPI ridges.

Typical edge detection and linking results are shown in Figure 3. Note that the tracking algorithm fails to link contours across some junctions and only recovers the visible set of depth discontinuities. To compensate, the additional set $\{\bar{u}_i(\theta)\}$ of EPI curves is obtained by shifting each tracked contour by 180 degrees and reflecting about the rotation axis (as described in Section 3.2), such that

\[
\bar{u}_i(\theta) = (-a_{i0} + 2u_0) + \sum_{n=1}^{N} (-1)^{n+1} a_{in} \cos(n\theta) + \sum_{n=1}^{N} (-1)^{n+1} b_{in} \sin(n\theta), \tag{5}
\]

where $u_0$ is the column containing the projected axis of rotation. The superset $\{\hat{u}_i(\theta)\} = \{u_i(\theta)\} \cup \{\bar{u}_i(\theta)\}$ is processed to obtain an estimate of the apparent contours (i.e., both visible and hidden depth discontinuities). Specifically, any pair of curves $\hat{u}_i(\theta)$ and $\hat{u}_j(\theta)$, for $i \neq j$, are joined if a significant region of overlap is detected. Typical refined apparent contour estimates are shown in Figure 4.

Following Section 3.3, local concavities correspond to T-junctions in the EPI. We first estimate a candidate set of local concavities using the pairwise intersection of contours in $\{\hat{u}_i(\theta)\}$. Each contour pair produces a trigonometric polynomial whose coefficients are given by the difference of the corresponding coefficients of the two curves, the roots of which correspond to the points of intersection [21]. Only those points of intersection that are near T-junctions are retained. In our implementation we use steerable filters [23, 1] to estimate the T-junction likelihood at each point in the EPI.

Figure 4. Epipolar-plane image analysis and curve completion. (Left) The tracked contours after merging detected segments (e.g., those shown in Figure 3) with the segments given by (5). Segments due to disconnected shadows are rejected by eliminating any curves projecting outside the visual hull. T-junctions corresponding to local concavities are indicated with red crosses. (Right) Recovered external boundary for the corresponding image row (shown in black), plotted as Z (mm) against X (mm). Points on either side of each concavity, corresponding to EPI T-junctions, are linked with red lines. The piecewise cubic Hermite polynomial completions are shown in blue. Additional EPI reconstruction results are included in the supplement.

4.3. Surface Reconstruction

The set of tracked EPI contours and the calibration are used to recover the scene depth using (1). The derivatives $\{\partial u_i / \partial \theta\}$, required for reconstruction, are computed analytically from the fitted contours. The remaining task is to "hallucinate" curves connecting depth discontinuities on either side of each T-junction. As described, numerous completions exist; for instance, requiring continuity of position and tangents between matched EPI contours imposes four constraints, resulting in a one-parameter family of 2nd-order trigonometric polynomials. In our implementation we use Matlab to fit a cubic Hermite interpolating polynomial to either side of each local concavity. Results are shown in Figure 4.

5. Orthographic Camera Calibration

Surface reconstruction using (1) requires accurate camera calibration. In this section we describe a general method for estimating the parameters of the orthographic imaging model in Section 3.1. While Cipolla and Giblin [6] propose a suitable method, their approach involves simultaneous solution of the intrinsic and extrinsic parameters. Generalized methods have also been developed [24]. However, we propose a factorized method inspired by Zhang [29], in which intrinsic and extrinsic parameters are separately estimated from multiple images of a planar checkerboard pattern. By convention the checkerboard lies in the plane $Z = 0$, eliminating the third row and column from $E$ so that (2) becomes

\[
\tilde{p} =
\begin{bmatrix} \alpha & \gamma & 0 \\ 0 & \beta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & t_1 \\ r_{21} & r_{22} & t_2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}
= K_s E_s \tilde{P}_s,
\]

where $R_s \in \mathbb{R}^{2 \times 2}$ and $T_s \in \mathbb{R}^2$ represent the corresponding truncated rotation and translation matrices, respectively.
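The reduction from the full model (2) to the truncated planar model can be illustrated with a few lines of code. The sketch below is an assumption-laden example rather than the authors' calibration code: the rotation parameterization and the names `rotation`, `project_full`, and `project_planar` are hypothetical, chosen only to show that points on the plane Z = 0 project identically under both models.

```python
import math

def rotation(rx, rz):
    """Rotation R = Rz(rz) * Rx(rx) as a 3x3 nested list (illustrative choice)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cz, sz = math.cos(rz), math.sin(rz)
    return [[cz, -sz * cx,  sz * sx],
            [sz,  cz * cx, -cz * sx],
            [0.0, sx,       cx]]

def project_full(alpha, beta, gamma, R, T, P):
    """Orthographic projection of a 3-D point P using the full model of Eq. (2)."""
    xc = R[0][0] * P[0] + R[0][1] * P[1] + R[0][2] * P[2] + T[0]
    yc = R[1][0] * P[0] + R[1][1] * P[1] + R[1][2] * P[2] + T[1]
    # The camera-frame depth (third row of R, third entry of T) never enters.
    return (alpha * xc + gamma * yc, beta * yc)

def project_planar(alpha, beta, gamma, R, T, X, Y):
    """Projection of a checkerboard point (X, Y, 0) using only R_s and T_s."""
    xc = R[0][0] * X + R[0][1] * Y + T[0]
    yc = R[1][0] * X + R[1][1] * Y + T[1]
    return (alpha * xc + gamma * yc, beta * yc)
```

Because the third row of R and the third entry of T do not affect the image, they cannot be estimated from the homography alone, which is why the extrinsic recovery in Section 5.2 needs additional constraints.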
5.1. Estimation of Intrinsic Parameters

Assuming that the homography $H \in \mathbb{R}^{3 \times 3}$ mapping the points on the checkerboard to those in the image [29] has already been computed, we have $\tilde{p} = H \tilde{P}_s$ such that

\[
H = K_s E_s. \tag{6}
\]

Multiplying both sides of this expression by $K_s^{-1}$ and then multiplying both sides, on the left, by their respective transposes yields

\[
H^T (K_s^{-1})^T K_s^{-1} H =
\begin{bmatrix} R_s^T R_s & R_s^T T_s \\ T_s^T R_s & T_s^T T_s + 1 \end{bmatrix}. \tag{7}
\]

We note that the upper left submatrix $R_s^T R_s$ can be used to solve for the intrinsic calibration parameters. First, the rotation matrix $R$ is expressed in terms of $R_s$ as

\[
R = \begin{bmatrix} R_s & A \\ B^T & c \end{bmatrix},
\]

where $\{A, B\} \in \mathbb{R}^2$ and $c \in \mathbb{R}$. Next, both sides of this expression are multiplied, on the left, by their transposes and the upper left $2 \times 2$ submatrix is extracted to obtain

\[
R_s^T R_s + B B^T = I_2,
\]

where $I_2$ denotes the $2 \times 2$ identity matrix. Since $B B^T$ is rank deficient, we find that

\[
\det(R_s^T R_s - I_2) = 0. \tag{8}
\]

Equations (8) and (7) can be combined to obtain

\[
x_1 - x_2 (h_{21}^2 + h_{22}^2) - x_3 (h_{11}^2 + h_{12}^2) + 2 x_4 (h_{11} h_{21} + h_{12} h_{22}) = -(h_{12} h_{21} - h_{11} h_{22})^2,
\]

where $x_1 = \alpha^2 \beta^2$, $x_2 = \alpha^2 + \gamma^2$, $x_3 = \beta^2$, and $x_4 = \beta\gamma$. Since there is one homography for each image of the checkerboard, a minimum of four images are required to recover the unknowns $\{x_i\}$. The intrinsic parameters $\alpha$, $\beta$, and $\gamma$ can then be recovered from $\{x_i\}$.

Figure 5. System architecture for multi-flash detection of depth discontinuities under orthographic projection. (a) System architecture for orthographic multi-flash photography: the system contains an orthographic camera, a multi-flash LED array, and a computer-controlled turntable. (b) Multi-flash camera: a digital camera is mounted to a large-format telecentric lens, with an LED array surrounding the entrance aperture. (c) Turntable, backdrop, and object: the object is placed on a turntable, with a backdrop included to facilitate silhouette detection.

5.2. Estimation of Extrinsic Parameters

Following intrinsic calibration, the extrinsic parameters $R_s$ and $T_s$ can be recovered using (6). The full rotation matrix $R$ can be recovered from $R_s$ by requiring the columns of $R$ to be orthogonal and of unit length. Ensuring orthonormality gives the following set of constraints, where $r_k$ denotes the $k$-th column of $R$.

\[
r_3 = r_1 \times r_2 \tag{9}
\]
\[
r_{3i} = \pm\sqrt{1 - r_{1i}^2 - r_{2i}^2}, \quad \text{for } i = \{1, 2\} \tag{10}
\]
\[
r_{31} r_{32} = -r_{11} r_{12} - r_{21} r_{22} \tag{11}
\]

Equation (10) defines $r_{31}$ and $r_{32}$ up to an ambiguity in sign, with two degrees of freedom remaining in the solution. Equation (11) restricts the sign so that only one degree of freedom is left. The remaining ambiguity cannot be recovered and corresponds to the fact that the orthographic projection of a planar object is invariant when the object is reflected across the XY-plane. To recover the correct sign, extra information is required; in our implementation we modify the checkerboard plane with a pole of known height, perpendicular to the surface (see the supplement for more details). This pole is parallel to the third column of the rotation matrix and, by dividing the coordinates (in world units) of its projection by the physical length of the pole, we can recover the first two components of the third column of the rotation matrix. Finally, (9) is used to obtain the last elements of the first two columns of the rotation matrix,

\[
r_{3i} = -\frac{r_{1i} r_{13} + r_{2i} r_{23}}{\det(R_s)}, \quad \text{for } i = \{1, 2\},
\]

which follows from the orthogonality of columns $i$ and 3 together with $r_{33} = \det(R_s)$, the third component of (9).

5.3. Optimization of Model Parameters

We further refine the initial parameter estimates by minimizing the distance between the measured and predicted image coordinates of the checkerboard corners, given by

\[
\sum_{j=1}^{n_C} \sum_{i=1}^{n_P} \left\| \tilde{p}_{ij} - K_s E_{sj} \tilde{P}_{si} \right\|^2, \tag{12}
\]

where $n_C$ is the number of cameras (one per checkerboard pose), $n_P$ is the number of checkerboard corners, and $\{E_{s1}, E_{s2}, \ldots, E_{sn_C}\}$ are the extrinsic calibration parameters for each checkerboard pose. This error function is minimized using the Levenberg-Marquardt algorithm. Even if we do not use the modified checkerboard pattern, Equation (12) can still be used to refine our estimate of the intrinsic parameters; in this case, it suffices to arbitrarily choose the remaining sign for $R$ (resulting in the same projected image).

6. Implementation and Results

6.1. Implementation

Following the design in Section 4, we constructed the prototype shown in Figure 5. The orthographic imaging system consists of a 1600×1200 24-bit color video camera from Point Grey (model GRAS-20S4M/C), mounted with a large-format bi-telecentric lens from Opto-Engineering (model TC 12 144) with a field of view of 81.0 mm × 60.7 mm. The multi-flash illumination system is composed of an array of eight Philips White Luxeon III LEDs (model LXHL-LW3C), individually controlled via the parallel port using BuckPuck 1000 mA DC LED drivers. The object is rotated using a Kaidan Magellan Desktop Turntable (model MDT-19). A backdrop is placed behind the object to aid in silhouette extraction.

A typical capture sequence consists of 670 viewpoints, separated by a rotation of approximately 0.527 degrees. For each viewpoint four images are recorded in which the scene is sequentially illuminated by the top, bottom, left, and right flashes. The intrinsic parameters of the camera are measured once, using the procedure described in Section 5. The extrinsic calibration is estimated by tracking the corners of the cylindrical checkerboard pattern shown in Figure 3. The various post-processing steps, including image rectification, depth edge estimation, epipolar-slice analysis, and reconstruction, were implemented in Matlab and were evaluated on a PC cluster. The data capture process can be completed in under two hours, and the reconstruction pipeline requires several hours using the current implementation.

6.2. Results

To evaluate the overall system performance, the test object in Figure 3 was reconstructed using the procedure from Section 4. Each EPI, similar to Figure 4, was evaluated independently. Gaps were filled using the procedure from Section 4.3. The reconstruction results, with and without curve completion, are compared in Figure 6. Additional results are included in the supplementary material. Note that independent surface completions exhibit strong coherence across neighboring layers. While post-processing of the point cloud was not applied in this example, such processing would further improve inter-layer coherence.

Figure 6. 3-D surface reconstructions without (left) and with (right) curve completion. Each EPI, similar to Figure 4, was reconstructed independently using the method in Section 4. All renderings were produced using Pointshop3D [30], with surface normals obtained using the Diffuser plug-in. Shading variation is due to surface normal estimation errors using the plug-in. Additional views of this model are included in the supplement.

7. Discussion and Future Work

7.1. Benefits and Limitations

Benefits: Recovered points are obtained along surface patches oriented orthogonal to the viewing direction, allowing a subset of feature lines to be directly sampled. The reconstruction and completion schemes are highly parallel. The unique properties of orthographic imaging allow a subset of apparent contours to be estimated, including both visible and hidden depth discontinuities. Because the sampling rate of reconstructed points is proportional to surface curvature, we recover points that are sparsely sampled by many optical shape capture methods. Similarly, the proposed shape completion scheme can be evaluated independently for each EPI. While this can lead to artifacts, it also facilitates parallel processing.

Limitations: Recovering 3-D shapes using the proposed procedure has several significant drawbacks. Because an orthographic camera is required, only a small volume can be scanned, since large-format telecentric lenses are typically cumbersome and expensive. The properties described in Section 3 only hold under ideal orthographic projection. In practice it is challenging to ensure proper mechanical alignment of the rotation axis to achieve this condition. Because each EPI is processed independently, the "hallucinated" completions may be inconsistent between layers. As described in Section 4, using point light sources with orthographic imaging is sensitive to disconnected shadows. The model-based detection and linking of EPI ridges is susceptible to errors and, as a result, certain T-junctions may not be detected, leaving gaps in the recovered surface.
7.2. Future Work

In the near term, several limitations can be addressed by a revised prototype. Detached shadows can be eliminated using directional illumination. We propose a simple solution, in which a beam-splitter is positioned at a 45-degree angle in front of the telecentric lens. A Fresnel lens can be placed on the new optical path, with a small-baseline LED array located at its focus, achieving directional lighting. In the long term, we are interested in exploring the problem of shape from depth discontinuities. We are encouraged that our approach acquires samples traditionally lost by many optical scanning methods. Future work will examine the traditional case of perspective imaging with algorithms that are less susceptible to EPI segmentation and tracking errors.

8. Conclusion

While shape-from-X is a crowded field, shape from depth discontinuities remains a promising direction of future research. Scanning the feature lines typical of machined parts remains a challenging problem. In this paper we have proposed several practical solutions, including characterizing the performance of multi-flash photography under orthographic projection and robust calibration of telecentric lenses. We have also made several theoretical contributions to the understanding of the visual motion of depth discontinuities, including procedures for modeling, linking, and completing gaps using properties unique to orthographic projection. We hope our methods and prototype inspire researchers to continue applying non-traditional cameras to the established field of shape-from-X.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. CCF-0729126.

References

[1] N. Apostoloff and A. Fitzgibbon. Learning spatiotemporal T-junctions for occlusion detection. In CVPR, 2005.
[2] H. Baker and R. Bolles. Generalizing epipolar-plane image analysis on the spatiotemporal surface. In IJCV, 1989.
[3] J. Berent and P. Dragotti. Segmentation of epipolar-plane image volumes with occlusion and disocclusion competition. In IEEE Multimedia Signal Processing, 2006.
[4] F. Blais. Review of 20 years of range sensor development. Journal of Electronic Imaging, 13(1), 2004.
[5] R. C. Bolles, H. H. Baker, and D. H. Marimont. Epipolar-plane image analysis: An approach to determining structure from motion. In IJCV, 1987.
[6] R. Cipolla and P. Giblin. Visual Motion of Curves and Surfaces. Cambridge University Press, 2000.
[7] D. Crispell, D. Lanman, P. G. Sibley, Y. Zhao, and G. Taubin. Beyond silhouettes: Surface reconstruction using multi-flash photography. In 3DPVT, 2006.
[8] B. Curless and M. Levoy. A volumetric method for building complex models from range images. In SIGGRAPH, 1996.
[9] J. Davis, S. Marschner, M. Garr, and M. Levoy. Filling holes in complex surfaces using volumetric diffusion. In 3DPVT, 2002.
[10] I. Feldmann, P. Eisert, and P. Kauff. Extension of epipolar image analysis to circular camera movements. In ICIP, 2003.
[11] F. N. Fritsch and R. E. Carlson. Monotone piecewise cubic interpolation. SIAM J. Num. Analysis, 17:238–246, 1980.
[12] B. Horn. The curve of least energy. ACM Trans. Math. Soft., 9(4), 1983.
[13] B. Kimia, I. Frankel, and A. Popescu. Euler spiral for shape completion. In IJCV, 2003.
[14] D. Knuth. Mathematical typography. American Mathematical Society, 1(2), 1979.
[15] A. Laurentini. The visual hull concept for silhouette-based image understanding. IEEE TPAMI, 16(2), 1994.
[16] Y. Li, S. Lin, H. Lu, and H.-Y. Shum. Multiple-cue illumination estimation in textured scenes. In ICCV, 2003.
[17] S. Mada, M. Smith, L. Smith, and P. Midha. Overview of passive and active vision techniques for hand-held 3D data acquisition. In SPIE, 2003.
[18] M. Pauly, N. J. Mitra, J. Wallner, H. Pottmann, and L. Guibas. Discovering structural regularity in 3D geometry. ACM Trans. Graph., 27(3), 2008.
[19] R. Raskar, K.-H. Tan, R. Feris, J. Yu, and M. Turk. Non-photorealistic camera: depth edge detection and stylized rendering using multi-flash imaging. ACM Trans. Graph., 23(3), 2004.
[20] J. Salvi, J. Pagès, and J. Batlle. Pattern codification strategies in structured light systems. Pattern Recognition, 37:827–849, 2004.
[21] A. Schweikard. Real zero isolation for trigonometric polynomials. ACM Trans. Math. Soft., 18(3), 1992.
[22] S. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski. A comparison and evaluation of multi-view stereo reconstruction algorithms. In CVPR, 2006.
[23] E. P. Simoncelli and H. Farid. Steerable wedge filters for local orientation analysis. IEEE TIP, 5, 1996.
[24] P. Sturm, S. Ramalingam, and S. Lodha. On calibration, structure from motion and multi-view geometry for generic camera models. Computational Imaging and Vision. Springer, 2006.
[25] S. Ullman. Filling-in the gaps: The shape of subjective contours and a model for their generation. Biological Cybernetics, 25(1), 1976.
[26] J. Verdera, V. Caselles, M. Bertalmio, and G. Sapiro. Inpainting surface holes. In ICIP, 2003.
[27] M. Watanabe and S. K. Nayar. Telecentric optics for computational vision. In ECCV, 1996.
[28] R. Woodham. Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1), 1980.
[29] Z. Zhang. Flexible camera calibration by viewing a plane from unknown orientations. In ICCV, 1999.
[30] M. Zwicker, M. Pauly, O. Knoll, and M. Gross. Pointshop 3D: an interactive system for point-based surface editing. In SIGGRAPH, 2002.