Ulusoy cvprw2010


Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Ulusoy cvprw2010

  1. 1. Robust One-Shot 3D Scanning Using Loopy Belief Propagation Ali Osman Ulusoy∗ Fatih Calakli∗ Gabriel Taubin Brown University, Division of Engineering Providence, RI 02912, USA {ali ulusoy, fatih calakli, taubin}@brown.edu Abstract A structured-light technique can greatly simplify theproblem of shape recovery from images. There are currentlytwo main research challenges in design of such techniques.One is handling complicated scenes involving texture, oc-clusions, shadows, sharp discontinuities, and in some caseseven dynamic change; and the other is speeding up the ac-quisition process by requiring small number of images andcomputationally less demanding algorithms. This paper (a) (b)presents a “one-shot” variant of such techniques to tacklethe aforementioned challenges. It works by projecting astatic grid pattern onto the scene and identifying the corre-spondence between grid stripes and the camera image. Thecorrespondence problem is formulated using a novel graph-ical model and solved efficiently using loopy belief propa-gation. Unlike prior approaches, the proposed approachuses non-deterministic geometric constraints, thereby canhandle spurious connections of stripe images. The effec-tiveness of the proposed approach is verified on a variety ofcomplicated real scenes. (c) (d) Figure 1: (a) A bust, (b) the intersection network overlaid1. Introduction onto the image, (c,d) reconstructions from two views. Three-dimensional (3D) shape acquisition from images a projector. It is capable of scanning objects when manuallyhas been one of the most important topics in computer vi- swept over them.sion due to its wide range of applications in engineering, Typically, a hand-held scanner works by continuouslyentertainment, visual inspection and medicine. It has espe- registering a sequence of 3D views as they are acquired.cially been instrumental in digitizing and preserving archae- Obviously, the 3D acquisition has to be done very quicklyological artifacts. Although numerous 3D reconstruction (ideally within a single frame) as the scanner is constantlymethods have been proposed for this purpose, there isn’t moving. Most commercial hand-held scanners use a singleone solution that is suitable for all sorts of archaeological laser beam for this purpose which can reconstruct only aobjects. Challenges include shadowy areas, immobility of single slit per frame. In such systems, the scanner has to bethe object, complex geometry that prevents reaching cer- swept very slowly to capture all the intricate details. Also,tain areas, fragile nature of old artifacts, etc. Some of these since the reconstruction is sparse, the registration step canchallenges can be alleviated by a hand-held scanner which easily fail if the scanner moves too fast or makes an abruptcontains one or more cameras and possibly a laser emitter or motion. ∗ First two authors contributed equally to this work and order of author- In this paper, we propose using a grid pattern to han-ship was decided through flipping a coin. dle some of these challenges. This pattern allows for build- 1 978-1-4244-7030-3/10/$26.00 ©2010 IEEE 15
  2. 2. ing a “one-shot” system while still maintaining a dense re- cated image processing and possibly color calibration [4].construction. We argue that previous approaches for grid Recently, Kawasaki et al. [5] proposed a two color gridbased systems are not adequate for dealing with scenes of where vertical and horizontal lines are of different colors.complex geometry. To alleviate this, we introduce a novel Unlike most other previous approaches, their method doesprobabilistic graphical model and show that it can deal with not rely heavily on image processing and is robust againstsuch scenes. Combined with an online registration method, undetected grid points as it does not assume relative order-we think that this method shows great potential for use in a ing. Their solution includes singular value decompositionhand-held scanner. (SVD) on a large and very sparse matrix, which is numer- ically instable. In [16], Ulusoy et al. propose a more sta-2. Related Work ble solution with the help of a special grid. Both meth- ods heavily exploit the coplanarity constraints that arise due There exists several 3D reconstruction techniques such to the links between two grid points. However, such con-as structured light, multi-view stereo vision and photomet- straints are often violated because spurious links betweenric stereo. Structured light techniques have been very popu- grid points are observed in complex scenes. It is very impor-lar in applications to archeology due to their accuracy, den- tant to note that even a single spurious link typically causessity and ease of use [9, 3]. Since a high variety of ap- the entire 3D solution to be incorrect, rendering both theseproaches have been proposed to this day, we will only dis- methods unsuitable for such scenes.cuss some of the most relevant ones here. A comprehensive The primary contributions of this paper are: (1) formu-assessment is presented in [14, 1] by Salvi et al. lation of the problem using a novel probabilistic graphi- Most of the early work in structured lighting follow a cal model, (2) handling spurious connections resulting fromtemporal approach: multiple patterns are projected consec- complicated scenes, (3) an efficient solution using loopy be-utively onto the object and a camera captures an image for lief propagation.each projected pattern. These methods require the objectto be perfectly still while the patterns are being projected. 3. Overview of the ApproachObviously, this is not acceptable for a hand-held device asthe scanner is constantly moving. If however, the number In this paper, we present a structured-light approach con-of patterns to be projected is reduced to a single pattern, sisting of a projector, and a single color camera. Both arethen such methods become feasible. Although there are also calibrated with respect to a world coordinate system. Thetechniques that use a single but adaptive pattern [7], they re- projector projects a known grid pattern onto a 3D object.quire a video projector. In this study, we focus and review The camera captures an image of the object illuminated bymethods that project a single static pattern in which case a the projected grid. This single image is used to determineslide or laser projector would suffice. the depth information of points illuminated by grid stripes Numerous one-shot patterns have been proposed so far. observed by the camera. Producing the correct result de-One of the simplest approaches is to project a single slit, pends on being able to solve the so-called correspondencetypically using a laser beam. Although simple and easy to problem, i.e., matching pairs of grid points and their imagesimplement, these approaches suffer from sparsity, i.e., only need to be identified.a single slit is reconstructed at each frame. A natural ex- At first, finding correspondences might seem to requiretension is to project multiple slits or to have both horizontal an exhaustive search throughout the whole grid. However,and vertical lines which form a grid pattern. In [12], Ru and geometric constraints greatly simplifies the problem. First,Stockman suggest using such a grid pattern. As for their the projector-camera system adds epipolar constraints thatsolution, the observed grid in the camera is matched to the reduce the search space of each grid point to a single line,projected pattern by exploiting a number of geometric and i.e., the correspondence of the point can only come from onetopological constraints. A major drawback is that the de- of grid crossings along its epipolar line. This is explainedtected grid points are numbered with respect to a reference in Section 4.1.1. Second, the grid pattern discloses spatialpoint, i.e. relative numbering. This necessitates that at least neighborhood information for all captured grid points. Thisa patch of the pattern be extracted perfectly since in the case adds coplanarity constraints and topological constraints onof undetected or spurious grid points, the algorithm will fail. their correspondences, i.e., if two grid points are linked hor-Although Proesmans et al. [11] present a more efficient so- izontally on the captured image, then their correspondenceslution to the grid based systems, they suffer from relative must abide to certain constraints. These are explained innumbering also. detail in Sections 4.1.2 and 4.1.3. Another popular approach to obtain a dense reconstruc- The problem is still challenging as difficulties arise, i.e.,tion is the use of colors [18, 4]. These approaches typically establishing correspondences can be easily tricked by ause complex illumination patterns and are highly sensitive complex scene that involves texture, occlusion, shadows,to noise and object texture. Therefore they require sophisti- sharp discontinuities, and dynamic change [7]. As a result, 16
  3. 3. (a) (b)Figure 2: (a) A grid pattern, (b) two separate intersection Figure 3: A point u and its epipolar line L(u).networks. light line and every grid line forms a light plane in 3-space. These are reflected on the object surface and identified inobserved geometric constraints can be misleading which is the camera image. The detection of such a light line is re-explained in Section 4.1.4. We present a novel probabilistic ferred to as an intersection point u = (u1 , u2 , 1)t in thegraphical model in Section 4.2 that aims to formulate these camera image. The reflection of the light plane is referredgeometric constraints in a stochastic fashion. After build- to as a set of links that connect the neighboring intersectioning the graphical model, the solution is obtained efficiently points. We treat the union of links and intersection pointsusing loopy belief propagation. as an intersection network. A grid crossing s parameterizes both the horizontal line4. Reconstruction using Grid Pattern π h and the vertical line π v it lies on. Therefore, a horizontal and a vertical triangulation plane is defined by the the grid We assume that the camera and the projector are cali- crossing s. Given an intersection u, and its correspondencebrated. The projector projects a grid pattern, composed of s, all the points in the intersection network that are horizon-horizontal blue stripes and vertical red stripes as shown in tally and vertically linked can be triangulated using opticalFigure 2a, to the measured object and the camera is placed ray-plane triangulation. Thus, in order to triangulate all theto detect it. A simple image segmentation process in the points in an intersection network, we only need correspon-captured image, e.g., thresholding color channels, differ- dences s of intersection points u. Formally, we are lookingentiates between the horizontal and vertical stripes. The for a labeling functionstripes are then combined to form an input to the system- a 2D intersection network. However, they, in general, may f : ui → S, i = 1 . . . N (1)not result in a single intersections network due to shadows,occlusions and sharp depth discontinuities. An example de- where S = {s1 , s2 , . . . , sM }. N and M are the number ofpicting this is given in Figure 2b. Our approach solves the all intersection points and grid crossings respectively.correspondence problem for each intersection network in-dependently. This section explains the steps for solving the 4.1.1 Epipolar Constraintcorrespondence of a single intersection network. Withoutloss of generality, the same steps can be applied to all inter- Assuming that the world coordinate system is placed at thesection networks independently. optical center of the camera and ignoring intrinsic param- As for the notation used in this section, all image mea- eters, the equation of the projection of a 3D point p =surements refer to the homogeneous normalized coordi- (p1 , p2 , p3 )t onto the normalized camera image point u isnates because the calibration is known for both camera and λu = p. We define another coordinate system at the opticalprojector. center of the projector for convenience and denote projector image point as s. The relation between u and s is4.1. Labeling Problem µs = λRu + T (2) The grid pattern is a set of horizontal and vertical stripes.Suppose those stripes have no width, then we can refer to where both λ and µ are unknown scalar values, R is thea stripe as a line π. The intersection of two perpendicu- rotation matrix and T is the translation vector which definelar lines is a grid crossing s = (s1 , s2 , 1)t in the projector the coordinate transformation between the two coordinateimage. Due to the projection, every grid crossing forms a systems. 17
  4. 4. Figure 4: A typical scenario revealing coplanarity con- Figure 5: Challenges in finding correspondences, (I) an oc-straints and topological constraints cluded stripe, (II) a spurious link. h u2 must lie on the horizontal line π1 . More generally, if By eliminating the unknown scalars λ and µ from equa- two detected intersections are linked in the camera image,tion (2), we retrieve the epipolar line L of the camera point then their correspondences must be collinear in the projec-u in projector image as tor image. If the link is horizontal, then their correspon- L(u) = {s : lt s = 0} (3) dences must have the same j value, where the integer j is the horizontal line identifier within the set of H horizontal ˆ ˆwhere l = T Ru , and T is the matrix representation of the grid lines. Similar constraints can be derived for the verticalcross product with T . This representation follows that the case.Euclidean distance from a grid crossing s to the epipolarline L(u) is 4.1.3 Topological Constraint dist(s, L(u)) = |st L(u)|. (4) Again, consider the detected intersections u1 and u2 in the example of Figure 4. These points are linked horizontally inEquation (3) says that a captured grid crossing u in the the camera image, and we know that their correspondencescamera image may correspond only to grid crossings on the must be collinear. There is more than that. In the physicalepipolar line L(u) in the projector image as shown in Figure layout of the camera image points, u1 is on the left of u2 .3. However, a correspondence typically does not lie exactly This is due to the fact that the 3D point p1 is also on the lefton the epipolar line due to imperfections in calibration pro- of p2 . Therefore, the correspondence of u1 must also becedure. Therefore, we define a threshold τ on the distance on the left of the correspondence of u2 . More generally, ifdist(s, L(u)), and consider only the grid crossings s ∈ S two intersections are linked in the camera image, then theirwhose distance is smaller than τ . Note that the size S is topological structure is preserved for their correspondencesdramatically smaller than that of S. in the projector image. If the link is horizontal, then the correspondence for the intersection on the left must have4.1.2 Coplanarity Constraint a lower k value than that of the intersection on the right,A grid pattern consists of a set of discrete horizontal lines where the integer k is the vertical line identifier within the hπj , j = 1, 2, . . . , H, and a set of discrete vertical lines set of V vertical grid lines. Similar constraints can be de- rived for the vertical case.πk , k = 1, 2, . . . , V . The projection of a horizontal line π h vforms a horizontal plane Πh in 3-space. Similarly, a vertical 4.1.4 Spurious Connectionline π v forms a vertical plane Πv . The reflection of thoseplanes are observed by the camera. In the example of Figure Difficulties in finding the correct correspondences may4, an intersection network of four camera points u1 to u4 arise, even when exploiting the geometric and topologicalare given. These points correspond to 3D-points p1 to p4 , constraints. Typical complications are shown in Figure 5,and also to grid crossings s1 to s4 . Consider the horizon- i.e., a depth discontinuity that is parallel to the grid lines oftal link between u1 and u2 . Notice that the 3D points p1 the pattern may cause them to disappear in the camera im-and p2 which generated u1 and u2 lie on the same plane age, whereas a depth discontinuity that is not parallel to theΠh . This plane is originated by the horizontal line π1 in the 1 h grid lines can let two different grid lines appear as a singleprojector image. Therefore, the correspondences of u1 and link in the camera image, namely a spurious link. There- 18
  5. 5. fore, observed links in the camera image may be misleadingfor establishing correspondences. A probabilistic graphical model is ideal for combiningthe aforementioned geometric and topological constraints,and also account for the non-deterministic nature of theproblem.4.2. Probabilistic Graphical Model Formulation We start by interpreting the intersection network in thecamera image as a graph G = (V, E). Each intersection (a) (b)is a node v ∈ V and links between intersections are edgese ∈ E. We distinguish between the set of horizontal and Figure 6: (a) An example intersection network, (b) its cor-vertical edges as Ehor and Ever where E = Ehor ∪ Ever . responding factor graph. Note that some of the factor labelsNext, we define X = {X1 , X2 , ..., XN } to be N discrete were omitted for clarity.valued random variables, where each random variable is as-sociated with a node in V. Random variable Xi represents The unary factors ψepipole (Xi = s) models the epipo-the correspondence of an intersection ui to a grid intersec- lar constraint as described in Section 4.1.1. Note that as-tion s in the projector image. The labeling problem is to sociated with each random variable Xi we have a positionassign a state (correspondence) to each random variable. ui in the camera image. Correspondences sufficiently dis- In this section, we show that we can exploit the geomet- tant from the epipolar line get probability 0, i.e., they areric constraints of our system easily by modeling them with a discarded entirely and the ones closer get a score based onprobabilistic graphical model. Furthermore, we realize that their distance to the line. More formally, we have:solving the correspondence problem becomes equivalent tofinding the maximum a posteriori (MAP) configuration, forwhich we propose a solution based on loopy belief propa- 1/dist(s, L(ui )), if dist(s, L(ui )) ≤ τ ψepipole (Xi = s) =gation [10]. 0, otherwise We use the factor graph notation presented in [8] where (6)each random variable is depicted as a variable node and The pairwise factors ψhor and ψver model both thefunctions of these variables are depicted as factor nodes, il- coplanarity constraint and the topological constraints due tolustrated as black squares. Edges connect variables to factor the horizontal and vertical links in the camera image. Thenodes if and only if that variable is an argument of the func- coplanarity constraint as described in Section 4.1.2 statestion. We use the factor node and the function it represents that if two intersections are linked with a horizontal edge,interchangeably. then the respective two random variables must be assigned For our graph, we have two types of factor nodes. Unary correspondences with the same horizontal line identifier,factors connect to one variable node as they represent a otherwise the compatibility is 0. If this constraint holds,function of a single random variable. This function mod- then we check for the topological constraint which is de-els the epipolar line constraint and is denoted as ψepipole . scribed in Section 4.1.3 using the function φ. ψhor is for-Pairwise factors connect to two variable nodes as they are mally defined in Equation 7. ψver can be defined similarly.functions of a pair of random variables. These functionsmodel both the coplanarity constraint and the topologicalconstraint due to vertical and horizontal links and are de- ψhor (Xi = s, Xj = s ) = (7)noted as ψver and ψhor respectively. The joint probability φ sgn(ui1 − uj1 ) . (idv (s) − idv (s )) , if idh (s) = idh (s )can be written as a product of the unary and pairwise fac- 0, otherwisetors, N where idh (s) and idv (s) are functions that take a grid inter- p(X ) ∝ ψepipole (Xi ) ψhor (Xi , Xj ) section s and return the associated horizontal and vertical i=1 (i,j)∈Ehor identifiers (described in Section 4.1.2) respectively. Note (5) that if ui is to the right of uj , the sgn function returns 1 and ψver (Xi , Xj ) the argument (idv (s)−idv (s )) is passed to φ (see Equation (i,j)∈Ever 8). As seen in Figure 7, the φ is nonzero only when thereNote that we have introduced factor nodes ψver and ψhor is a positive difference between the vertical line identifiers,for each e ∈ Ever and e ∈ Ehor respectively. An example i.e., s is to the right of s . If ui is to the left of uj , the sgnfactor graph construction is given in Figure 6. function corrects the argument to φ by reversing it. 19
  6. 6. exp(1 − α), if α ≥ 1 φ(α) = (8) 0, otherwise As seen in Figure 7, φ achieves its maximum when Figure 8: We sum over the binary edges as we do not need toα = 1. Notice that if φ were 0 for α > 1, then our model estimate them. Unary factors have been omitted for clarity.wouldn’t be able to handle occlusions, i.e. undetected gridpoints. We use an exponentially decaying function afterα = 1. This suggests that we do allow for large differences 4.2.2 Loopy Belief Propagation: Max-Productin vertical line identifiers but we penalize so as to keep the The correspondence problem for our model is finding thecorrespondences closer to each other. Note that a more for- most likely assignment to each random variable, X = ˆgiving function (less sharp than the exponential) could also ˆ ˆ ˆ {X1 , X2 , ..., XN } = arg maxX p(X ), i.e., the MAP esti-be used. mation problem. If our graph did not contain cycles (loops), we could have used the max-product algorithm [17] to get an exact answer. This algorithm is a variant of the sum- product algorithm [8] and belief propagation [10]. Unfor- tunately, our graph contains loops in which case the MAP estimation problem becomes NP hard and the max-product is no longer exact. However, approximate solutions can still be obtained by running it in an iterative fashion. Remark- ably, max-product has been shown to be very effective in practice and is widely used [15]. There exists also theo- retical work on explaining the empirical success of max- product [17]. Figure 7: The φ function. Max-product is an iterative algorithm that works by pass- ing messages in a graphical model [17]. Each message en- codes the likelihood of a random variable using all the infor-4.2.1 Dealing with spurious connections mation that has propagated throughout the graph. AlthoughTo deal with these spurious connections described in Sec- both the sum-product and max-product algorithms can betion 4.1.4, we introduce binary nodes bk , k = 1, ..., |E| for defined for factor graphs [8], the equations can be simplifiedeach edge which are connected to the factor nodes as de- when the graph has only pairwise factors, i.e., functions ofpicted in Figure 8a. We also define a prior λk for each bk . up to two random variables. This is indeed the case for ourWhen the bk = 1, we assume the edge between Xi and graph so we will present the algorithm in this form.Xj actually exists so we use the old compatibility measure A message at iteration t is defined as follows:ψver or ψhor . However when bk = 0, we assume the edge mt (Xi ) ∝ i→j (11)does not exist, in which case Xi and Xj become indepen-dent as there is no function binding them. So, we assign max ψepipole (Xi )ψcomp (Xi , Xj ) mt−1 (Xi ) k→i Xiuniform compatibility to all possible pairs of values. This is k∈N (i)jformalized in Equation 9. ¯ where ψcomp denotes either horizontal (ψhor ) or vertical ¯ (ψver ) compatibility depending on how Xi and Xj are con- nected. N (i)j means the neighbors of i except for j. After ψhor (Xi , Xj ), if bk = 1 ψhor (Xi , Xj , bk ) = (9) passing messages around for t iterations, the MAP configu- c, otherwise ration can be computed using the equation below.where c is a constant. Note that for our purposes, we do not ˆneed to estimate bk . Thus, we can simply sum them out as Xi = arg max ψepipole (Xi ) mt−1 (Xi ) k→i (12) Xi ¯shown in Figure 8b. We end up with factor ψhor defined in k∈N (i)Equation 10. One can derive ψ ¯ver similarly. 5. De Bruijn Spaced Grids It is known that when solving for correspondences us- ¯ ψhor (Xi , Xj ) = λk ψhor (Xi , Xj ) + (1 − λk )c (10) ing uniformly spaced grid patterns, one usually cannot 20
  7. 7. (a) (b) (c) (d) Figure 9: (a) A comfortable frog, (b) the intersection network overlaid onto the image, (c,d) spurious connections.obtain a confident solution and is left with ambiguities c = 0.01 will suffice.[5, 16, 13]. These patterns are especially ill-suited for com- For the De Bruijn sequence parameters, we saw thatplicated scenes with shadows, occlusions and sharp depth k = 5 and n = 3 gave good results empirically as almostdiscontinuities. In such scenes, the grid pattern usually gets all detected patches were bigger than the needed size forbroken into multiple connected components as seen in Fig- unique identification and since k is not very large, we didure 2b. Most components contain a small number of inter- not sacrifice much on reconstruction density. An examplesections and links in which case there is very little geomet- pattern is given in Figure 2a.ric information to solve for correspondences. This informa- To assess the correctness of our results, we used ation is usually not enough to obtain the correct solution. well established temporal structured lighting method [6] as A similar argument is made in [5] and irregular spac- ground truth. We ran both our method and the temporalings are suggested to disturb the uniformity. The authors method on static scenes and checked if all the intersectionsuse a pattern with uniformly spaced vertical lines and ran- we found were assigned their true correspondences.domly spaced horizontal lines. This indeed increases the First, we present results for a complicated object withrobustness of the search, however, does not guarantee cor- sharp depth discontinuities. The object and the intersectionrect convergence for their case. Recently, De Bruijn spaced networks we capture are given in Figures 9a and 9b. Thegrids were proposed in [16]. These are grid patterns with network (including the intersections on the background) isspacings that follow a De Bruijn sequence. A k-ary De composed of 6 connected components. Four of these com-Bruijn sequence of order n is a cyclic sequence containing ponents contain less than 40 points. Figures 9c and 9d de-letters from an alphabet of size k, where each subsequence pict the spurious connections outlined in bold red. The firstof length n appears exactly once. one is due to the sharp depth discontinuity between the head We generate grid patterns where both the vertical and and the arm. Similarly, the second one is caused by the dis-horizontal spacings between the stripes follow a De Bruijn continuity between the head and the belly. The third one issequence. Thus, a 2D patch consisting of n vertical spac- more interesting: the foot gets linked to the background.ings and n horizontal spacings and containing (n + 1)2 gridcrossings is unique in the whole grid pattern. This makes We have triangulated all the intersections and curves andsure that there is a unique configuration for which the joint then used interpolation to fill in between curve segmentsprobability is maximum, i.e., the correct MAP estimate is for visualization purposes. Our reconstruction results areunique. given in Figure 10. Notice that the method has been able to handle the first two spurious connections: the arm, head and belly are separated even though they were connected6. Experiments and Results to begin with. However, it had problems with the last one. We have implemented the proposed system using a 1024 This is expected since there is not enough geometry aroundx 768 DLP projector and a 1600 x 1200 CCD camera. Note the spurious link to force it to its correct position. In fact,that since we project a static pattern, we could have used there are only 2 intersections on one side of connection asa slide projector as well. We calibrated both the camera seen in Figure 9d. Comparing with the ground truth, we sawand the projector using the Camera Calibration Toolbox for that only these 2 intersections (out of 1015) were assignedMatlab [2]. The experiments were carried out under weak wrongly. This corresponds to 99.8% correct assignment.ambient light. The next object is a bust of Michelangelo’s David shown For the graphical model parameters, we have chosen bi- in Figure 1a. The captured intersection network is givennary edge prior λk = 0.95 ∀k since we expect spurious in Figure 1b. For this object, there are only two connectedconnections to happen rarely. As for the constant c in Equa- components. The first component contains only 23 pointstion 9, we have realized that a small nonzero value such as and spans the right arm whereas the second component con- 21
  8. 8. [2] J. Y. Bouguet. Camera calibration toolbox for matlab. http://www.vision.caltech.edu/bouguetj/calibdoc/index.html, 2001. [3] B. J. Brown, C. Toler-Franklin, D. Nehab, M. Burns, D. Dobkin, A. Vlachopoulos, C. Doumas, S. Rusinkiewicz, and T. Weyrich. A system for high-volume acquisition and matching of fresco fragments: Reassembling Theran wall paintings. ACM Transactions on Graphics SIGGRAPH, 27(3), Aug. 2008. [4] D. Caspi, N. Kiryati, and J. Shamir. Range imaging with adaptive color structured light. IEEE Trans. Pattern Anal. Mach. Intell., 20:470–480, 1998.Figure 10: Two views of the reconstruction using the pro- [5] H. Hiroshi Kawasaki, R. Furukawa, R. Sagawa, and Y. Yagi.posed method. Dynamic scene shape reconstruction using a single struc- tured light pattern. IEEE CVPR, June 2008. [6] S. Inokuchi, K. Sato, and F. Matsuda. Range imaging systemtains 1227 points and spans the body, head, the left arm and for 3-d object recognition. In IEEE ICIP, 1984.the background. There are 10 spurious connections inside [7] T. Koninckx and L. Van Gool. Real-time range acquisition bythe second component alone. These spurious connections adaptive structured light. IEEE Trans. Pattern Anal. Mach.link the head, the neck and the background. Since there is Intell., 28(3):432–445, March 2006.sufficient geometry around the spurious connections, all of [8] F. Kschischang, B. J. Frey, and H. andrea Loeliger. Factorthem are solved correctly. In fact, for this object we have graphs and the sum-product algorithm. IEEE Trans. on In-100% correct assignment. The reconstructions are shown formation Theory, 47:498–519, 2001.in Figures 1c and 1d. [9] M. Levoy, K. Pulli, B. Curless, S. Rusinkiewicz, D. Koller, As advocated, loopy belief propagation worked excellent L. Pereira, M. Ginzton, S. Anderson, J. Davis, J. Ginsberg,in practice. Typically, it took about 10 iterations for conver- J. Shade, and D. Fulk. The digital michelangelo project: 3Dgence and we haven’t experienced any cases where it did scanning of large statues. In ACM Transactions on Graphics SIGGRAPH, pages 131–144, July 2000.not converge. In unoptimized MATLAB code, the running [10] J. Pearl. Probabilistic reasoning in intelligent systems. Mor-time for a thousand grid intersections was about three min- gan Kaufmann, 1997.utes. [11] M. Proesmans, L. Van Gool, and A. Oosterlinck. Active ac- quisition of 3d shape for moving objects. IEEE ICIP, pages7. Conclusion and Future Work 647–650, 1996. [12] G. Ru and G. Stockman. 3-d surface solution using struc- In this paper, we have demonstrated a novel “one-shot” tured light and constraint propagation. IEEE Trans. Patternmethod to reconstruct 3D objects and scenes using a grid Anal. Mach. Intell., 11(4):390–402, 1989.pattern. Unlike previous grid based approaches, our formu- [13] J. Salvi, J. Batlle, and E. Mouaddib. A robust-coded pat-lation, which uses a powerful probabilistic graphical model, tern projection for dynamic 3d scene measurement. Patterncan deal with scenes with complicated geometry. We have Recognition Letters, 19:1055–1065, 1998.shown its performance in two such examples. The next step [14] J. Salvi, J. Pages, and J. Batlle. Pattern codification strategiesis to combine this system with an online registration method in structured light systems. Pattern Recognition, 37(4):827–and to replace the DLP projector we currently use with a 849, April 2004.slide or laser projector in order to build a hand-held scan- [15] E. B. Sudderth and W. Freeman. Signal and image process-ner. ing with belief propagation. Signal Processing Magazine, IEEE, 25:114–141, March 2008.8. Acknowledgments [16] A. O. Ulusoy, F. Calakli, and G. Taubin. One-shot scanning using de bruijn spaced grids. In Proceedings of the 3-D Dig- This material is based upon work supported by the Na- ital Imaging and Modeling (3DIM), 2009.tional Science Foundation under Grant No. IIS-0808718 [17] Y. Weiss and W. T. Freeman. On the optimality of solutionsand CCF-0729126. The authors would like to thank Erik B. of the max-product belief-propagation algorithm in arbitrarySudderth for many helpful discussions. graphs. IEEE Trans. on Information Theory, 47(2):736–744, 2001.References [18] L. Zhang, B. Curless, and S. Seitz. Rapid shape acquisi- tion using color structured light and multi-pass dynamic pro- [1] J. Batlle, E. Mouaddib, and J. Salvi. Recent progress in gramming. IEEE 3DPVT, 2002. coded structured light as a technique to solve the correspon- dence problem: A survey. Pattern Recognition, 1998. 22