
Indoor Point Cloud Processing - Deep learning for semantic segmentation of indoor point clouds

Presentation by Petteri Teikari

Published in: Real Estate

  1. 1. Indoor Point Cloud Processing Deep learning for semantic segmentation of indoor point clouds
  2. 2. Implementation Initial ‘deep learning’ idea .XYZ point cloud better than the reconstructed .obj file for automatic segmentation due to higher resolution Input Point Cloud 3D CAD MODEL No need to have planar surfaces Sampled too densely www.outsource3dcadmodeling.com 2D CAD MODEL Straightforward from 3D to 2D cadcrowd.com RECONSTRUCT 3D “Deep Learning” 3D Semantic Segmentation from point cloud / reconstructed mesh youtube.com/watch?v=cGuoyNY54kU arxiv.org/abs/1608.04236 Primitive-based deep learning segmentation The order between semantic segmentation and reconstruction could be swapped
  3. 3. Sensors Architectural spaces https://matterport.com/ Some Company could upgrade to? http://news.mit.edu/2015/object-recognition-robots-0724 https://youtu.be/m6sStUk3UVk http://news.mit.edu/2015/algorithms-boost-3-d-imaging-resolution-1000-times-1201 + http://www.forbes.com/sites/eliseackerman/2013/11/17/
  4. 4. HARDWARE Existing scanners static Scan space eventually with a drone https://www.youtube.com/watch?v=dVPOf-oDUOM Introducing Cartographer We are happy to announce the open source release of Cartographer, a real-time simultaneous localization and mapping (SLAM) library in 2D and 3D with ROS support. SLAM algorithms combine data from various sensors (e.g. LIDAR, IMU and cameras) to simultaneously compute the position of the sensor and a map of the sensor’s surroundings. We recognize the value of high quality datasets to the research community. That’s why, thanks to cooperation with the Deutsches Museum (the largest tech museum in the world), we are also releasing three years of LIDAR and IMU data collected using our 2D and 3D mapping backpack platforms during the development and testing of Cartographer. http://www.ucl.ac.uk/3dim/bim | http://www.homepages.ucl.ac.uk/~ucescph/ Indoor Mobile Mapping Rapid Data Capture for Indoor Modelling As part of a working group we are investigating the great potential of indoor mobile mapping systems for providing 3D capture of the complex and unique environment that exists inside buildings. The investigation is taking the form of a series of trials to explore the technical capabilities of Indoor Mobile Mapping Systems, such as the i-MMS from Viametris, with a view to performance in Survey and BIM applications with respect to UK standards. The working group is investigating the potential of such technology in terms of accuracies, economic viability and its future development.
  5. 5. Sensors Drone Scanning? http://dx.doi.org/10.3390/s150511551
  6. 6. Implementation rough Idea Input Point Cloud CAD-Primitive based reconstruction Trained on ModelNet. CAD Primitives ModelNet modelnet.cs.princeton.edu Possibly only simplified modelling, with only walls, floor and openings http://dx.doi.org/10.1016/j.cag.2015.07.008 2D CAD FLOORPLAN → .SVG FOR REAL ESTATE AGENTS
  7. 7. Point clouds to Architectural Models #1 Point-Cloud Processing with Primitive Shapes cg.cs.uni-bonn.de/en/projects UCL > School of BEAMS > Faculty of Engineering Science > Civil, Environmental and Geomatic Engineering http://discovery.ucl.ac.uk/id/eprint/1485847 From Point Cloud to Building Information Model: Capturing and Processing Survey Data Towards Automation for High Quality 3D Models to Aid a BIM Process Thomson, CPH; (2016) From Point Cloud to Building Information Model: Capturing and Processing Survey Data Towards Automation for High Quality 3D Models to Aid a BIM Process. Doctoral thesis, UCL (University College London).
  8. 8. Point clouds to Architectural Models #2 Eric Turner, May 14, 2015 Electrical Engineering and Computer Sciences, University of California at Berkeley Technical Report No. UCB/EECS-2015-105 http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-105.html, Cited by 1 Figure 3.2: (a) Point cloud of scanned area, viewed from above and colored by elevation; (b) wall sample locations generated from point cloud. Clutter such as furniture or plants does not affect the position of wall samples.
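One way to see why clutter need not disturb wall samples in a top view is a simple height-span test per floor-plan cell. The sketch below is only an illustrative heuristic, not the method of the cited report: the function name `wall_cells`, the 10 cm cell size and the 2 m height threshold are all assumptions made for this example.

```python
import math

def wall_cells(points, cell=0.1, min_height_span=2.0):
    """Return floor-plan cells (ix, iy) whose points span at least min_height_span metres in z.

    Hypothetical heuristic: walls run from floor to ceiling, furniture does not.
    """
    z_range = {}
    for x, y, z in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        lo, hi = z_range.get(key, (z, z))
        z_range[key] = (min(lo, z), max(hi, z))
    return {key for key, (lo, hi) in z_range.items() if hi - lo >= min_height_span}

# Synthetic room fragment: a wall along x = 0.05 m and a table top at z = 0.75 m.
wall = [(0.05, j * 0.1 + 0.05, k * 0.1) for j in range(20) for k in range(26)]
table = [(1.05 + i * 0.1, 0.55 + j * 0.1, 0.75) for i in range(10) for j in range(10)]
cells = wall_cells(wall + table)
print(len(cells), "wall cells, all in column", {ix for ix, _ in cells})
```

The table points are dropped because they occupy a single height, while every wall cell covers the full 2.5 m.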
  9. 9. HIGH-level IDEA BuildingParser
  10. 10. First level Idea Input point cloud (1,039,097 vertices) 'Simplified' point cloud By Markus Ylimäki 1) Noisy input with possible missing parts 2) Denoise, consolidate, find normals and possibly upsample the point cloud 3) Find planar surfaces with semantic labels (semantic segmentation for point clouds) 1) optimally you would like to describe a wall using just 4 corner points → massive reduction of points 4) Remove too complex shapes like chairs, flowers, chandeliers, etc. Old-school techniques, no machine learning here yet Each color corresponds to a plane, black corresponds to no plane (1,039,097 vertices) This algorithm gives okay results, but it could be a lot faster, and a method can never be too robust. It could also have better ‘inpainting’ performance for missing data. PhD student at the Center for Machine Vision Research in the University of Oulu.
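Step 3 above (finding planar surfaces) is commonly prototyped with RANSAC plane fitting. The sketch below is a generic textbook RANSAC in plain Python, not the algorithm shown on the slide; the function names, the 2 cm threshold, the iteration count and the synthetic wall are assumptions for illustration.

```python
import random

def plane_from_points(p, q, r):
    """Return (unit normal, d) with normal . x + d = 0, or None if the points are collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    return n, -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])

def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Keep the candidate plane supported by the most points within `threshold` metres."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(n[0] * x + n[1] * y + n[2] * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

# Synthetic scene: a slightly noisy wall at x = 1 m plus scattered clutter.
rng = random.Random(1)
wall = [(1.0 + rng.uniform(-0.005, 0.005), rng.uniform(0, 4), rng.uniform(0, 2.5))
        for _ in range(300)]
clutter = [(rng.uniform(0, 3), rng.uniform(0, 4), rng.uniform(0, 2.5)) for _ in range(60)]
plane, inliers = ransac_plane(wall + clutter)
print("normal:", plane[0], "inliers:", len(inliers))
```

Once the inliers of a wall are known, replacing them with the corners of their bounding rectangle gives the point reduction mentioned above.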
  11. 11. First level Pre-processing just use existing code Point Cloud Denoising via Moving RPCA E Mattei, A Castrodad, 2016 - Computer Graphics Forum - Wiley Online Library Walls become a lot more planar in top view Optimize consolidation for point clouds Screened Poisson Reconstruction (https://github.com/mkazhdan/PoissonRecon, C++ code) CGAL, Point Set Processing http://doc.cgal.org/latest/Point_set_processing_3/ http://vcc.szu.edu.cn/research/2013/EAR/ Deep points consolidation - ACM Digital Library by S Wu - 2015 - Cited by 2 EAR/WLOP CODE AVAILABLE in CGAL Consolidation of Low-quality Point Clouds from Outdoor Scenes
  12. 12. First level Pre-processing Motivation for image-naïve people https://www.youtube.com/watch?v=BlDl6M0go-c Background of BM3D (and later BM4D), developed at Tampere University of Technology, the state-of-the-art denoising algorithm at least prior to deep learning denoisers Images are always estimates of the “real images”, like any measurement in general. A photo of a black circle on a white background might not, for the computer, be composed of only two colors: in practice it is corrupted by noise and blur, and quantitative image analysis might be facilitated by some image restoration pre-processing algorithms. We want to use ROBUST ALGORITHMS that also perform well with low-resolution and noisy point clouds (think of Google Tango scans or even more professional laser scanners / LIDARs) Lu et al. (2016), https://doi.org/10.1109/TVCG.2015.2500222 “BM3D for point clouds” Patch-Collaborative Spectral Point-Cloud Denoising http://doi.org/10.1111/cgf.12139 You can visualize the removed noise with Hausdorff distance for example http://dx.doi.org/10.1111/cgf.12802 http://staff.ustc.edu.cn/~lgliu/Publications/Publications/2015_SMI_QualityPoint.pdf
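As a rough illustration of the Hausdorff-distance idea: the per-point nearest-neighbour distance between the noisy and the denoised cloud can be used to color the removed noise, and its maximum is the Hausdorff distance. This is a brute-force O(n·m) sketch for tiny clouds; the four sample points are made up for the example, and a real pipeline would use a spatial index.

```python
import math

def per_point_distance(a, b):
    """Distance from every point in `a` to its nearest neighbour in `b` (usable as a heat map)."""
    return [min(math.dist(p, q) for q in b) for p in a]

def directed_hausdorff(a, b):
    """Largest nearest-neighbour distance from `a` to `b`."""
    return max(per_point_distance(a, b))

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical noisy scan of a flat patch and its denoised counterpart.
noisy = [(0.0, 0.0, 0.03), (1.0, 0.0, -0.02), (0.0, 1.0, 0.00), (1.0, 1.0, 0.10)]
denoised = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]

removed = per_point_distance(noisy, denoised)
print("per-point removed noise:", removed)
print("Hausdorff distance:", hausdorff(noisy, denoised))
```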
  13. 13. First level reconstruction http://dx.doi.org/10.1111/cgf.12802
  14. 14. First level plane segmentation in practice #1 https://tams.informatik.uni-hamburg.de/people/alumni/xiao/publications/Xiao_RAS2013.pdf junhaoxiao/TAMS-Planar-Surface-Based-Perception 3D perception code developed at TAMS (http://tams.informatik.uni-hamburg.de/) by Junhao Xiao and others, including point cloud plane segmentation, planar segment area calculation, scan registration based on planar segments, etc. The following libraries will also help if not everything is found in the implementation above ● CGAL 4.9 - Point Set Processing: User Manual ● PCL - Point Cloud Library (PCL) ● PDAL - Point Data Abstraction Library - pdal.io ● For ICC and BIM processing the VOLVOX plugin for Rhino seemed interesting https://github.com/DURAARK http://papers.cumincad.org/data/works/att/ecaade2016_171.pdf
  15. 15. First level semantic Segmentation Examples http://www-video.eecs.berkeley.edu/papers/elturner/elturner_3dv2015.pdf http://dx.doi.org/10.1007/978-3-319-48881-3_10 https://doi.org/10.1109/JSTSP.2014.2381153 http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.0000556 https://doi.org/10.1145/2999508.2999526
  16. 16. First level plane segmentation in practice #2 http://dx.doi.org/10.1117/1.JEI.24.5.051008 Furthermore, various enhancements are applied to improve the segmentation quality. The GPU implementation of the proposed algorithm segments depth images into planes at the rate of 58 fps. Our pipeline-interleaving technique increases this rate up to 100 fps. With this throughput rate improvement, the application benefit of our algorithm may be further exploited in terms of quality and enhancing the localization
  17. 17. First level Shape representations Data-driven shape processing and modeling provides a promising solution to the development of “big 3D data”. Two major ways of 3D data generation, 3D sensing and 3D content creation, populate 3D databases with fast growing amount of 3D models. The database models are sparsely enhanced with manual segmentation and labeling, as well as reasonably organized, to support data-driven shape analysis and processing, based on, e.g., machine learning techniques. The learned knowledge can in turn support efficient 3D reconstruction and 3D content creation, during which the knowledge can be transferred to the newly generated data. Such 3D data with semantic information can be included into the database to enrich it and facilitate further data-driven applications. https://arxiv.org/abs/1502.06686
  18. 18. Synthesis Modular blocks as cloud microservices? POINT CLOUD 2D Floorplan 3D CAD Model Denoising Consolidation Upsampling Planar Segmentation TAMS Simplification WLOP DeepPoints Bilateral CGAL Metafile ‘Image restoration’ pipeline Not necessarily every block before planar segmentation is needed and ‘pre-processing’ could be bypassed Only to be run from cloud? Start from existing libraries and implementations? See the details from previous slides. Each block has code available so no new code needs to be written to get to MVP SHUFFLE Ground Truth Benchmark performance - Accuracy - Computation speed - Robustness
  19. 19. Second level to deep learning
  20. 20. General Motivation #1 Where VR is going beyond this project Carlos E. Perez, Software Architect - Design Patterns for Deep Learning Architectures Written Aug 9 Yes. (1) MagicLeap has known to be hiring Deep Learning experts for its Augmented Reality system. They are known to use Movidius as their chip which is a deep learning vision processor. (2) Gesture recognition can be done via deep learning. (3) Voice identification seems to have an importance in a VR context. See: Design Patterns for Deep Learning Architectures : Applications https://techcrunch.com/2016/10/28/magic-leap-goes-to-finland-in-pursuit-of-nordic-vr-and-ar-talent/ http://www.forbes.com/sites/davidewalt/2016/11/02/inside-magic-leap-the-secretive-4-5-billion-startup-changing-computing-forever/#2f9365e5e83f https://www.wired.com/2016/04/magic-leap-vr/
  21. 21. General Motivation #2 Where 3D is going beyond this project http://jobsearch.scania.com/segerjoblist/presentation.aspx?presGrpId=9470&langId=1&ie=False http://www.sensorsmag.com/seventh-sense-blog/artificial-intelligence-autonomous-driving-24333 Viorica Pătrăucean, Ph.D: "BIM for existing infrastructure" http://www-smartinfrastructure.eng.cam.ac.uk/files/generating-bim-models-for-existing-assets
  22. 22. General Motivation #2b Where 3D: Autonomous driving https://www.youtube.com/watch?v=4zOqJK-_GAk Automatic object detection and removal from 3D point clouds byOxbotica Francis Engelmann, Jörg Stückler and Bastian Leibe Computer Vision Group, RWTH Aachen University https://www.youtube.com/watch?v=YebCdz7QsRs
  23. 23. General Motivation #2C Where 3D: building information models (BIM) http://www.spar3d.com/news/lidar/paracosms-new-handheld-lidar-scanner-built-construction-analytics/ GeoSLAM is playing into this trend with the release of their ZEB-CAM, an add-on for the company’s ZEB-REVO handheld indoor mapper that captures imagery at the same time as 3D scan data. The data captured by the two sensors is fully synchronized, and users can view the results side by side in GeoSLAM’s desktop software. Click a spot in the scan, and the associated imagery is displayed. Click a spot in the imagery, and the associated scan data is displayed.
  24. 24. General Motivation #3 Where AR is going beyond this project http://adas.cvc.uab.es/varvai2016/ This half-day workshop will include invited talks from researchers at the forefront of modern synthetic data generation with VAR for VAI ● Learning Transferable Multimodal Representations in VAR, e.g., via deep learning ● Virtual World design for realistic training data generation ● Augmenting real-world training datasets with renderings of 3D virtual objects ● Active & reinforcement learning algorithms for effective training data generation and accelerated learning Xcede’s Data Science team are collaborating with one of the world’s foremost image recognition and augmented reality platforms. Already working with some of the world's top brands, including Pepsi, Coca-Cola, Procter & Gamble, General Mills, Anheuser-Busch, Elle, Glamour, Honda and BMW, their mobile app has been downloaded over 45 million times. Our client is now looking for a Computer Vision Researcher to join their Deep Learning R&D team who can help bring their technology to the next level. http://www.eetimes.com/author.asp?section_id=36&doc_id=1330958 https://arxiv.org/pdf/1605.09533v1.pdf
  25. 25. NIPS 2016: 3D Workshop Deep learning is proven to be a powerful tool to build models for language (one-dimensional) and image (two-dimensional) understanding. Tremendous efforts have been devoted to these areas, however, it is still at the early stage to apply deep learning to 3D data, despite their great research values and broad real-world applications. In particular, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications such as augmented reality, autonomous driving, graphics, robotics, medical imaging, neuroscience, and scientific simulations. These problems have drawn the attention of researchers in different fields such as neuroscience, computer vision, and graphics. The goal of this workshop is to foster interdisciplinary communication of researchers working on 3D data (Computer Vision and Computer Graphics) so that more attention of broader community can be drawn to 3D deep learning problems. Through those studies, new ideas and discoveries are expected to emerge, which can inspire advances in related fields. This workshop is composed of invited talks, oral presentations of outstanding submissions and a poster session to showcase the state-of-the-art results on the topic. In particular, a panel discussion among leading researchers in the field is planned, so as to provide a common playground for inspiring discussions and stimulating debates. The workshop will be held on Dec 9 at NIPS 2016 in Barcelona, Spain. http://3ddl.cs.princeton.edu/2016/ ORGANIZERS ● Fisher Yu - Princeton University ● Joseph Lim - Stanford University ● Matthew Fisher - Stanford University ● Qixing Huang - University of Texas at Austin ● Jianxiong Xiao - AutoX Inc. http://cvpr2017.thecvf.com/ In Honolulu, Hawaii “I am co-organizing the 2nd Workshop on Visual Understanding for Interaction in conjunction with CVPR 2017. Stay tuned for the details!” “Our workshop on Large-Scale Scene Understanding Challenge is accepted by CVPR 2017.”
  26. 26. Labeling 3d Spaces Semantic Part Manually labeling 3D scans → way too time consuming! https://arxiv.org/abs/1511.03240 SynthCam3D is a library of synthetic indoor scenes collected from various online 3D repositories and hosted at http://robotvault.bitbucket.org https://arxiv.org/abs/1505.00171 SYNTHETIC DATA The advantages of synthetic 3D models cannot be overstated, especially when considering scenes: once a 3D annotated model is available, it allows rendering as many 2D annotated views as desired, Samples of annotated images rendered at various camera poses for an office scene taken from SynthCam3D youtube.com/watch?v=cGuoyNY54kU Existing datasets NYUv2
  27. 27. SYNTHETIC Datasets #1 SynthCam3D is a library of synthetic indoor scenes collected from various online 3D repositories and hosted at http://robotvault.bitbucket.org. Large public repositories (e.g. Trimble Warehouse) of 3D CAD models have existed in the past, but they have mainly served the graphics community. It is only recently that we have started to see emerging interest in synthetic data for computer vision. The advantages of synthetic 3D models cannot be overstated, especially when considering scenes: once a 3D annotated model is available, it allows rendering as many 2D annotated views as desired, at any resolution and frame-rate. In comparison, existing datasets of real data are fairly limited both in the number of annotations and the amount of data. NYUv2 provides only 795 training images for 894 classes; hence learning any meaningful features characterising a class of objects becomes prohibitively hard. https://arxiv.org/abs/1505.00171
  28. 28. SYNTHETIC Datasets #2 Creating large datasets with pixelwise semantic labels is known to be very challenging due to the amount of human effort required to trace accurate object boundaries. High-quality semantic labeling was reported to require 60 minutes per image for the CamVid dataset and 90 minutes per image for the Cityscapes dataset. Due to the substantial manual effort involved in producing pixel-accurate annotations, semantic segmentation datasets with precise and comprehensive label maps are orders of magnitude smaller than image classification datasets. This has been referred to as the “curse of dataset annotation”: the more detailed the semantic labeling, the smaller the datasets. Somewhat orthogonal to our work is the use of indoor scene models to train deep networks for semantic understanding of indoor environments from depth images [ 15, 33]. These approaches compose synthetic indoor scenes from object models and synthesize depth maps with associated semantic labels. The training data synthesized in these works provides depth information but no appearance cues. The trained models are thus limited to analyzing depth maps. 15 SynthCam3D previous slide 33
  29. 29. Deep Learning Problems Data columns: x, y, z, red, green, blue Pointclouds can be huge • Voxelization of the scene impossible in practice without severe downsampling / discretization • Mesh/surface reconstruction increases the data amount as well How to handle massive datasets in deep learning? Simplify (primitive-based reconstruction) before semantic segmentation? https://github.com/btgraham/SparseConvNet https://ei.is.tuebingen.mpg.de https://arxiv.org/abs/1605.06240 This can be used to analyse 3D models, or space-time paths. Here are some examples from a 3D object dataset. The insides are hollow, so the data is fairly sparse. The computational complexity of processing the models is related to the fractal dimension of the underlying objects. https://arxiv.org/abs/1503.04949 https://github.com/MPI-IS/bilateralNN doi:10.1111/j.1467-8659.2009.01645.x 1 2 3 Can't use 3D CNNs Try alternative schemes no normals
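To make the size problem concrete, the sketch below averages `x, y, z, red, green, blue` rows per voxel (a basic voxel-grid downsampling) and counts how many cells a dense grid would need for a room. The function names, the three sample rows and the 10 m × 10 m × 3 m room are assumptions chosen for illustration.

```python
from collections import defaultdict

def voxel_downsample(rows, voxel_size):
    """Average all (x, y, z, r, g, b) rows that fall into the same voxel."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(int(c // voxel_size) for c in row[:3])
        buckets[key].append(row)
    return [tuple(sum(col) / len(pts) for col in zip(*pts)) for pts in buckets.values()]

def dense_grid_voxels(extent_m, voxel_size):
    """Number of cells a dense voxel grid needs to cover a box of the given extent."""
    nx, ny, nz = (max(1, round(e / voxel_size)) for e in extent_m)
    return nx * ny * nz

rows = [
    (0.01, 0.01, 0.01, 255, 0, 0),
    (0.02, 0.03, 0.01, 0, 0, 255),
    (1.00, 1.00, 1.00, 0, 255, 0),
]
reduced = voxel_downsample(rows, voxel_size=0.05)
print(len(rows), "->", len(reduced), "points")

# A hypothetical 10 m x 10 m x 3 m room at 1 cm resolution.
room_voxels = dense_grid_voxels((10.0, 10.0, 3.0), voxel_size=0.01)
print(room_voxels, "voxels, about", room_voxels * 4 / 1e9, "GB at 4 bytes per voxel")
```

Even one float per voxel already exceeds a gigabyte for a single room, which is the motivation for sparse representations or for simplifying before segmentation.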
  30. 30. Point clouds with deep learning: example with Normals Eurographics Symposium on Geometry Processing 2016, Volume 35 (2016), Number 5 http://dx.doi.org/10.1111/cgf.12983 Convolutional neural networks Work on normal estimation with CNNs focus on using as input RGB images, or possibly RGB-D, but not sparse data such as unstructured 3D point clouds. CNN-based techniques have been applied to 3D data though, but with a voxel-based perspective, which is not accurate enough for normal estimation. Techniques to efficiently apply CNN-based methods to sparse data have been proposed too [Gra15], but they mostly focus on efficiency issues, to exploit sparsity; applications are 3D object recognition, again with voxel-based granularity, and analysis of space-time objects. An older, neuron-inspired approach [JIS03] is more relevant to normal estimation in 3D point clouds but it actually addresses the more difficult task of meshing. It uses a stochastic regularization based on neighbors, but the so-called “learning process” actually is just a local iterative optimization. CNNs can also address regression problems such as object pose estimation [PCFG12]. These same properties seem appropriate as well for the task of learning how to estimate normals, including in the presence of noise and when several normal candidates are possible near sharp features of the underlying surface. The question, however, is how to interpret the local neighborhood of a 3D point as an image-like input that can be fed to a CNN. If the point cloud is structured, as given by a depth sensor, the depth map is a natural choice as CNN input. But if the point cloud is unstructured, it is not clear what to do. In this case, we propose to associate an image-like representation to the local neighborhood of a 3D point via a Hough transform. In this image, a pixel corresponds to a normal direction, and its intensity measures the number of votes for that direction; besides, pixel adjacency relates to closeness of directions. It is a planar map of the empirical probability of the different possible directions. Then, just as a CNN for ordinary images can exploit the local correlation of pixels to denoise the underlying information, a CNN for these Hough-based direction maps might also be able to handle noise, identifying a flat peak around one direction. Similarly, just as a CNN for images can learn a robust recognizer, a CNN for direction maps might be able to make uncompromising decisions near sharp features, when different normals are candidate, opting for one specific direction rather than trading off for an average, smoothed normal. Moreover, outliers can be ignored in a simple way by limiting the size of the neighborhood, thus reducing or preventing the influence of points lying far from a more densely sampled surface. Makes computationally feasible
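The Hough-based direction map described above can be sketched as follows: random point triplets from a neighbourhood each vote for the normal of the plane they span, and the votes are binned into a small 2D image. This is a simplified illustration under stated assumptions, not the paper's implementation: the binning by the (nx, ny) components of the upward-oriented normal, the 17 × 17 image size and the 500 votes are all choices made for this example, and the accumulator peak stands in for the CNN that the paper trains on such images.

```python
import random

def triplet_normal(p, q, r):
    """Unit normal of the plane through three points, flipped so that nz >= 0."""
    ux, uy, uz = (q[i] - p[i] for i in range(3))
    vx, vy, vz = (r[i] - p[i] for i in range(3))
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    sign = -1.0 if nz < 0 else 1.0
    return (sign * nx / norm, sign * ny / norm, sign * nz / norm)

def hough_accumulator(neighbours, bins=17, votes=500, seed=0):
    """Image-like map: pixel (i, j) counts votes for normals with (nx, ny) in that bin."""
    rng = random.Random(seed)
    acc = [[0] * bins for _ in range(bins)]
    for _ in range(votes):
        n = triplet_normal(*rng.sample(neighbours, 3))
        if n is None:
            continue
        i = min(bins - 1, int((n[0] + 1) / 2 * bins))
        j = min(bins - 1, int((n[1] + 1) / 2 * bins))
        acc[i][j] += 1
    return acc

def peak_direction(acc):
    """Normal direction at the centre of the most-voted pixel."""
    bins = len(acc)
    i, j = max(((i, j) for i in range(bins) for j in range(bins)),
               key=lambda ij: acc[ij[0]][ij[1]])
    nx = (i + 0.5) / bins * 2 - 1
    ny = (j + 0.5) / bins * 2 - 1
    return nx, ny, max(0.0, 1 - nx * nx - ny * ny) ** 0.5

# Neighbourhood of a nearly flat, slightly noisy patch around z = 0.
rng = random.Random(2)
patch = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-0.01, 0.01)) for _ in range(60)]
acc = hough_accumulator(patch)
print("estimated normal:", peak_direction(acc))
```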
  31. 31. Literature Indoor point cloud segmentation with deep learning http://robotvault.bitbucket.org/scenenet-rgbd.html http://delivery.acm.org/10.1145/3020000/3014008 https://pdfs.semanticscholar.org/1ce8/1a2c8fa5731db944bfb57c9e7e8eb0fc5bd2.pdf https://arxiv.org/pdf/1612.00593v1.pdf
  32. 32. Unsupervised deep learning segmentation http://staff.ustc.edu.cn/~lgliu/ http://dx.doi.org/10.1016/j.cagd.2016.02.015 - cited by
  33. 33. Second level deep learning in Practice btgraham/SparseConvNet C++ Spatially-sparse convolutional networks. Allows processing of sparse 2, 3 and 4 dimensional data. Build CNNs on the square/cubic/hypercubic or triangular/tetrahedral/hyper-tetrahedral lattices gangiman/PySparseConvNet Python wrapper for SparseConvNet in practice http://3ddl.cs.princeton.edu/2016/slides/notchenko.pdf Update old-school machine learning approach to modern deep learning. Reconstruct the planar shapes using a database of CAD models (ModelNet)? Requires some work for sure http://staff.ustc.edu.cn/~juyong/DictionaryRecon.html MOTIVATION 3dmodel_feature Code for extracting 3dcnn features of CAD models
  34. 34. Point cloud pipeline 2nd Step, “Deeplearnify” Denoising Consolidation Upsampling Planar Segmentation Simplification 2D Unstructured 3D Rough correspondences from more established 2D Deep Learning World btgraham/SparseConvNet gangiman/PySparseConvNet Python wrapper for SparseConvNet 3dmodel_feature Code for extracting 3dcnn features of CAD models Sparse libraries only as starting points https://arxiv.org/abs/1503.04949 https://github.com/MPI-IS/bilateralNN http://arxiv.org/abs/1607.02005 Andrew Adams, Jongmin Baek, Myers Abraham Davis May 2010, http://dx.doi.org/10.1111/j.1467-8659.2009.01645.x
  35. 35. 3D SHAPE representations #1: VRN Ensemble ModelNet40: 95.54% Accuracy – The STATE-OF-THE-ART! For this work, we select the Variational Autoencoder (VAE), a probabilistic framework that learns both an inference network to map from an input space to a set of descriptive latent variables, and a generative network that maps from the latent space back to the input space. Our model, implemented in Theano with Lasagne, comprises an encoder network, the latent layer, and a decoder network, as displayed in Figure 1. https://arxiv.org/abs/1608.04236 https://github.com/ajbrock/Generative-and-Discriminative-Voxel-Modeling
  36. 36. 3D SHAPE representations #2: Probing filters https://arxiv.org/abs/1605.06240 https://github.com/yangyanli/FPNN Created by Yangyan Li, Soeren Pirk, Hao Su, Charles Ruizhongtai Qi, and Leonidas J. Guibas from Stanford University. Building discriminative representations for 3D data has been an important task in computer graphics and computer vision research. Unfortunately, the computational complexity of 3D CNNs grows cubically with respect to voxel resolution. Moreover, since most 3D geometry representations are boundary based, occupied regions do not increase proportionately with the size of the discretization, resulting in wasted computation. In this work, we represent 3D spaces as volumetric fields, and propose a novel design that employs field probing filters to efficiently extract features from them. Our learning algorithm optimizes not only the weights associated with the probing points, but also their locations, which deforms the shape of the probing filters and adaptively distributes them in 3D space. The optimized probing points sense the 3D space “intelligently”, rather than operating blindly over the entire domain. We show that field probing is significantly more efficient than 3D CNNs, while providing state-of-the-art performance, on classification tasks for 3D object recognition benchmark datasets
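The claim that occupied regions do not grow in proportion to the discretization can be checked on a toy boundary-based shape. The sketch below voxelises the surface of a sphere at three resolutions and reports the occupied fraction; the sphere and the one-voxel shell thickness are assumptions chosen only to illustrate the trend.

```python
def sphere_shell_occupancy(resolution):
    """Voxelise a sphere surface inside a resolution^3 grid; return (occupied, total) voxel counts."""
    centre = (resolution - 1) / 2
    radius = resolution / 2 - 1
    occupied = 0
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                d = ((i - centre) ** 2 + (j - centre) ** 2 + (k - centre) ** 2) ** 0.5
                if abs(d - radius) <= 0.5:  # voxel centre lies within half a voxel of the surface
                    occupied += 1
    return occupied, resolution ** 3

results = {res: sphere_shell_occupancy(res) for res in (16, 32, 64)}
for res, (occupied, total) in results.items():
    print(f"{res}^3 grid: {occupied} of {total} voxels occupied ({occupied / total:.1%})")
```

The total voxel count grows eightfold per doubling of resolution while the occupied count grows roughly fourfold, so a dense 3D CNN spends an increasing share of its work on empty space.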
  37. 37. Point cloud pipeline in practice #1 Serial pipeline Simplify Planar Segmentation Object detection Rares Ambrus Robotics Perception and Learning (RPL) KTH,
  38. 38. Point cloud pipeline in practice #2 Joint pipeline http://dx.doi.org/10.1016/j.neucom.2015.08.127 http://ai.stanford.edu/~quocle/tutorial2.pdf
  39. 39. APPENDIX Literature Extra on ‘Old school’ methods
  40. 40. Point Cloud Processing #1 http://www.thomaswhelan.ie/Whelan14ras.pdf | http://dx.doi.org/10.1016/j.robot.2014.08.019 http://www.cs.nuim.ie/research/vision/data/ras2014 "Incremental and Batch Planar Simplification of Dense Point Cloud Maps" by T. Whelan, L. Ma, E. Bondarev, P. H. N. de With, and J.B. McDonald in Robotics and Autonomous Systems ECMR ’13 Special Issue, 2014. https://www.youtube.com/watch?v=uF-I-xF3Rk0 3DReshaper® is a tool to process 3D point clouds wherever they come from: 3D scanners, laser scanning, UAVs, or any other digitization device... Whatever your point cloud processing challenges are 3DReshaper has the tools you need. You can import one or several point clouds whatever their origin and size. Point cloud preparation is often the most important step to handle in order to save time with the subsequent steps (i.e. meshing). That is why 3DReshaper provides a complete range of simple but powerful functions to process point clouds like: • Import without real limit of the imported number of points • Clever reduction to keep best points and remove points only where density is the highest • Automatic segmentation • Automatic or manual separation and cleaning • Extraction of best points evenly spaced, density homogenization • Automatic noise measurement reduction • Colors according to a given direction • Fusion • Registration, Alignment and Best Fit • 3D comparison with a mesh or a CAD model • Planar sections • Best geometrical shapes extraction (planes, cylinders, circles, spheres, etc.) • Several representation modes: textured, shaded, intensity (information depending on the imported data) 3dreshaper.com
  41. 41. Point Cloud Processing #2 Notice that in Fig.5a, the baseline data structure uses nearly 300MB of RAM whereas the spatial hashing data structure never allocates more than 47MB of RAM for the entire scene, which is a 15 meter long hallway. Memory usage statistics (Fig. 5b) reveal that when all of the depth data is used (including very far away data from the surrounding walls), a baseline fixed grid data structure (FG) would use nearly 2GB of memory at a 2cm resolution, whereas spatial hashing with 16 × 16 × 16 chunks uses only around 700MB. When the depth frustum is cut off at 2 meters (mapping only the desk structure without the surrounding room), spatial hashing uses only 50MB of memory, whereas the baseline data structure would use nearly 300MB. We also found that running marching cubes on a fixed grid rather than incrementally on spatially-hashed chunks to be prohibitively slow robotics.ccny.cuny.edu
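The memory argument for spatial hashing can be reproduced in miniature. The sketch below allocates a 16 × 16 × 16 chunk only when a point first falls into it and compares the allocated voxel count with a fixed grid over the same volume. The class name `ChunkedVoxelMap`, the one-byte occupancy flag and the synthetic hallway (floor plus one wall, 15 m × 2 m × 3 m) are assumptions for this example; the quoted paper stores richer per-voxel data.

```python
import math

CHUNK = 16  # voxels per chunk edge, matching the 16 x 16 x 16 chunks mentioned above

class ChunkedVoxelMap:
    """Occupancy map that allocates a dense chunk only when a point first lands in it."""

    def __init__(self, voxel_size):
        self.voxel_size = voxel_size
        self.chunks = {}  # (cx, cy, cz) -> bytearray of CHUNK^3 occupancy flags

    def _locate(self, x, y, z):
        vx, vy, vz = (math.floor(c / self.voxel_size) for c in (x, y, z))
        key = (vx // CHUNK, vy // CHUNK, vz // CHUNK)
        offset = (vx % CHUNK) + CHUNK * ((vy % CHUNK) + CHUNK * (vz % CHUNK))
        return key, offset

    def insert(self, x, y, z):
        key, offset = self._locate(x, y, z)
        chunk = self.chunks.get(key)
        if chunk is None:
            chunk = self.chunks[key] = bytearray(CHUNK ** 3)
        chunk[offset] = 1

    def occupied(self, x, y, z):
        key, offset = self._locate(x, y, z)
        chunk = self.chunks.get(key)
        return chunk is not None and chunk[offset] == 1

    def allocated_voxels(self):
        return len(self.chunks) * CHUNK ** 3

# Hypothetical 15 m hallway sampled every 5 cm: a floor (z = 0) and one wall (y = 0).
hallway = ChunkedVoxelMap(voxel_size=0.02)
for i in range(300):
    for j in range(40):
        hallway.insert(i * 0.05, j * 0.05, 0.0)
    for k in range(60):
        hallway.insert(i * 0.05, 0.0, k * 0.05)

fixed_grid_voxels = 750 * 100 * 150  # 15 m x 2 m x 3 m at 2 cm resolution
print("chunked:", hallway.allocated_voxels(), "fixed grid:", fixed_grid_voxels)
```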
  42. 42. Point cloud Segmentation Introduction http://dx.doi.org/10.1016/j.cag.2015.11.003 There are three kinds of methods for point cloud segmentation [14]. The first type is based on primitive fitting [3], [15] and [5]. It is hard for these methods to deal with objects with complex shape. The second kind of techniques is the region growing method. Nan et al. [2] propose a controlled region growing process which searches for meaningful objects in the scene by accumulating surface patches with high classification likelihood. Berner et al. [16] detect symmetric regions using region growing. Another line of methods formulates the point cloud segmentation as a Markov Random Field (MRF) or Conditional Random Field (CRF) problem [4], [17] and [14]. A representative random field segmentation method is the min-cut algorithm [17]. The method extracts foreground from background through building a KNN graph over which min-cut is performed. The shortcoming of min-cut algorithm is that the selection of seed points relies on human interaction. We extend the min-cut algorithm by first generating a set of object hypotheses via multiple binary min-cuts and then selecting the most probable ones based on a voting scheme, thus avoiding the seed selection. Plane extraction from the point cloud of a tabletop scene by using our method (a) and RANSAC based primitive fitting (b), respectively. While our method can segment out the supporting plane accurately, RANSAC missed some points due to the thin objects. An overview of our algorithm. We first over-segment the scene and extract the supporting plane on the patch graph, then segment the scene into segments and represent the whole scene using a segment graph (a). To obtain the contextual information, we train a set of classifiers for both single objects and object groups using multiple kernel learning (b). The classifiers are used to group the segments into objects or object groups (c).
  43. 43. Point cloud Segmentation Example #1 http://dx.doi.org/10.1016/j.neucom.2015.12.101
  44. 44. Point cloud Segmentation Example #2 http://dx.doi.org/10.1109/TGRS.2016.2551546 Principal component analysis (PCA)-based local saliency features, e.g., normal and curvature, have been frequently used in many ways for point cloud segmentation. However, PCA is sensitive to outliers; saliency features from PCA are non-robust and inaccurate in the presence of outliers; consequently, segmentation results can be erroneous and unreliable. As a remedy, robust techniques, e.g., RANdom SAmple Consensus (RANSAC), and/or robust versions of PCA (RPCA) have been proposed. However RANSAC is influenced by the well-known swamping effect, and RPCA methods are computationally intensive for point cloud processing. We propose a region growing based robust segmentation algorithm that uses a recently introduced maximum consistency with minimum distance based robust diagnostic PCA (RDPCA) approach to get robust saliency features. Many methods have been developed to improve the quality of segmentation in PCD that can be grouped into three main categories: 1) edge/border based; 2) region growing based; and 3) hybrid. In edge/border based methods, points on edges/borders are detected, a border linkage process constructs the continuous edge/border, and then points are grouped within the identified boundaries and connected edges. Castillo et al. [14] stated that, due to noise or uneven point distributions, such methods often detect disconnected edges, which make it difficult for a filling or an interpretation procedure to identify closed segments.
  45. 45. Point cloud Segmentation Example #3 http://dx.doi.org/10.3390/rs5020491 In the future, we will validate the method further with a large number of trunk and branch measurements from real trees. We will also develop the method further, e.g., by utilizing generalized cylinder shapes. Together with the computational method presented, laser scanning provides a fast and efficient means to collect essential geometric and topological data from trees, thereby substantially increasing the available data from trees. https://www.youtube.com/watch?v=PKHJQeXJEkU
  46. 46. Point cloud Segmentation Example #4 http://dx.doi.org/10.1016/j.isprsjprs.2015.01.016 http://dx.doi.org/10.1016/j.cag.2016.01.004
  47. 47. Cad Primary fitting traditional methods #1 http://dx.doi.org/10.1111/j.1467-8659.2007.01016.x; Cited by 680 http://dx.doi.org/10.1111/j.1467-8659.2009.01389.x; Cited by 63 [SDK09][SWK07] http://dx.doi.org/10.1111/cgf.12802
  48. 48. Cad Primary fitting traditional methods #2 http://dx.doi.org/10.2312/egsh.20151001 http://dx.doi.org/10.1016/j.cag.2014.07.005 The main phases of our algorithm: from the input model (a) we robustly extract candidate walls (b). These are used to construct a cell complex in the 2D floor plane. From this we obtain a partitioning into individual rooms (c) and finally the individual room polyhedra (d). Note that in (a) the ceiling has been removed for the sake of visual clarity.
  49. 49. Cad Primary fitting traditional methods #3 3D Architectural Modeling: Coarse-to-fine model fitting on point cloud Reema Bajwa, Syed Rizwan Gilani, Murtaza Taj Proceeding CGI '16 Proceedings of the 33rd Computer Graphics International. Pages 65-68. ACM New York, NY, USA ©2016 doi>10.1145/2949035.2949052 http://hdl.handle.net/11250/2402578 Existing work in architectural modeling using point cloud tends to perform shape fitting on 3D data directly. We propose three fundamental projections of a point cloud that simplify the shape fitting process. We present several parametric objects to represent architectural elements, however contrary to all of the parameters are found automatically using the proposed projections. This results in an automatic framework for coarse-to-fine primitive modeling of architecture that improves the results considerably over the existing approaches.
  50. 50. Cad Primary fitting traditional methods #4 http://dx.doi.org/10.1080/17538947.2016.1143982 http://dx.doi.org/10.1007/s41095-016-0041-9 An overview of the proposed approach. Starting from an imperfect point cloud (a) of a building, we first extract and refine planar segments (b) from the point cloud, and build a dense mesh model using existing techniques. Then, we use the extracted planar segments to partition the space of the input point cloud into axis-aligned cells (i.e. candidate boxes). (d) shows the overlay of the candidate boxes on the dense mesh model. After that, appropriate boxes (e) are selected based on binary linear programming optimization. Finally, a lightweight 3D model (f) is assembled from the chosen boxes. We conclude by observing that state of the art methods for quadric fitting give reasonable results on noisy point clouds. Our algorithm provides a means to enforce a prior, allowing the algorithm to better fit the quadric that the points were drawn from, particularly when there is missing data or a large amount of noise. This has a practical use for real datasets, since it allows a user to specify the curvature of the surface where there are few points available.
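As a baseline for the quadric fitting discussed in the conclusion quoted on this slide, a plain algebraic least-squares fit can be sketched as below. It does not implement the prior described in the cited work; it only shows the unconstrained fit that such a prior would regularise. The function names `fit_quadric` and `sample_sphere` and the monomial ordering are assumptions for illustration.

```python
import numpy as np

def fit_quadric(points):
    """Algebraic least-squares quadric fit.

    Returns a unit vector q of 10 coefficients for the monomials
    [x^2, y^2, z^2, xy, xz, yz, x, y, z, 1] that minimises ||D q||.
    """
    x, y, z = points.T
    D = np.column_stack([x * x, y * y, z * z, x * y, x * z, y * z,
                         x, y, z, np.ones_like(x)])
    # The minimiser is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(D, full_matrices=False)
    return vt[-1]

def sample_sphere(n=200, seed=0):
    """Hypothetical test data: n noise-free points on the unit sphere."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

q = fit_quadric(sample_sphere())
q = q / q[0]  # remove the arbitrary scale and sign of the solution
```

For noise-free sphere samples the normalised coefficients should come out close to x^2 + y^2 + z^2 - 1 = 0. With sparse or noisy data this unconstrained fit has nothing to keep it near a plausible surface, which is where a prior of the kind the slide describes would help.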
  51. 51. Cad Primary fitting traditional methods #5 http://dx.doi.org/10.1109/TVCG.2015.2461163 http://dx.doi.org/10.1016/j.cviu.2016.06.004
  52. 52. Cad Primary fitting traditional methods #6 URN: urn:nbn:se:kth:diva-173832 http://dx.doi.org/10.1111/cgf.12720
  53. 53. PCL Point Cloud Library http://pointclouds.org/ What is it? The Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing. PCL is released under the terms of the BSD license, and thus free for commercial and research use. We are financially supported by a consortium of commercial companies, with our own non-profit organization, Open Perception. We would also like to thank individual donors and contributors that have been helping the project. Presentations The following list of presentations describes certain aspects or modules in PCL, and was assembled by former research interns at Willow Garage. • Ryohei Ueda's presentation on Tracking 3D objects with Point Cloud Library (more details) • Jochen Sprickerhof's presentation on Large Scale 3D Point Cloud Mapping in PCL (more details) • Aitor Aldoma's presentation on Clustered Viewpoint Feature Histograms (CVFH) for object recognition (more details) • Julius Kammerl's presentation on Point Cloud Compression (more details) • Dirk Holz's presentation on the PCL registration framework (pre 1.0) (more details) • Rosie Li's presentation on surface reconstruction (more details) • Kai Wurm's presentation on 3D mapping with octrees (more details)
  54. 54. Point cloud Surface Reconstruction→ http://dx.doi.org/10.1111/cgf.12802
  55. 55. Mesh Simplification Huang et al. 2013 http://web.siat.ac.cn/~huihuang/EAR/EAR_page.html CGAL, Point Set Processing http://doc.cgal.org/latest/Point_set_processing_3/ Wei et al. (2015) Borouchaki and Frey (2005) QSlim Simplification Software, http://www.cs.cmu.edu/~./garland/quadrics/qslim.html Monette-Theriault (2014): “The Matlab wrapper of QSlim is adapted from [19] and the same platform is ... Using the initial mesh as a reference”
  56. 56. Mesh “re-Reconstruction” #2 This article presents a novel approach for constructing manifolds over meshes. The local geometry is represented by a sparse representation of a dictionary consisting of redundant atom functions. A compatible sparse representation optimization is proposed to guarantee the global compatibility of the manifold. Future work. The Sparse-Land model generalized on the manifold is fascinating because of its universality and flexibility, which make many of the geometric processing tasks clear and simple, and the superior performance to which it leads in various applications. As with sparse coding in image processing, we can also apply our framework of compatible sparse representations to various tasks in geometric processing, e.g., reconstruction, inpainting, denoising, and compression. We believe that some of the extensions are feasible but not straightforward. http://dx.doi.org/10.1111/cgf.12821 | https://www.youtube.com/watch?v=jhgjiQoQxa0 First published: 27 May 2016: www.researchgate.net
  57. 57. Mesh “re-Reconstruction” #4: Sparsity http://dx.doi.org/10.1016/j.cad.2016.05.013 https://arxiv.org/abs/1411.3230 http://arxiv.org/abs/1505.02890 Data is sparse if most sites take the value zero. For example, if a loop of string has a knot in it, and you trace the shape of the string in a 3D lattice, most sites will not form part of the knot (left). Applying a 2x2x2 convolution (middle), and a pooling operation (right), the set of non-zero sites stays fairly small: http://dx.doi.org/10.1016/j.conb.2004.07.007 http://dx.doi.org/10.1016/j.tins.2015.05.005
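The sparsity argument on this slide can be checked numerically by tracking which lattice sites are active (non-zero) through a 2x2x2 convolution and a 2x2x2 pooling step. The sketch below tracks only the active-site masks, not the feature values, and uses a simple closed loop instead of a true knot. The function names, the 33^3 lattice size, and the curve parameters are assumptions for illustration.

```python
import numpy as np

def trace_loop(n=33, samples=2000):
    """Mark the lattice sites visited by a closed 3D curve (hypothetical shape)."""
    t = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    c = (n - 1) / 2.0
    x = c + 0.4 * n * np.cos(t)
    y = c + 0.4 * n * np.sin(t)
    z = c + 0.2 * n * np.sin(3.0 * t)
    grid = np.zeros((n, n, n), dtype=bool)
    grid[np.rint(x).astype(int), np.rint(y).astype(int), np.rint(z).astype(int)] = True
    return grid

def active_after_conv(active, size=2):
    """Stride-1 'valid' convolution: an output site is active if any input
    site in its size^3 receptive field is active."""
    nx, ny, nz = (s - size + 1 for s in active.shape)
    out = np.zeros((nx, ny, nz), dtype=bool)
    for dx in range(size):
        for dy in range(size):
            for dz in range(size):
                out |= active[dx:dx + nx, dy:dy + ny, dz:dz + nz]
    return out

def active_after_pool(active, size=2):
    """Non-overlapping pooling: a pooled site is active if any site in its block is."""
    nx, ny, nz = (s // size for s in active.shape)
    trimmed = active[:nx * size, :ny * size, :nz * size]
    blocks = trimmed.reshape(nx, size, ny, size, nz, size)
    return blocks.any(axis=(1, 3, 5))

grid = trace_loop()
conv = active_after_conv(grid)
pool = active_after_pool(conv)
for name, mask in [("input", grid), ("conv", conv), ("pool", pool)]:
    print(f"{name}: {int(mask.sum())} active of {mask.size} sites")
```

The convolution can only spread each active site into its immediate neighbourhood, so the active set grows by a bounded factor, and pooling then shrinks the lattice itself. This is why the non-zero set stays small relative to the full grid.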
  58. 58. CGAL Computational Geometry Algorithms Library CGAL is a software project that provides easy access to efficient and reliable geometric algorithms in the form of a C++ library. CGAL is used in various areas needing geometric computation, such as geographic information systems, computer aided design, molecular biology, medical imaging, computer graphics, and robotics. The library offers data structures and algorithms like triangulations, Voronoi diagrams, Boolean operations on polygons and polyhedra, point set processing, arrangements of curves, surface and volume mesh generation, geometry processing, alpha shapes, convex hull algorithms, shape analysis, AABB and KD trees... Learn more about CGAL by browsing through the Package Overview. http://www.cgal.org/
