Automated Image and Point Cloud Interpretation Techniques and Applications
Translating Human Visual Interpretations into Algorithms in Spectral, Morphological, and Contextual Feature Extraction
Mike Bularz
Prof. Robert J. Hasenstab
GEOG – Independent Study
CONTEXT – POINT CLOUDS AND BIG DATA TRENDS

The Role of Large Datasets in Information Technology

The last few years in the world of information technology have seen an explosion in the amount of data being collected about the world around us, with limited realization of its full potential. One manifestation of the demand for processing large datasets lies in the world of remote sensing, image and point cloud interpretation, and computer vision. This is particularly true in industries and sectors interested in monitoring and modeling precise aspects of the natural environment, which require very particular processes to make sense of complex sets of sensed information. The focus of this document is point cloud processing, particularly of LiDAR (Light Detection and Ranging) data, and the fusion of these measurements with imagery. The goal is to examine a few types of processing algorithm in the abstract and to discuss potential applications of these processes for feature extraction in a geospatially enabled environment.

Point Clouds and LiDAR Processing Demand

Government, intelligence, natural resources (extraction or management), marketing, and consumer electronics organizations are seeking to exploit large datasets such as point clouds, and the computer algorithms that process them, to make technology more interactive or information more exploitable. Government agencies are increasingly seeking out finer-scale data about the built environment and ways to process this data into actionable information products. For example, LiDAR is being employed in floodplain mapping, building footprint delineation, and bridge monitoring, as well as in more abstract interpretations such as determining rooftop areas suitable for solar panel installation, modeling noise in urban environments, and security applications such as crowd evacuations or line-of-sight calculations.
Google’s autonomous vehicles employ point cloud processing of live laser range-finding with LiDAR technology,[4] and the company has begun road testing vehicles following the passage of enabling legislation in California.[5] These vehicles are one example of the private sector pushing for rapid, automated processing of large point clouds. The robotics community has been working out a standard for point cloud processing to enable robots to pilot themselves through their environments. Another potential private-sector application of point cloud processing is drones, which are expected to populate our airspace soon under new FAA (Federal Aviation Administration) regulations and guidelines.[6]

HUMAN INTERPRETATION TO PROCESSING AND SEGMENTATION

LiDAR Processing – Popular Approaches

The explosion of LiDAR data, particularly in the geospatial community in the last 5 to 7 years, has prompted a search for ways to streamline feature extraction while maintaining accuracy. A few common methodologies for feature extraction from LiDAR have emerged from these efforts, and they are the underlying processes in popular tools and software extensions that claim to automate or semi-automate feature extraction.

It is worth noting that LiDAR data in the geospatial industry and related professions is currently categorized, in terms of processing approach and application, by its collection method: airborne or terrestrial.[7] Airborne LiDAR can be points sensed from overhead in a fixed-wing aircraft flyover, which covers larger areas, or from closer range in helicopter flyovers, which yield much higher point resolutions. Terrestrial LiDAR can be deployed from a fixed position on a tripod system, or with the sensors affixed to vehicles, boats, and remote-controlled sensing robots. The choice of collection method depends on the precision the application requires, and each method has a different common processing approach.

Figure: DSM (Digital Surface Model) and nDSM (Normalized Digital Surface Model).

Height-Based Segmentation

The most common approach, and the most easily implemented, is height-based segmentation of features. It is most commonly applied to data collected from fixed-wing and helicopter flyovers, but can be applied to terrestrial point clouds as well. Height-based segmentation relies on point clouds that the vendor has pre-classified into, at minimum, bare earth and the structures above it. The classification is based on surface models interpolated from the points. The bare-earth surface model, which is the elevation of the ground without any structures or vegetation, is commonly referred to as a DEM or DTM (Digital Elevation Model / Digital Terrain Model). The surface built from all collected points is commonly referred to as a DSM (Digital Surface Model) and incorporates the bare earth, vegetation, structures, and other captured features.

The processing approach itself is simple: subtract the bare-earth surface (DEM) from the DSM to obtain a normalized DSM (nDSM). This produces a “heat map” of structures above the ground, with the values representing height above ground. It is useful for classifying trees, buildings, power lines, and other infrastructure, but on its own it does not separate the different feature types. The process is essentially fully automated.

Shape-Fitting

Fitting shapes to a point cloud involves one of two things: determining how “rectangular” a TIN (Triangulated Irregular Network, a triangulated surface built from points) is, in order to segment manmade structures from vegetation, or fitting lines to determine roads and the edges of manmade structures. A third, related variation is a semi-automated process of drawing lines by hand into point clouds using snapping or fitting user interfaces.[8][9]

Both automated processes rely on morphological interpretation of the shapes of interpolated surfaces, which is explained further, in the context of many processing approaches, in the following section.

Spectral Fusion / Intensity and RGB Values

More ambitious classifications use the recorded “intensity” of the reflected laser beam (if the vendor has included it in the attributes), RGB values collected by the vendor, or imagery bands burned into the points, or, vice versa, intensity burned into an image band. These are spectral classifications, as they rely on the spectral characteristics of a remotely sensed target, captured either by image collection (passive remote sensing) or by point intensity (active sensing). A common application is in biology and forestry, where spectral information is useful for discerning various tree and shrub species.
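As an illustration of the height-based segmentation described earlier in this section, the following is a minimal sketch of the DSM minus DEM subtraction. It assumes both surfaces have already been interpolated onto the same grid and are held as plain nested lists; the function names, the clamping of negative differences to zero, and the 2 m height threshold are illustrative assumptions, not part of any particular software package.

```python
def compute_ndsm(dsm, dem):
    """Subtract the bare-earth grid (DEM) from the surface grid (DSM), cell by cell."""
    if len(dsm) != len(dem) or any(len(a) != len(b) for a, b in zip(dsm, dem)):
        raise ValueError("DSM and DEM must share the same grid shape")
    # Assumption: small negative differences are interpolation noise, so clamp to zero.
    return [[max(s - g, 0.0) for s, g in zip(srow, grow)]
            for srow, grow in zip(dsm, dem)]


def above_ground_mask(ndsm, min_height=2.0):
    """Flag cells at least min_height above ground (hypothetical threshold, in metres)."""
    return [[h >= min_height for h in row] for row in ndsm]


# Toy 3x3 grids in metres: near-flat ground with one raised object.
dem = [[100.0, 100.0, 100.0],
       [100.0, 100.5, 100.0],
       [100.0, 100.0, 100.0]]
dsm = [[100.0, 100.0, 100.0],
       [100.0, 106.5, 103.0],
       [100.0, 100.0, 99.9]]

ndsm = compute_ndsm(dsm, dem)
mask = above_ground_mask(ndsm)
for row in ndsm:
    print(row)
```

The mask separates "something above ground" from bare earth, but, as noted above, it does not by itself say whether a raised cell is a tree, a building, or a power line.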
Point Cloud and Image Processing in a “Computer Vision” Environment

Recently, much more advanced and effective ways to process point cloud information and imagery have been developed. Some of the methods stem from LiDAR processing software, while others are being transplanted from the general image processing community. The goal of these software tools and packages is to provide a means of translating “human vision”, or how we segment the parts of an image or point cloud that we perceive, into computer algorithms that replicate the segmentation power of the human mind. There are several general categories of visual cues through which we perceive our environment.

Image-Based Visual Clues

The first type of visual clue comes from direct value statements about objects: color / tone, morphology (shape, height, size), and texture (roughness / smoothness). The second type is essentially auxiliary: shadows, context (for example, these brown objects are house roofs because they are near a road; or, this green blob is an island because it is surrounded by water, whereas the green around the lake is regular land), and pattern (there is a cluster of houses and roads here, therefore this is a residential area, and the largest structure is probably a school).

Vector and Second-Order Visual Clues

The second category of visual interpretation is based on defined shapes. Defined shapes are objects we have recognized as true, such as vectors delineating what may be houses, or our mental outline of the various perceived features. Theoretically, vector-based visual clues are typically second-order clues in interpretation, because we first define an object from first-order clues of morphology, color, pattern, texture, or context, and then analyze the properties of the vector or defined object. Interpretations of vectors can range from analyzing (mentally or in software) the number of angles and sides, general shape, area, jaggedness and roughness of edges versus smoothness, orientation, and symmetry, to contextual analyses based on other vectors.

ALGORITHMS, MANIFESTATION AND AVAILABILITY IN PROCESSING AND SOFTWARE PACKAGES
Image Based Algorithms

Translating visual cues from image-based perception relies on manipulating data in a format that represents its color, texture, brightness, and so on. This can be done with vectors that carry the average values of these properties over an area, but the primary method for interpreting these characteristics is raster, or gridded-image, based.

Spectral

Many of the concepts in image-based interpretation stem from classification methods used on remotely sensed satellite images. They range from unsupervised or rule-based classifications (for example, specifying a spectral rule that near-infrared and overall brightness are lower in water, and classifying all pixels that meet this requirement as water) to supervised classifications, where samples of features such as grass, trees, roads, and water bodies are traced as spectral samples throughout an image, and a statistical grouping based on the modeled samples is performed.

Slope

Some image-based processing techniques are particular to point clouds, and these are all based on the characteristics of interpolated elevation models. One example is grouping pixels by slope: calculating the slope of each pixel in an elevation grid interpolated from the point cloud, and selecting the highest slopes. This translates how human visual interpretation segments buildings and structures out of an image: by determining where the slope is highest, we define the rough location of walls and tall trees.

Figure: Slope (Perspective); Slope (Aerial).
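The slope grouping described above can be sketched as follows. This is a minimal example that assumes the elevation grid is a nested list with square cells and uses simple finite differences (central in the interior, one-sided at the edges); production GIS tools typically use a 3x3 kernel instead, and the function name and the 45 degree cutoff are hypothetical choices for illustration.

```python
import math


def slope_degrees(grid, cell_size=1.0):
    """Per-cell slope in degrees for a 2-D elevation grid (list of equal-length rows)."""
    rows, cols = len(grid), len(grid[0])
    if rows < 2 or cols < 2:
        raise ValueError("grid must be at least 2x2")
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Clamp neighbour indices so edge cells use a one-sided difference.
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dz_dx = (grid[r][c1] - grid[r][c0]) / ((c1 - c0) * cell_size)
            dz_dy = (grid[r1][c] - grid[r0][c]) / ((r1 - r0) * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


# Toy grid: flat ground on the left, a 5 m high flat roof on the right.
grid = [[0.0, 0.0, 5.0, 5.0],
        [0.0, 0.0, 5.0, 5.0],
        [0.0, 0.0, 5.0, 5.0]]

slopes = slope_degrees(grid)
# Hypothetical cutoff: keep only cells steeper than 45 degrees.
steep = [[s > 45.0 for s in row] for row in slopes]
```

In this toy grid only the two columns straddling the height jump are flagged as steep, which is the "rough location of walls" idea: the flat ground and the flat roof both have zero slope, and the wall shows up as the band between them.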
Curvature

A further classification method is based on texture, often referred to as the “curvature” of the elevation model. By calculating the change in values between nDSM pixels to assess how “curvy” or “bumpy” a surface is, we can separate certain features much more easily: trees and shrubbery have much higher curvature than manmade structures such as houses.

Figure: Curvature (Perspective).

Aspect

Aspect, the compass direction that a pixel's slope faces, is similar to slope but can be used to segment features by general orientation. For instance, house walls are typically oriented towards roads, and combining this contextual information can help place seeds (house walls facing roads) from which to grow features (houses).

Vector Based Algorithms

By deriving vectors that represent either features or the rough area of features using image segmentation algorithms, we can further attempt to locate certain features by definite characteristics such as feature size (area), shape (rectangular, smooth, bumpy, long, orientation, perimeter length), and context (distance to other features, density, etc.).

Size, Perimeter

When looking at images, we segment objects by size as well: homes will be larger than trees, shopping centers larger than homes, and forest stands larger than individual patches of foliage and tree plantings. Deriving vectors from images and segmenting them by size (shape area) allows us to discern further between these features. Perimeter length plays a role too: although a house and a large tree may have about the same area in an aerial-derived vector, the tree will be more jagged and curvy, with a more complex shape and a longer perimeter. Vector parameters can be derived from two-dimensional (X, Y) or three-dimensional (X, Y, Z) calculations; for example, the three-dimensional counterpart of size is volume, which takes into account the height of derived features.

Shape – Characteristics

To continue dissecting the basic geometric characteristics of derived vectors, it is also possible to calculate the angles, or the sum of angles, in a vector to classify its shape. The number of sides may be a clue to classification as well, based on simple geometric definitions. A tree vector will be jagged, with many sides and many acute angles, and its outline will turn more sharply on average than that of a less jagged shape. Manmade structures will typically be less complex, with long sides on buildings and few acute angles. Using calculations that highlight continuous lines such as building edges can help to refine the vector during classification, or to outline features such as roads or power lines.

Shape, Type

A further elaboration on these concepts lies in shape-fitting algorithms, which try to fit shapes to point clouds iteratively. Shape-fitting of rectangles is a technique being employed by researchers attempting to classify point clouds and images into manmade structures such as buildings.[10] Shape fitting can be done in two-dimensional (X, Y) or three-dimensional (X, Y, Z) space.
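The size, perimeter, and shape clues discussed in this section can be sketched with a few lines of geometry. The example below is a minimal illustration, assuming each derived vector is a simple polygon given as an ordered list of (x, y) vertices. The compactness ratio 4πA/P² is an assumption introduced here as one common way to express "jaggedness" in a single number; the text itself only names area, perimeter, and number of sides. The function name and the toy shapes are hypothetical.

```python
import math


def polygon_metrics(vertices):
    """Area, perimeter, side count, and compactness of a simple polygon.

    vertices: ordered list of (x, y) tuples, first point not repeated at the end.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0           # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    if perimeter == 0.0:
        raise ValueError("degenerate polygon with zero perimeter")
    area = abs(twice_area) / 2.0
    # Equals 1.0 for a circle and drops for elongated or jagged outlines.
    compactness = 4.0 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "sides": n,
            "compactness": compactness}


# A 10 m x 10 m "house" footprint.
house = polygon_metrics([(0, 0), (10, 0), (10, 10), (0, 10)])

# A jagged "tree crown": 16 vertices alternating between radius 8 and radius 3.
crown = [((8 if k % 2 == 0 else 3) * math.cos(k * math.pi / 8),
          (8 if k % 2 == 0 else 3) * math.sin(k * math.pi / 8))
         for k in range(16)]
tree = polygon_metrics(crown)

print(house)
print(tree)
```

Here the tree outline has a smaller area than the house but more than twice its perimeter, so its compactness is far lower. A classification rule could combine an area range with a compactness or side-count threshold, in the spirit of the size and perimeter filtering described above.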
Contextual and Growing Algorithms

Contextual Segmentation

A combination of characteristics derived from vectors and images can be used to translate common human-interpreted patterns: “Homes are next to roads”, “Commercial buildings are at major intersections”, “Islands are surrounded by water”, “Urban features cluster around major highways and roads”, “This tree species grows in dense patches, while this one grows in sparse distributions”, “These types of flora grow within a certain distance of these plotted bird sightings (bringing in more data)”. Contextual analysis can also be applied to search for non-static environmental objects using basic buffer and spatial analysis (“Military encampments will be within this distance of their origin in this photo taken X days ago”), but these calculations stray into the field of time-sensitive data and predictive spatial analysis rather than mapping of the physical environment.

Growing Algorithms

Common image-processing algorithms termed “region growing” algorithms “grow”, or expand, into the pixels around a designated “seed” starting point based on the characteristics of those pixels, which helps to determine the boundaries of features as well. This process is less an imitation of human vision than a way of making boundaries definitive for computers. Determining the locations of the centers of homes in a residential area and region growing to the edge of the roofline using high-resolution imagery and LiDAR heights can delineate homes very well. The seed information can be contextual as well: region growing from a known fire-starting point into a forest patch can delineate burn paths.

WORKFLOWS AND APPLICATIONS CASES – DEPENDENT ON OUTPUT NEED

The workflow, or sequence of algorithms (and likely software packages), you use will vary based on the type and precision of classification you are seeking.
The simplest scenario: you are attempting just to classify ground points in your cloud to create a surface model for flood and stormwater mapping purposes. All you will need is to calculate height extremities to single out the features, although it is likely the data vendor has already done this for you. More often than not, this is not the case, which has led you to explore the world of point cloud and image processing.

Examples:

Workflow for Classifying Buildings and Vegetation in a Suburban Residential Area

Classifying buildings is a relatively challenging but feasible process whose difficulty depends on the degree of accuracy necessary. If your goal is an estimate of the number of buildings, simple building locations are easy to determine. If your project is trying to model buildings and requires definitions of footprints and roof overhangs, among other building components such as facades, then you will spend much more time fitting shapes and running other complex processes that have not yet been perfected for this purpose.

nDSM Calculation

The focus of this approach is classifying buildings in a non-dense suburban area where the building outlines are clearly visible in the data. Basic techniques, such as obtaining an nDSM by subtracting the DEM from the DSM, are used to outline all structures. The next steps deal with shape-based classification, first of raster surfaces and derivatives representing the interpolated nDSM, and then with vector-based fine-tuning of the classification.

Curvatures and Slope

First, vegetation is highlighted using the curvature approach. The result is firmed up by calculating slopes and highlighting areas that have both a high slope and a high curvature. Second, the vegetation curvature derivatives can be buffered or region grown to produce a surface to clip out of the nDSM, leaving behind the other structures, which will be mostly buildings but may include other structures such as water towers or power lines.

Vector Calculations: Shape Area and Perimeter Length

Next comes the vector-based fine-tuning of our features. The remaining building shapes are converted into vectors, and a range giving the low and high values for shape area (the square footage of the house footprint) is specified to remove smaller vectors (noise) and larger vectors (representing other features).
A perimeter threshold is applied as well to remove a large forest patch to the north.

Figure: Segmentation using nDSM, Curvature.

Figure: Vector Segmentation (forest patch above removed by perimeter and area).

Potential steps to fine-tune the building shapes:

Simplification of Tree and Building Vectors

At this point, rough vectors have been generated for our buildings and trees. We can simplify the shapes by applying smoothing to the tree vectors and an orthogonal / rectangular simplification to the building vectors. The latter may not be appropriate for buildings that are not rectangular.

Region Growing from Seeds

An alternate approach at this step is to calculate the polygon centroid of each vector and use these points as seeds for region growing across a color, color-infrared, or fusion raster (a raster with LiDAR intensity or heights burned in as a band). This yields pixel shapes for trees and houses, which can be converted to vectors. Region growing can be useful for non-linear and circular building shapes.

Segmentations in Dense Urban Areas

Dense urban areas present a unique challenge because they are compact, layered in mixed-use developments, and often of unique shapes dreamed up by architects. Extremes in high and low building heights can affect calculations and interpolations as well. Using region growing, or spectral fusion of multi-band rasters with interpolated point cloud metrics as bands, can be helpful.

Delineating Roads with Line Fitting

Road delineation is only semi-complex, and can be done on imagery alone, without point clouds. A point cloud could be used to interpolate heights at the end, or to create an “intensity” band from the LiDAR metrics to aid spectral segmentation of the roads.

First, an image (potentially including intensity as one of the bands) is segmented using a classification method from remote sensing. An appropriate choice can be a multi-resolution segmentation, which computes statistics at different scales between neighboring cells, or a supervised classification.

Roads are identified by defining spectral thresholds that determine a road in red, green, blue, infrared, and intensity space. The matching pixels are written to a road raster, and a centerline is calculated.

Figure: ERDAS - Objective Road Classification Process.

Classifying Vegetation and Natural Features with Spectral Information

Vegetation can be classified much more precisely by incorporating LiDAR metrics with spectral information. Instead of merging derivative elevation and intensity rasters into a BGR, IR image, the image is burned onto the points if it is of appropriate resolution. Precise definitions of various tree species can be worked out and entered as thresholds by which to classify images.

Figure: Planar Fitting.

Classifying Lakes and Islands with Contextual Clues

Contextual information is often the most neglected in common remote-sensing and point cloud segmentation techniques, but it is arguably the closest to human vision and perception. When we look at an image, we examine individual objects first, but we also examine the rest of the image to understand context. These interpretations are based on our preconceived notions of typical distributions in environments. For instance, a green blotch in a lower-resolution aerial image (e.g. 5 m) of a residential area would not fool us into seeing a large tree, given how it is situated among the other houses and given house orientation, which we infer from where each house sits relative to the road and where the driveway is. Advanced contextual clues underlie some of the principles of photogrammetry and photo interpretation (the visual interpretation of images), and have the potential to be translated into rules in image processing software.

Figure: Context Based Island Detection (eCognition).

CROSS-POLLINATION AND POTENTIAL W/ OTHER INDUSTRIES AND DATA

Synthesis / Expansion of Remote Sensing

Advances in image processing spurred by the collection of high-resolution imagery and point cloud information are furthering the development of technologies in geospatial applications and remote sensing. In the past few years, major software vendors in the GIS (Geographic Information Systems) community, including Intergraph, ESRI, and Overwatch Industries, have released point cloud processing applications or amended their software to support .las (the American Society for Photogrammetry and Remote Sensing standard file format for LiDAR data). Major industry publications, including Imaging Notes (Remote Sensing) and the ESRI ArcGIS Newsletter, have published cover stories on the topic. There is rapidly growing demand for LiDAR and point cloud information in the geo-sphere, and with it demand for techniques and processes to segment point clouds into useful datasets and derivatives.

Cross-Pollination with Robotics

The world of robotics and computer vision has enjoyed a cult following of hobbyists and enthusiasts, along with intense research funding from DARPA for the autonomous navigation of robots and vehicles, which use point cloud and environmental sensors to guide themselves through the environment and around obstacles.[11] Google is also pushing the initiative through its experimentation with autonomous vehicles, as mentioned earlier, and drone technology is being adopted that could benefit greatly from the ability to collect and process high-resolution imagery and range-finding data from sensors.

Improving Human-Interface Devices and Experiences

From the consumer marketing perspective, the application of sensing technologies is going through major revolutions as well. Cell phones, such as the latest Google phone, incorporate face-scanning technologies as a password system.[12] Gaming and entertainment systems incorporate gesture- and body-based interaction, such as the Microsoft Xbox Kinect and the PlayStation Eye. Decoding human posture and gait (way of walking) is part of criminal detection systems as well, which find criminals by their walking pattern and posture.[13]

ROLE OF POINT CLOUD PROCESSES IN BIG DATA

Uses of Precise Data

Derivatives from point cloud data and large datasets have huge potential applications in public management, environmental management, resource extraction, intelligence and defense technologies, consumer marketing, and the social sciences.
Derivatives such as building footprints, heights, and models have many applications, as mentioned previously and in some of the processing examples.

Having large datasets available is one asset; having precise deliverables and information about these datasets is another. Being able to process these large datasets into actionable information and informative research will harness the amount of information currently being collected.

Works Cited

[1] Hill, David. "The Rise of Data-Driven Intelligence." Network Computing. N.p., 05 Dec. 2012. Web. 14 Dec. 2012.
[2] "The New Media Reader [Hardcover]." Amazon.com: The New Media Reader (9780262232272): Noah Wardrip-Fruin, Nick Montfort: Books. N.p., n.d. Web. 14 Dec. 2012.
[3] "2012 Year in Review: Big Data." Government Technology News. N.p., n.d. Web. 14 Dec. 2012.
[4] "Sebastian Thrun: Google's Driverless Car." TED: Ideas Worth Spreading. N.p., n.d. Web. 14 Dec. 2012.
[5] "Autonomous Vehicles Now Legal in California." Wired.com. Conde Nast Digital, 23 Sept. 2012. Web. 14 Dec. 2012.
[6] "Unmanned Aircraft Systems (UAS)." Unmanned Aircraft Systems (UAS). Federal Aviation Administration, n.d. Web. 14 Dec. 2012.
[7] "Types of LiDAR." ArcGIS Help 10.1. N.p., n.d. Web. 14 Dec. 2012.
[8] Schwalbe, Ellen, Hans-Gerd Maas, and Frank Seidel. "3D Building Model Generation from Airborne Laser Scanner Data Using 2D GIS Data and Orthogonal Point Cloud Projections." ISPRS. Conference Proceedings.
[9] Morgan, Michael, and Ayman Habib. "Interpolation of LiDAR Data and Automated Building Extraction." Ohio State University - Department of Civil and Material Engineering (n.d.): n. pag. Web.
[10] Schwalbe, Ellen, Hans-Gerd Maas, and Frank Seidel. "3D Building Model Generation from Airborne Laser Scanner Data Using 2D GIS Data and Orthogonal Point Cloud Projections." ISPRS. Conference Proceedings.
[11] "DARPA Urban Challenge 2008." Welcome. Defense Advanced Research Projects Agency, n.d. Web. 14 Dec. 2012.
[12] Wills, Danyll. "New Google Smart Phone Recognizes Your Face." MIT Technology Review. Massachusetts Institute of Technology, 19 Oct. 2011. Web. 14 Dec. 2012.
[13] Greenemeier, Larry, and Scientific American. "Something in the Way You Move: Cameras May Soon Recognize Criminals by Their Gait." PBS. PBS, 29 Sept. 2011. Web. 14 Dec. 2012.