appearances, and the cluttered (uncontrolled) backgrounds. Most research on vision-based pedestrian recognition has taken a learning-based approach, bypassing a pose recovery step altogether and describing human appearance in terms of simple low-level features from a region of interest (ROI). One line of research has dealt specifically with scenes involving people walking laterally to the viewing direction, with recognition by either using the periodicity cue [2, 3] or learning the characteristic lateral gait pattern [4].

Figure 1. A typical dangerous situation: a child suddenly steps into a street.

A crucial factor determining the success of learning methods is the availability of a good foreground region. Unlike with applications such as surveillance, where the camera is stationary, standard background subtraction techniques are of little avail here because of the moving camera. Independent motion detection techniques can help [3], but they are difficult to develop. Yet, given a correct initial foreground, we can shift some of the burden to tracking [4–9].

A complementary problem is to recognize pedestrians in single images; this is particularly relevant for pedestrians standing still. One general approach involves shifting windows of various sizes over the image, extracting low-level texture features, and using standard pattern classification techniques to determine a pedestrian’s presence. For example, Constantine Papageorgiou and Tomaso Poggio combine wavelet features with a support vector machine classifier [10]. More recently, Anuj Mohan and his colleagues have extended this research to involve a component-based approach [11].

However, this approach’s performance–speed trade-off is currently unfavorable for use in vehicles. The Chamfer System addresses this through two-step object recognition [12]. The first step applies hierarchical template matching using contour features to efficiently lock onto candidate solutions. Matching is based on correlation with distance-transformed images. By capturing the object’s shape variability through a template hierarchy and by using a combined coarse-to-fine approach in shape and parameter space, this step achieves large speedups compared to an equivalent brute-force method. The second step reverts to texture-based pattern classification of the candidate solutions that the first step provided.

Another powerful technique to establish ROIs is stereo vision. Uwe Franke and his colleagues combine stereo vision with texture-based pattern classification. I describe two other stereo vision-based approaches later.

Lately, interest has been increasing in video sensors that operate outside the visible spectrum. Having long been used exclusively in the military domain, infrared sensors are finding their way into civilian applications owing to the advent of cheaper, uncooled cameras. The principle of detecting pedestrians by the heat their bodies emit is appealing (Takayuki Tsuji and his colleagues provide one example [13]). Yet pedestrians are not the only heat sources in a traffic environment; vehicles generate heat too. Even the pavement can appear hotter on a summer day than a pedestrian’s body. So, rather than offering the solution for pedestrian detection per se, infrared sensors provide a means to simplify the segmentation problem. Pattern recognition techniques are still necessary.

Active-sensor approaches

Video sensors do not directly provide depth information; stereo vision derives depth by establishing feature correspondence and performing triangulation. On the other hand, active sensors measure distances directly.

Radar

Some commercial vehicles already employ radar for adaptive cruise control (for example, the Distronic System on Mercedes-Benz S-Class cars). For near-distance applications, such as pedestrian detection, ongoing investigations focus on 24-GHz radar technology [14]. Radar-based systems can enhance object localization by placing multiple sensors on the vehicle’s relevant parts and applying triangulation-based techniques. They can classify objects (that is, distinguish pedestrians from other objects such as cars and trees) by examining the power spectral-density plot of the reflected signals. In this context, we consider an object’s spectral content and reflectivity. Objects with smaller spatial extents, such as pedestrians, have narrower peaks in the plot than, say, cars. The material properties of the object’s surface determine the strength of reflected radar signals. Vehicles’ metallic parts reflect much better than human tissue, by at least an order of magnitude. Human tissue, in turn, reflects much better than nonconductive materials, such as the wood in trees.

Laser range finders

The main appeal of eye-safe laser range finders lies in their fast, precise depth measurement and their large field of view. For example, Martin Kunert, Ulrich Lages, and I describe a laser range finder that has a depth accuracy of ±5 cm and a range of 40 m for objects with at least 5 percent reflectivity (this includes most, if not all, relevant targets) [14]. Furthermore, its horizontal scans cover a 180-degree field of view in increments of 0.5 degree at 20 Hz, making the sensor especially suitable to cover the area just in front of the vehicle.

78 computer.org/intelligent IEEE INTELLIGENT SYSTEMS
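The first step of the Chamfer System, as described above, matches templates by correlation with distance-transformed images. The following Python sketch illustrates only that core idea, under simplifying assumptions: a binary edge map, a two-pass city-block distance transform, a single template, and an exhaustive search over positions. It is the brute-force baseline, not the hierarchical coarse-to-fine matcher of the Chamfer System, and the function names (`distance_transform`, `chamfer_score`, `best_match`) are hypothetical.

```python
INF = 10**9  # stands in for "no edge pixel reachable yet"


def distance_transform(edges):
    """Two-pass city-block distance from every pixel to the nearest edge pixel."""
    h, w = len(edges), len(edges[0])
    dist = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def chamfer_score(dist, template_points, ox, oy):
    """Mean distance-transform value under the template placed at (ox, oy)."""
    total = sum(dist[oy + dy][ox + dx] for dx, dy in template_points)
    return total / len(template_points)


def best_match(edges, template_points, tw, th):
    """Exhaustively slide a tw-by-th template; return (score, ox, oy) of the best fit."""
    dist = distance_transform(edges)
    h, w = len(edges), len(edges[0])
    best = None
    for oy in range(h - th + 1):
        for ox in range(w - tw + 1):
            score = chamfer_score(dist, template_points, ox, oy)
            if best is None or score < best[0]:
                best = (score, ox, oy)
    return best


# Toy scene (an assumption for illustration): an 8x8 edge map containing the
# outline of a 3x3 square at offset (4, 2) plus one stray edge pixel.
square = [(dx, dy) for dy in range(3) for dx in range(3) if (dx, dy) != (1, 1)]
edges = [[0] * 8 for _ in range(8)]
for dx, dy in square:
    edges[2 + dy][4 + dx] = 1
edges[6][1] = 1  # clutter

print(best_match(edges, square, 3, 3))  # prints (0.0, 4, 2)
```

A lower score means the template's contour points lie closer to image edges. Because the distance transform is computed once per image, each template placement costs only one lookup per contour point, which is what makes searching over many templates and positions affordable.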
Current systems

At least three pedestrian recognition systems have been integrated on demonstration vehicles. Those I describe here are video-based and employ a two-step detection–verification framework for efficient pedestrian recognition; stereo vision provides the ROI.

At Carnegie Mellon University’s NavLab, Liang Zhao and Charles Thorpe developed a system that combines stereo vision with neural-network pattern classification [15]. It obtains the texture features for classification by applying a high-pass filter to the ROI and normalizing for size. The system, running at 3 to 12 Hz, aims to assist bus drivers in urban traffic. The researchers plan to expand it to cover the sides of the bus and, eventually, to provide full 360-degree coverage.

The University of Pavia system, implemented in the ARGO experimental autonomous vehicle, combines stereo vision with template matching for detecting pedestrian head and shoulder shapes [16]. The system searches for vertical symmetry to verify candidate regions. The authors report good detection results in the range of 10 to 40 meters.

At DaimlerChrysler, we have been working on pedestrian recognition as part of our multiyear effort to extend driver assistance beyond the highway scenario into the complex urban environment [4, 12, 17, 18]. Of particular interest is the Intelligent Stop&Go system on our Urban Traffic Assistant demonstrator (see Figure 2). Intelligent Stop&Go lets the UTA autonomously follow a lead vehicle, while being aware of relevant elements of the traffic infrastructure (for example, road lanes, traffic signs, and traffic lights) and other traffic participants.

Figure 2. DaimlerChrysler’s Urban Traffic Assistant demonstrator.

Our most recent pedestrian detection system consists of stereo vision-based obstacle detection and fine localization within the stereo ROI using the Chamfer System (see Figure 3) [12]. The system tracks detected objects over time and aggregates single-frame results. At the same time, a time delay neural network with local receptive fields [19] constantly evaluates successive ROIs, searching for the characteristic temporal patterns of (lateral) human gait. Visit www.gavrila.net/Computer_Vision/computer_vision.html for a few video clips.

Figure 3. Pedestrian detection results (shown in white) from the Chamfer System. Besides showing correct detections, the figure illustrates typical shortcomings, such as false detections in heavily textured image areas (for example, the left image in the bottom row) or missing detections in areas of low contrast, occlusion, or both (for example, the right image in the bottom row).

Other systems will soon join these three. The EU has recently begun a major initiative for pedestrian protection under the Fifth Framework project Protector [14, 20]. The project brings together major vehicle manufacturers, sensor suppliers, and research institutions to develop intelligent systems on vehicles for reducing accidents involving pedestrians, bicyclists, and other unprotected traffic participants. Among the completed tasks are the analysis of accident statistics and the definition of relevant traffic scenarios. The project is investigating three sensor technologies: radar, laser range finder, and video, which we will implement on two passenger cars (Fiat and DaimlerChrysler) and one truck (MAN). Sometime in 2002 we will evaluate the final systems on a test track under standardized and realistic conditions (that is, using dummies). User interface and user acceptance studies will conclude this project.

The road ahead

A pedestrian safety system’s success or failure, from a technical viewpoint, will depend largely on the rate of correct detections versus false alarms that it produces, at a certain processing rate and on a particular processor platform. But what rate will we need for actual deployment of a sensor-based pedestrian system? This question is difficult to answer because the desired rate will depend on the final system concept. If, for example, the system concept involves only a warning function, performance criteria will likely be less stringent than for a concept that involves active vehicle control.

Perhaps we can more easily establish where we currently stand regarding performance. Consider a (fictional) video-based pedestrian detection system that involves a succession of three components: stereo-based obstacle detection, template-based shape matching, and texture-based pattern classification. Assume that each component’s performance is independent of that of the others.

We conservatively estimate that, to detect every pedestrian in urban traffic, the stereo component produces one pedestrian ROI each 10 seconds. (In lieu of hard experimental data, we use a value derived from our experience.) We assume that the stereo component accomplishes this by employing simple heuristics regarding the sizes and locations of the rectangular regions it detects as obstacles. Because we cannot expect the pedestrian ROI to exactly outline the pedestrian, we assume that we need 10 probes to extract the pedestrian correctly. For the shape-based and texture-based components, we estimate a detection rate of 95 percent at a false positive rate in the order of 10⁻³ and 10⁻¹ per candidate region, respectively [10, 12, 15]. All in all, we arrive, in this best-case scenario, at a false-positive rate of 1 per 10⁴ seconds, or 1 per 2.8 hours, for a detection rate of 90 percent.

Integrating the results over time by tracking will improve this figure somewhat. However, this improvement will be offset by the lower filter ratios of the shape and texture components, which, in practice, are not independent. On the basis of this, we can fairly say that we’ll need to reduce the false-positive rate by at least one order of magnitude to obtain a viable pedestrian system, while maintaining the same detection rate.

Fortunately, several ways exist to significantly reduce the false-positive rate. Improved multicue video algorithms (combining distance, shape, texture, and motion cues) could successively decimate the false alarm rate, as the description of our fictional system illustrates. Sensor fusion (for example, combining video and laser range finder approaches) will probably also produce large benefits. Finally, telematics concepts, involving communication between pedestrians and vehicles combined with GPS-based localization, could close any remaining performance gap. Although we can’t realistically expect people to buy special-purpose pedestrian protection devices, pedestrian safety systems could piggyback on the pervasiveness of the future communication infrastructure (for example, the UMTS [Universal Mobile Telecommunications System] and Bluetooth).

Challenges remain even after we solve the pedestrian detection problem. After all, we’ll need to assess the danger of a particular traffic situation. This assessment will consider the pedestrians’ and vehicles’ position and speed. But with a larger look ahead, beyond the precrash range, prediction quickly becomes unreliable; pedestrians can easily change direction. Furthermore, accurate risk assessment will increasingly require good scene understanding. For example, the danger associated with a pedestrian heading toward the street will depend largely on the placement of the road boundaries, whether a traffic light exists, and, if so, whether it is green. This suggests that, in the long run, a reliable, anticipatory pedestrian system must be aware of several types of infrastructural elements, through either perception or telematics approaches. We might reduce at least some complexity by limiting a pedestrian protection system’s scope to cover only specific traffic scenarios; this will represent a good intermediate solution.

Difficult technical challenges lie ahead, but this domain’s progress over the past few years warrants optimism. Considering the potential for saving lives and increasing safety, the goal certainly appears worthwhile.

NOVEMBER/DECEMBER 2001 computer.org/intelligent 79

References

1. D.M. Gavrila, “The Visual Analysis of Human Movement: A Survey,” Computer Vision and Image Understanding, vol. 73, no. 1, Jan. 1999, pp. 82–98.
2. R. Cutler and L. Davis, “Real-Time Periodic Motion Detection, Analysis and Applications,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, IEEE CS Press, Los Alamitos, Calif., 1999, pp. 326–331.
3. R. Polana and R. Nelson, “Low Level Recognition of Human Motion,” Proc. IEEE Workshop Motion of Non-rigid and Articulated Objects, IEEE CS Press, Los Alamitos, Calif., 1994, pp. 77–82.
4. B. Heisele and C. Wöhler, “Motion-Based Recognition of Pedestrians,” Proc. 14th Int’l Conf. Pattern Recognition, IEEE CS Press, Los Alamitos, Calif., 1998, pp. 1325–1330.
5. A. Baumberg and D. Hogg, “Learning Flexible Models from Image Sequences,” Proc. European Conf. Computer Vision, Lecture Notes in Computer Science, vol. 800, Springer-Verlag, Heidelberg, 1994, pp. 299–308.
6. T. Cootes et al., “Active Shape Models: Their Training and Applications,” Computer Vision and Image Understanding, vol. 61, no. 1, Jan. 1995, pp. 38–59.
7. C. Curio et al., “Walking Pedestrian Recognition,” IEEE Trans. Intelligent Transportation Systems, vol. 1, no. 3, Nov. 2000, pp. 155–163.
8. V. Philomin, R. Duraiswami, and L. Davis, “Quasi-random Sampling for Condensation,” Proc. European Conf. Computer Vision, vol. 2, Lecture Notes in Computer Science, vol. 1843, Springer-Verlag, Heidelberg, Germany, 2000, pp. 134–149.
9. G. Rigoll, B. Winterstein, and S. Müller, “Robust Person Tracking in Real Scenarios with Non-stationary Background Using a Statistical Computer Vision Approach,” Proc. 2nd IEEE Int’l Workshop Visual Surveillance, IEEE CS Press, Los Alamitos, Calif., 1999, pp. 41–47.
10. C. Papageorgiou and T. Poggio, “A Trainable System for Object Detection,” Int’l J. Computer Vision, vol. 38, no. 1, June 2000, pp. 15–33.
11. A. Mohan, C. Papageorgiou, and T. Poggio, “Example-Based Object Detection in Images by Components,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 4, Apr. 2001, pp. 349–361.
12. D.M. Gavrila, “Pedestrian Detection from a Moving Vehicle,” Proc. European Conf. Computer Vision, vol. 2, Lecture Notes in Computer Science, vol. 1843, Springer-Verlag, Heidelberg, Germany, 2000, pp. 37–49.
13. T. Tsuji et al., “Development of Night Vision System,” Proc. IEEE Int’l Conf. Intelligent Vehicles, IEEE Press, Piscataway, N.J., 2001, pp. 133–140.
14. D.M. Gavrila, M. Kunert, and U. Lages, “A Multi-sensor Approach for the Protection of Vulnerable Traffic Participants: The PROTECTOR Project,” Proc. IEEE Instrumentation and Measurement Technology Conf., vol. 3, IEEE Press, Piscataway, N.J., 2001, pp. 2044–2048.
15. L. Zhao and C. Thorpe, “Stereo- and Neural Network-Based Pedestrian Detection,” IEEE Trans. Intelligent Transportation Systems, vol. 1, no. 3, Nov. 2000, pp. 148–154.
16. A. Broggi et al., “Shape-Based Pedestrian Detection,” Proc. IEEE Intelligent Vehicles Symp., IEEE Press, Piscataway, N.J., 2000, pp. 215–220.
17. U. Franke et al., “From Door to Door: Principles and Applications of Computer Vision for Driver Assistant Systems,” Intelligent Vehicle Technologies, L. Vlacic, F. Harashima, and M. Parent, eds., Butterworth Heinemann, Oxford, UK, 2001, pp. 131–188.
18. U. Franke et al., “Autonomous Driving Goes Downtown,” IEEE Intelligent Systems, vol. 13, no. 6, Nov./Dec. 1998, pp. 40–48.
19. C. Wöhler and J. Anlauf, “An Adaptable Time-Delay Neural-Network Algorithm for Image Sequence Analysis,” IEEE Trans. Neural Networks, vol. 10, no. 6, Nov. 1999, pp. 1531–1536.
20. P. Carrea and G. Sala, “Short Range Area Monitoring for Pre-crash and Pedestrian Protection: The Chameleon and Protector Projects,” Proc. 9th Aachener Colloquium Automobile and Engine Technology, Institut für Kraftfahrwesen Aachen (Aachen Inst. for Automotive Eng.) and Verbrennungs Kraftmaschinen Aachen (Aachen Inst. for Internal Combustion Engines), Aachen, Germany, 2000, pp. 629–639.

Dariu M. Gavrila is a research scientist with DaimlerChrysler Research’s Image Understanding Group in Ulm, Germany. His research interests include vision systems for detecting human presence and activity, with applications in surveillance, virtual reality, and intelligent human–machine interfaces. He works on real-time vision systems for driver assistance and intelligent cruise control. He is currently responsible for the European Union’s Protector project for pedestrian protection. He received his MS in computer science cum laude from the Free University in Amsterdam and his PhD in computer science from the University of Maryland at College Park. Contact him at Image Understanding Systems, DaimlerChrysler Research, Ulm 89081, Germany; dariu.gavrila@daimlerchrysler.com; www.gavrila.net.
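As a rough check on the fictional three-component system discussed in “The road ahead,” the Python sketch below multiplies out the figures stated there: one stereo ROI every 10 seconds, 10 probes per ROI, and shape and texture stages that each detect 95 percent of pedestrians at false-positive rates of 10⁻³ and 10⁻¹ per candidate region. It assumes, as the article does, that the stages are independent and that the stereo stage misses no pedestrians. The function and variable names are illustrative assumptions, not part of any described system.

```python
def cascade_rates(roi_interval_s, probes_per_roi, stages):
    """Combine independent cascade stages.

    stages is a list of (detection_rate, false_positive_rate_per_candidate).
    Returns (overall detection rate, false positives per second).
    """
    # Candidate regions handed to the first classifier stage per second.
    candidates_per_s = probes_per_roi / roi_interval_s
    detection = 1.0
    false_pass = 1.0
    for stage_detection, stage_false_positive in stages:
        detection *= stage_detection        # a pedestrian must pass every stage
        false_pass *= stage_false_positive  # so must a false candidate
    return detection, candidates_per_s * false_pass


detection, fp_per_s = cascade_rates(
    roi_interval_s=10.0,
    probes_per_roi=10,
    stages=[(0.95, 1e-3),   # template-based shape matching
            (0.95, 1e-1)],  # texture-based pattern classification
)
hours_between_false_alarms = 1.0 / fp_per_s / 3600.0

print(f"detection rate: {detection:.2f}")                            # 0.90
print(f"hours between false alarms: {hours_between_false_alarms:.1f}")  # 2.8
```

Under the same assumptions, cutting the false-positive rate by one order of magnitude at an unchanged detection rate would stretch the interval between false alarms to roughly 28 hours.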