Explains how the power of Hadoop can be used to analyze video from cameras deployed on road networks, with the goal of identifying traffic conditions, the types of vehicles on the move, and even license plate spoofing.

  1. Big Data and Intel® Intelligent Systems Solution for Intelligent Transportation. Xiao Dong Wang, Manager, Big Data Solution Team, Intel; Robin Wang, Platform Solution Architect, Intel; Albert Hu, Solution Architect, Intel. Session EMBS001.
  2. Agenda • Intelligent Transportation System (ITS) landscape in China • Blueprint for ITS • Big Data overview and benefit for ITS • Intel® Architecture based products for Big Data on ITS • ITS case study in China. The PDF for this session presentation is available from our Technical Session Catalog at the end of the day; the URL is at the top of the Session Agenda pages in the Pocket Guide.
  4. China Environment
      Government objective: “Sensing China” (IoT) strategy (China Premier Wen Jiabao, ’09)
      • Government 12th Five-Year Plan / Social Harmony
      • Determined to lead in the Internet of Things (IoT) ‒ Setting international standards for new technology ‒ $0.8Bn government funds
      Mega trend, challenges and government approach: “The biggest development potential lies in the process of urbanization.” (China's new Premier Li Keqiang, ’12)
      • Urbanization ‒ 690Mn now to 900Mn by ’25
      • IoT/Smart City is one way to solve the challenges ‒ IoT market size $80~120Bn by ’15 ‒ 90+ smart city plans underway ‒ Gaps: core technology, standards, immature ecosystem, deployment model
  5. PRC Transportation Infrastructure Landscape (sources 1 and 2)
      Chart: 2011 investment (Bn RMB) against 2011-2015 CAGR in terms of length, for Highway (4,500,000 km), Railway (120,000 km), Waterway (126,000 km) and Urban Public (394,000 km).
      • Total number of vehicles will exceed 200 million by 2020.
      • Infrastructure build-out is trending toward stabilization.
      • Key challenges, driven by the large scale of the infrastructure network, the growing number of vehicles, and still increasing traveler and vehicle/infrastructure density: safety, infrastructure/traveler efficiency, environment.
      Source 1: China Ministry of Transportation’s 12th Five-Year Plan. Source 2: IHS* Research Report.
  6. Big Data Source From Transportation
      • 0.3PB ~ 6.7PB/day of video data generated for the Smart City environment (2016).
      • Pie chart: Worldwide Enterprise and IP Storage used for Video Surveillance by End User (2016). Segments: Banking & Finance; Casinos & Gaming; City Surveillance; Commercial; Education; Government; Industrial, Manufacturing, & Utilities; Retail; Transport; Other. Percentage labels shown in the chart: 2%, 3%, 13%, 37%, 7%, 3%, 7%, 21%, 3%, 4%.
      • How to effectively collect, aggregate, manage and analyze data to help Intelligent Transportation System (ITS) applications?
      Source: IMS, Enterprise and IP Storage used for Video Surveillance, World, 2012.
  8. Goals of Intelligent Transportation System • Traffic Management – Enforcing traffic regulations – Transportation planning support – Adaptive traffic control – Case investigation for police • Traveler information system – Real-time road condition: speed & congestion; historical camera images & statistics – Travel time information: available to various terminals; proactive travel plan • Commercial vehicle systems – Commercial vehicle management, tracking, administration • Public security – Video surveillance (remote video streaming & video searching)
  9. Intelligent Transportation System (ITS) End-to-End Solution
      • Edge morphology: NVR/DVR/Hybrid NVR; Decoder; Edge Video Server (embedded analytic, proprietary)
      • Data Center: collect, store, transform, analyze and mine (cloud service, high-performance)
      • Data Center Solutions: Distributed Filesystem – HDFS*; Distributed data analysis – Hadoop*; Distributed real-time database – HBase*
      • Terminal Device: abundant data visualization; data analysis and cache
  10. Intelligent Transportation System (ITS) Cross Region Deployment
  12. Scale Up or Scale Out: Intelligent Transportation System (ITS) Data Burst. Diagram labels: Relational Database; Distributed Database; shift left; shift right.
  13. Intelligent Transportation System (ITS) Software Architecture • Online/interactive applications • MapReduce: data mining and offline analysis • HBase*: distributed database for texts & images • Sqoop*: data integration, with aggregated results delivered to an RDB for legacy applications
  14. Big Data: Intel® Distribution for Apache Hadoop* Software, an Optimized Software Stack
      • Stable, enterprise-ready Hadoop*
      • Optimized for Intel® Architecture
      • Brings “real-time” analysis to Hadoop through HBase*
      • Feature enhancements to Hadoop for vertical segments
      Stack components:
      • Intel® Manager for Apache Hadoop Software 2.3: deployment, configuration, monitoring, alerting and security
      • Mahout* 0.7: data mining
      • R (statistics): data manipulation
      • Hive* 0.9.0: data warehouse
      • Pig* 0.9.2: data manipulation
      • Oozie* 3.3.0: workflow scheduler
      • Sqoop* 1.4.1: RDB data collector
      • Flume* 1.3.0: log data collector
      • ZooKeeper* 3.4.5: coordination
      • MapReduce 1.0.3: distributed processing framework
      • HBase 0.94.1: real-time distributed big table
      • HDFS* 1.0.3: Hadoop Distributed File System
  15. Intel® Distribution for Apache Hadoop* Software Enhancements for Intelligent Transportation System (enhancement: benefit for ITS)
      • Cross-site Big Table for HBase*: data are stored in data centers in different regions with a global virtual view; each data center is a live backup, giving highly available data access.
      • SQL layer on top of HBase: real-time statistics on the large amount of traffic data; interactive queries and offline statistics share the same data set.
      • Full-text indexing and near-real-time search for HBase: full-text search over the structured data in the distributed database system; the built-in index keeps the traffic data synchronized with the index at all times.
      • Efficient Big Object Storage in HBase: increases traffic image storage performance through the standard HBase interface.
      • R language statistics support: brings the mature R language library to Hadoop* MapReduce, HDFS* and HBase; reduces the effort to develop complex data mining logic.
  17. Value for Edge Analytics
      • Pipeline stages: video created; video analyzed; video metadata stored; video cold storage; management.
      • Components: Camera and Police System Car Video feed the Edge Client (video capture); Indexer/Analyzer/Transcoder (image extraction & metadata creation); Video Storage (edge or centralized); Data Center/Cloud (private/public); Data Services (VSaaS, VAaaS).
      • Deployment scale: Smart Checkpoint, District, City, Province, PRC.
      • Edge VA’s value (source 1): real-time intelligence (into metadata); reduces the footage to 1/8 ~ 1/12 of its original size; resolves the bandwidth issue and the backend storage capacity constraint.
      • By end of 2017: 457 PB of raw video generated per day; 76 PB per day (1/6 of video data) for traffic; metadata covers people, cars, license plates.
      Source 1: Internal team analysis based on IHS* Research Report.
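The volume figures on this slide can be checked with a few lines of arithmetic. The sketch below only uses numbers quoted on the slide (457 PB/day, the 1/6 share, and the 1/8 ~ 1/12 reduction); applying the reduction range to the traffic share is an assumption made for illustration, not something the slide states.

```python
# Figures quoted on the slide.
RAW_VIDEO_PB_PER_DAY = 457      # raw video generated per day by end of 2017
TRAFFIC_DIVISOR = 6             # "1/6 of video data"
REDUCTION_DIVISORS = (12, 8)    # edge analytics shrinks footage to 1/12 ~ 1/8


def traffic_share_pb(raw_pb=RAW_VIDEO_PB_PER_DAY):
    """Daily volume attributed to traffic: one sixth of the raw video."""
    return raw_pb / TRAFFIC_DIVISOR


def reduced_footage_pb(volume_pb):
    """Range (smallest, largest) left after the 1/12 ~ 1/8 reduction."""
    strongest, weakest = REDUCTION_DIVISORS
    return volume_pb / strongest, volume_pb / weakest


if __name__ == "__main__":
    share = traffic_share_pb()
    low, high = reduced_footage_pb(share)  # assumption: reduction applied to the traffic share
    print("traffic share: %.1f PB/day" % share)
    print("after edge reduction: %.1f to %.1f PB/day" % (low, high))
```

The first figure reproduces the slide's 76 PB/day to within rounding.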
  18. Enhanced NVR (Network Video Recorder) Key Features for Intelligent Transportation System (ITS)
      • Traffic events: running a red light; wrong-way driving (retrogradation); traffic jam; slow-moving traffic; speeding; no parking; no left or right turn; jaywalking; lane change; line crossing; lane occupation (driving outside the designated lane); vehicle counting
      • Capture: crossing vehicle capture; motor vehicle snapshot (abnormality quick shot)
      • Vehicle features recognition: plate recognition; plate colour recognition; vehicle colour recognition; auto logo recognition
  19. Intel® Architecture Based NVR
      • Intel® Xeon® Processor E3 Family, Intel® C216 Chipset: up to 32 GB DDR3 memory; 4 x 10M/100M/1000M Base-T LAN; 16 x SATA 3.0 or 24 x SATA 3.0; 2 x mSATA; 3U or 4U rack-mounted chassis
      • 3rd Generation Intel® Core™ Processor Family: 4 x 8 GB DDR3 memory; 2 x 10M/100M/1000M Base-T LAN; 8 x SATA 3.0 or 16 x SATA 3.0; 1 x mSATA; 2U or 3U rack-mounted chassis
      • Intel® Atom™ Processor D2550, Intel® NM10 Express Chipset: 1 x 4 GB DDR3 memory; 2 x 100M/1000M LAN; 8 x SATA 3.0; 1 x mSATA; 2U rack-mounted chassis
  20. Big Data Appliance Reference Design from Intel
      • HDFS* Data Node: Intel® Server Board S2600GZ “Grizzly Pass”, Intel® Server System R2000 “Big Horn Peak”. Large storage capacity; large memory capacity; extreme power efficiency; extreme FDR InfiniBand; extensive I/O; optional SSD or PCI Express* SSD.
      • HDFS Name Node: Intel Server Board S2600JF “Jefferson Pass”, Intel Server System H2000 “Bobcat Peak”. High density form factor; high memory bandwidth; extreme FDR InfiniBand; optional SSD or PCI Express SSD. Can be a data node for compute-intensive Big Data applications.
      • Interconnect: InfiniBand* and Ethernet switches.
  21. Big Data Appliance Reference Design: Turnkey Platforms for ISV/SI/LOEM
      • Easy to use: easy to deploy; easy to scale out; easy to manage; rapid deployment in days; quickly isolate root cause between appliance and application
      • Quality: integrated validation of all components (OS and device drivers; Big Data software packages; BIOS, firmware, etc.); embedded acceptance test; disk health monitoring
      • Power efficiency: spread core design; cold redundant power supply; intelligent disk spin-up/off; ACPI* S3/S4 support; DCM integrated at rack
      • Performance: 10GbE, InfiniBand*; advanced storage controller; SSD and PCI Express* SSD; SW tuning (block size, number of reducers, etc.)
      Big Data ISV/SI/LOEM all look forward to a total solution.
  22. Big Data Appliances Reference Design from Intel®: Performance & Power Advantages
      Performance: 10GbE & InfiniBand* FDR; network protocol advances; SSD, PCI Express* SSD, hybrid; advanced storage controller; balance oriented optimization.
      80 PLUS 230V Internal Redundant certification, efficiency at 20% / 50% / 100% of rated load:
      • 80 PLUS Bronze: 81% / 85% / 81%
      • 80 PLUS Silver: 85% / 89% / 85%
      • 80 PLUS Gold: 88% / 92% / 88%
      • 80 PLUS Platinum: 90% / 94% / 91%
      Low power technology (source):
      • Power supply: 80 PLUS Platinum (HW)
      • Power supply: “cold” redundancy (HW)
      • Spread-core server board layout (HW)
      • ACPI S3 support (HW-SW)
      • ACPI S4 support (HW-SW)
      • Staggered disk spin up (HW-FW)
      • Intelligent disk spin off control (HW-SW)
      • Data center, rack, and node level power monitoring and limiting (HW-FW-SW)
      Chart: Cold Redundancy Power Supply (CRPS), utilization (%) against load (%), normal PSU versus CRPS (5-10% up); one module works below the 40% load threshold, two modules work above it.
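To see why running a single power module at light load can help, the sketch below compares input power for one active module against two modules sharing the load. It uses the 80 PLUS Platinum points from the slide; the linear interpolation between those points and the 750 W module rating are assumptions for illustration, not figures from the presentation.

```python
# 80 PLUS Platinum (230V internal redundant) points from the slide:
# (percent of rated load, efficiency).
PLATINUM_POINTS = [(20.0, 0.90), (50.0, 0.94), (100.0, 0.91)]


def efficiency(load_pct):
    """Efficiency at a given load, linearly interpolated (assumption)."""
    if not 20.0 <= load_pct <= 100.0:
        raise ValueError("only defined between 20% and 100% of rated load")
    for (x0, y0), (x1, y1) in zip(PLATINUM_POINTS, PLATINUM_POINTS[1:]):
        if x0 <= load_pct <= x1:
            return y0 + (y1 - y0) * (load_pct - x0) / (x1 - x0)


def input_power_w(load_w, rated_w, active_modules):
    """Power drawn from the wall when the load is split evenly."""
    per_module_w = load_w / active_modules
    eff = efficiency(100.0 * per_module_w / rated_w)
    return active_modules * per_module_w / eff


if __name__ == "__main__":
    RATED_W = 750.0   # hypothetical module rating
    LOAD_W = 300.0    # hypothetical system load
    shared = input_power_w(LOAD_W, RATED_W, active_modules=2)
    cold = input_power_w(LOAD_W, RATED_W, active_modules=1)
    print("two modules sharing: %.1f W" % shared)
    print("one module (cold redundancy): %.1f W" % cold)
```

With these assumed numbers the single active module sits closer to the efficiency peak, so it draws less input power for the same load.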
  24. Case Study: Intelligent Monitoring and Recording System. Diagram components: Edge Video Analytic (Enhanced NVR); 3G link; HDFS*; HBase*; MapReduce ETL; Hive*. Analyses: Traffic Flow Analysis; Vehicle History Behavior; Real-time Vehicle License Analysis.
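A traffic flow analysis of the kind named on this slide can be pictured as a map/reduce count of plate sightings per checkpoint and hour. The sketch below simulates the map, shuffle and reduce steps in plain Python; it does not use Hadoop, and the record layout, checkpoint names and plates are hypothetical.

```python
from collections import defaultdict

# Hypothetical sighting records: (checkpoint_id, ISO timestamp, plate).
SIGHTINGS = [
    ("CP-01", "2013-04-10T08:05:00", "A12345"),
    ("CP-01", "2013-04-10T08:40:00", "B67890"),
    ("CP-01", "2013-04-10T09:10:00", "A12345"),
    ("CP-02", "2013-04-10T08:15:00", "C24680"),
]


def map_sighting(record):
    """Map step: emit ((checkpoint, hour), 1) for one sighting."""
    checkpoint, timestamp, _plate = record
    hour = timestamp[:13]  # keeps "YYYY-MM-DDTHH"
    yield (checkpoint, hour), 1


def reduce_flow(key, values):
    """Reduce step: total sightings for one (checkpoint, hour) key."""
    return key, sum(values)


def run_job(records):
    """Group mapped pairs by key (the shuffle), then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_sighting(record):
            groups[key].append(value)
    return dict(reduce_flow(key, values) for key, values in groups.items())


if __name__ == "__main__":
    for (checkpoint, hour), count in sorted(run_job(SIGHTINGS).items()):
        print(checkpoint, hour, count)
```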
  25. Case Study: Intelligent Transportation System (ITS) Solution • Traffic Management – Real-time road condition reports – Detection of speeding vehicles within a road segment – Fake plate number detection • Public Security – Tracking vehicles in real time – Alerts and alarms based on a blacklist • Traveler Guide – Real-time road conditions from the latest camera images and traffic flow statistics – Travel time estimation for road segments in the city
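The slides do not say how segment speeding or fake plates are detected. One plausible approach, sketched here purely as an assumption, is to compare two sightings of the same plate at checkpoints a known distance apart: an average speed above the limit suggests speeding, and a physically implausible speed suggests that two vehicles carry the same plate. The function names and the 250 km/h plausibility threshold are hypothetical.

```python
from datetime import datetime


def average_speed_kmh(distance_km, t_first, t_second):
    """Average speed between two sightings of the same plate."""
    elapsed_h = (t_second - t_first).total_seconds() / 3600.0
    if elapsed_h < 0:
        raise ValueError("sightings must be in chronological order")
    if elapsed_h == 0:
        return float("inf")  # same instant at two different places
    return distance_km / elapsed_h


def classify(distance_km, t_first, t_second, limit_kmh, max_plausible_kmh=250.0):
    """Label a pair of sightings; both thresholds are illustrative."""
    speed = average_speed_kmh(distance_km, t_first, t_second)
    if speed > max_plausible_kmh:
        return "suspected fake plate"
    if speed > limit_kmh:
        return "overspeed"
    return "ok"


if __name__ == "__main__":
    start = datetime(2013, 4, 10, 8, 0, 0)
    print(classify(10.0, start, datetime(2013, 4, 10, 8, 5, 0), limit_kmh=80.0))
    print(classify(60.0, start, datetime(2013, 4, 10, 8, 6, 0), limit_kmh=80.0))
```

Because the check uses the average over the whole segment, it catches drivers who slow down only at the cameras.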
  26. Case Study: Intelligent Transportation System (ITS) Result
      • Illegal vehicle tracking efficiency improved: real-time analysis of massive data cuts the time needed to locate an illegal vehicle's data from hours to minutes or even seconds.
      • Deaths in serious traffic accidents reduced: the floating vehicle monitoring system collects vehicle information and analyzes it in real time, so the behavior of accident-prone vehicles (such as engineering trucks) can be monitored and the rate of serious accidents lowered.
      • Road congestion rate decreased: road monitoring equipment collects traffic information and processes it in real time, so congestion zones can be mapped accurately; traffic management departments can respond to emergencies quickly, and the information is published on public platforms so that drivers can be diverted accordingly.
  27. Summary • Intelligent Transportation System (ITS) is a global focus for Intel®, now and in the future • Intel®’s end-to-end analytics architecture fits ITS solution development • Intel® has rich resources to help developers with ITS-related application development
  32. Intelligent Transportation System (ITS) benefits from Interactive Hive Query (Intel® Distribution for Apache Hadoop* Software Enhancement)
      Chart: query time in seconds, Hive* 0.9.0 (M/R) versus Interactive Hive, for Query 1 to Query 4 on 100 million records over an 8-node cluster. Data labels shown in the chart: 159, 98, 68, 63, 28, 18, 0.2, 0.2.
      User scenarios and queries:
      • Calculate each day’s internet traffic of a specific user: SELECT sum(down+up) FROM cdr201209 WHERE number = 13300000000 GROUP BY day;
      • Get the 10 most heavily called numbers for a specific user: SELECT TOP(10) tonumber, sum(call_length) len FROM cdr_201209 WHERE number = 13300032810 GROUP BY tonumber ORDER BY len DESC
      • Get the top 1000 call lengths from all user phone calls: SELECT TOP(1000) number, call_length FROM cdr_201209 ORDER BY call_length DESC
      • Get the top 1000 users having the highest total monthly charge: SELECT TOP(1000) number, sum(fee) f FROM cdr_201209 GROUP BY number ORDER BY f DESC
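The four benchmark queries can be tried on a tiny local table to see what each one returns. The sketch below uses Python's built-in sqlite3 module as a stand-in, not Hive; the rows are invented, the column types are guesses, and `LIMIT` replaces the slide's `TOP(n)` because SQLite does not support `TOP`.

```python
import sqlite3

# Invented rows: (number, tonumber, day, down, up, call_length, fee).
ROWS = [
    (13300000000, 13300000001, 1, 100, 20, 60, 1.5),
    (13300000000, 13300000002, 1, 50, 10, 300, 4.0),
    (13300000000, 13300000001, 2, 10, 5, 120, 2.0),
    (13300032810, 13300000000, 1, 0, 0, 30, 0.5),
]

QUERIES = {
    "daily_traffic": "SELECT day, sum(down + up) FROM cdr_201209 "
                     "WHERE number = ? GROUP BY day ORDER BY day",
    "top_called": "SELECT tonumber, sum(call_length) AS total_len FROM cdr_201209 "
                  "WHERE number = ? GROUP BY tonumber ORDER BY total_len DESC LIMIT 10",
    "longest_calls": "SELECT number, call_length FROM cdr_201209 "
                     "ORDER BY call_length DESC LIMIT 1000",
    "top_spenders": "SELECT number, sum(fee) AS total_fee FROM cdr_201209 "
                    "GROUP BY number ORDER BY total_fee DESC LIMIT 1000",
}


def build_demo_db():
    """Create an in-memory table shaped like the slide's cdr_201209."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cdr_201209 (number INTEGER, tonumber INTEGER, day INTEGER, "
        "down INTEGER, up INTEGER, call_length INTEGER, fee REAL)"
    )
    conn.executemany("INSERT INTO cdr_201209 VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def run(conn, name, *params):
    """Run one named query and return all result rows."""
    return conn.execute(QUERIES[name], params).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(run(db, "daily_traffic", 13300000000))
    print(run(db, "top_called", 13300000000))
    print(run(db, "longest_calls")[:1])
    print(run(db, "top_spenders")[:1])
```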
  33. Intelligent Transportation System (ITS) benefits from Cross-site Big Table (Intel® Distribution for Apache Hadoop* Software Enhancement)
      Features:
      1. Global table view
      2. Data are physically stored in geo-distributed data centers
      3. Higher availability
      4. Better locality
      5. Distributed aggregation removes data transfer
      Two deployment models:
      1. In a transportation system, each district has a DC; one can connect to any DC and view all of the data.
      2. In banks, each provincial branch has its own DC. The central bank can view all of the data, but branches cannot see each other.
      Diagram: a Virtual Big Table spanning Data Center A, Data Center B and Data Center C, linked by async replication.
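The point about distributed aggregation can be illustrated with a small sketch: each data center reduces its own rows to a partial result, and only those partial results travel to the caller for merging. This is a plain Python illustration of the idea under invented data, not the product's API; the data center names, districts and counts are hypothetical.

```python
# Hypothetical rows held in each data center: (district, vehicle_count).
DATA_CENTERS = {
    "A": [("north", 120), ("north", 80)],
    "B": [("south", 50)],
    "C": [("north", 10), ("south", 5)],
}


def local_aggregate(rows):
    """Runs inside one data center: sum vehicle counts per district."""
    totals = {}
    for district, count in rows:
        totals[district] = totals.get(district, 0) + count
    return totals


def merge_partials(partials):
    """Runs at the caller: combine the small per-site summaries."""
    merged = {}
    for partial in partials:
        for district, count in partial.items():
            merged[district] = merged.get(district, 0) + count
    return merged


if __name__ == "__main__":
    partials = [local_aggregate(rows) for rows in DATA_CENTERS.values()]
    print(merge_partials(partials))
```

Only one small dictionary per site crosses the network, instead of every raw row.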
  34. Intelligent Transportation System (ITS) Benefits From HBase* Big Object Storage (Intel® Distribution for Apache Hadoop* Software Enhancement)
      • Result: insert performance increases 200%, insert latency reduces 90%.
      • Chart 1: Insertion Performance (500KB/record; single client, no pre-split), in records/second over 0s to 500s, for hbase (no presplit) and hbase lob.
      • Chart 2: Insertion Delay (pre-split 32 regions, 6 client nodes), in delay/s over 0s to 1200s, for hbase* lob and hbase.
      • Test setup: intel-01 cluster, 6 machines, E5-2620, 24 cores, 48 GB memory. No client cache, no WAL. For HBase* (no split), after insertion, the region count is 20.
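The second test pre-splits the table into 32 regions. HBase accepts a list of split keys when a table is created, and the sketch below shows one way such keys could be computed. It assumes row keys start with a uniformly distributed two-digit hexadecimal prefix, which the slide does not describe; the function names are hypothetical.

```python
import bisect


def split_keys(num_regions, prefix_hex_digits=2):
    """Evenly spaced split points over a hex prefix key space.

    N regions need N - 1 split keys; each key is the first row key
    of the next region.
    """
    space = 16 ** prefix_hex_digits
    if not 1 <= num_regions <= space:
        raise ValueError("num_regions must fit in the prefix key space")
    template = "{:0%dx}" % prefix_hex_digits
    return [template.format(i * space // num_regions) for i in range(1, num_regions)]


def region_index(row_key, keys):
    """Index of the region that would hold row_key (0-based)."""
    return bisect.bisect_right(keys, row_key[: len(keys[0])])


if __name__ == "__main__":
    keys = split_keys(32)
    print(len(keys), keys[0], keys[-1])
    print(region_index("07-plate-A12345", keys), region_index("ff-plate-B67890", keys))
```

Spreading writes over regions from the start avoids the single hot region that an unsplit table begins with.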