Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Harris Gasparakis, AMD

Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Harris Gasparakis, AMD, at the Embedded Vision Alliance Summit, May 2014.

Harris Gasparakis, Ph.D., is AMD’s OpenCV manager. In addition to enhancing OpenCV with OpenCL acceleration, he is engaged in AMD’s computer vision strategic planning, ISV and AMD Ventures engagements, including technical leadership and oversight of the AMD Gesture product line. He holds a Ph.D. in theoretical high-energy physics from YITP at SUNYSB. He is credited with enabling real-time volumetric visualization and analysis in radiology information systems (TeraRecon), including the first commercially available virtual colonoscopy system (Vital Images). He was responsible for cutting-edge medical technology (Biosense Webster, Stereotaxis, Boston Scientific) that incorporated image and signal processing with AI and robotic control.

Published in: Technology

  1. Computer Vision Powered by Heterogeneous System Architecture (HSA)
     Dr. Harris Gasparakis, 5/29/2014
     Copyright © 2014 AMD
  2. AGENDA
     • DEVELOPING EMBEDDED VISION APPLICATIONS: THE PROPRIETARY API LEGACY
     • THE RISE OF GPUS: DENSE DATA PARALLELISM AND CACHE-COHERENT SIMD
     • DO WE STILL NEED CPUs?
     • THE HETEROGENEOUS FUTURE OF VISION: OPENCL™, HSA
     • OPENCL EVOLUTION
     • OPENCL 1.X
     • OPENCL 2.X AND HSA
     • THE OPENCL EXECUTION MODEL
     • MONSTERS IN THE ORCHESTRA, CES 2014
     • PUTTING OPENCL™ 2.0 TO WORK
     • OPENCL™ IN OPENCV
     • OPENCV 3.0: THE TRANSPARENT API
     • HOW DOES IT WORK?
     • CONCLUDING THOUGHTS
  3. Developing Embedded Vision Applications
     Is image processing/vision your core IP?
     • No: Use somebody’s SDK or end product. This talk is not for you.
     • Yes:
       – Choose HW and SW platform
         – A multitude of devices of different capabilities and strengths!
         – A multitude of algorithms of different requirements! Data Parallel (Y/N/M?)
       – Highly specialized programmers
       – Non-portable programs, with high platform risk
  4. The rise of GPUs: Dense Data Parallelism
     • “GPU is great for SIMD: Single Instruction Multiple Data”
     • Image Processing = Dense Data Parallelism
     • Same calculation (e.g. calculate edge strength) for all pixels.
     • Adjacent threads load adjacent “enough” data
     Contrary to popular wisdom, SIMD alone is NOT enough for good GPU acceleration:
     1. Algorithms that are too simple (not enough math per memory transfer): merge kernels
     2. Too much complexity per kernel (high register pressure): split kernels
     [Figure: Sobel]
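To make "same calculation for all pixels" and "merge kernels" concrete, here is a minimal CPU-side sketch of Sobel edge strength. It is not code from the talk: `GrayImage`, `sobelMagnitudeAt` and `sobelMagnitude` are hypothetical names, and the nested loop merely stands in for the grid of GPU work-items. The gradient and magnitude steps are fused in one function, which is one way to raise the amount of math per memory transfer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major 8-bit grayscale image (hypothetical helper type for this sketch).
struct GrayImage {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<std::uint8_t> pixels;  // size == width * height

    std::uint8_t at(std::size_t x, std::size_t y) const {
        return pixels[y * width + x];
    }
};

// Edge strength for one interior pixel: the work a single work-item would do.
// Both gradients and the magnitude are computed together ("merged"), so the
// 3x3 neighbourhood is read once and no intermediate images are stored.
inline float sobelMagnitudeAt(const GrayImage& img, std::size_t x, std::size_t y) {
    assert(x >= 1 && y >= 1 && x + 1 < img.width && y + 1 < img.height);
    const float p00 = img.at(x - 1, y - 1), p10 = img.at(x, y - 1), p20 = img.at(x + 1, y - 1);
    const float p01 = img.at(x - 1, y),     p21 = img.at(x + 1, y);
    const float p02 = img.at(x - 1, y + 1), p12 = img.at(x, y + 1), p22 = img.at(x + 1, y + 1);
    const float gx = (p20 + 2.0f * p21 + p22) - (p00 + 2.0f * p01 + p02);  // horizontal gradient
    const float gy = (p02 + 2.0f * p12 + p22) - (p00 + 2.0f * p10 + p20);  // vertical gradient
    return std::sqrt(gx * gx + gy * gy);
}

// Dense data parallelism: the same calculation for every interior pixel.
// Border pixels are left at 0 to keep the sketch short.
std::vector<float> sobelMagnitude(const GrayImage& img) {
    assert(img.pixels.size() == img.width * img.height);
    std::vector<float> out(img.width * img.height, 0.0f);
    for (std::size_t y = 1; y + 1 < img.height; ++y) {
        for (std::size_t x = 1; x + 1 < img.width; ++x) {
            out[y * img.width + x] = sobelMagnitudeAt(img, x, y);
        }
    }
    return out;
}
```

An unmerged variant would write separate horizontal and vertical gradient images and read them back in a third pass, which is the "not enough math per memory transfer" case the slide warns about.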
  5. Do we still need CPUs?
     • Extensive set of CPU libraries
     • Several approaches to vision (image understanding) can be thought of as a “dense to sparse transition”
     • Sparsity is not a GPU’s friend.
     • OpenCL 2.0 handles this problem far better (and with less code)
     [Figure: Features]
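The "dense to sparse transition" can be illustrated with a small sketch, assuming a per-pixel response map (for example, corner strength) has already been computed. The names `Keypoint` and `extractKeypoints` are hypothetical and the thresholding rule is only an illustration, not the method used in the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One detected feature: pixel position plus the response that triggered it.
struct Keypoint {
    std::size_t x;
    std::size_t y;
    float response;
};

// Dense input (one response value per pixel), sparse output (a short list).
// The output length depends on the data, and every append goes to a shared
// list, which is why this step maps less naturally onto wide SIMD hardware.
std::vector<Keypoint> extractKeypoints(const std::vector<float>& response,
                                       std::size_t width, std::size_t height,
                                       float threshold) {
    assert(response.size() == width * height);
    std::vector<Keypoint> keypoints;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const float r = response[y * width + x];
            if (r > threshold) {
                keypoints.push_back({x, y, r});
            }
        }
    }
    return keypoints;
}
```

Everything downstream of this function works on a handful of keypoints rather than on millions of pixels, which is the kind of workload where a CPU's branching performance tends to pay off.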
  6. The Heterogeneous Future of Vision
     THE RIGHT IP FOR THE RIGHT TASK!
     [Diagram: CPU, GPU, Audio Processor, ISP (Image Signal Processing), Fixed Function Accelerator, Encode, Decode, DSP, Other!, all around Shared Memory]
     • CPU is great for serial tasks
       – Lower latency
       – Good branching performance
       – Lower throughput
       – Good at Task parallelism
       – Better for Sparse Data Parallelism
     • GPU excels at data parallel problems
       – High throughput
       – Possibly High latency
       – Good at “Dense Data Parallelism”
       – Increasingly better at task parallelism (concurrent kernel execution, and OpenCL 2.0 Dynamic parallelism)
     An efficient Heterogeneous System Architecture would be optimal (e.g. GFLOPS/$/W)
  7. HSA FOUNDATION
     [Member logos grouped as: Founders, Promoters, Supporters, Contributors, Academic]
     © Copyright 201 HSA Foundation. All Rights Reserved.
  8. Open Standards
     • OpenCL™: Khronos Software API
       – Cross-platform (Windows, Linux, Mac OS, etc.)
       – Multi-vendor (AMD, Apple, IBM, Intel, NVIDIA, etc.), with maturing support
       – Multicore CPU, discrete GPU, integrated GPU (aka APU), DSP, FPGA, etc.
     • HSA: Heterogeneous System Architecture, an industry standard specification
       – OpenCL 2.0 introduces HSA features
     • Open Source also helps!
       – OpenCV, featuring OpenCL acceleration
  9. OpenCL™ Evolution: Discrete GPU
     OpenCL was created as an open-standard, high-level API for GPU compute, first on discrete graphics cards.
     OpenCL abstracts:
     • Data management across multiple memory spaces
       – Memory buffers / Images
     • Compute Instructions
       – “Kernels”
     • Execution on “compute units” (CU)
     [Diagram: CPU with main memory (host memory and pinned PCIe memory), connected over PCIe™ to a GPU made of compute units (CU) with its own GPU device memory]
  10. OpenCL™ Evolution: Legacy APU
      • APU: Physical Integration: CPU and GPU on same die
      • OpenCL (1.x) works also on APUs
        – Device memory is (part of) main memory, but still must use memory buffers!
      [Diagram: CPU and GPU (compute units, CU) sharing main memory, which is partitioned into host memory, device-visible host memory, device memory, and host-visible device memory]
  11. OpenCL™ Evolution: HSA Enabled APU
      • Unified Coherent Memory enables data sharing across all processors and GPU compute units
      • OpenCL™ 2.x: No need to use memory buffers, just use data pointers, just like you would do on the CPU.
      [Diagram: CPU and GPU (compute units, CU), each with a cache, on top of Unified (Bidirectionally Coherent, pageable) Virtual Memory, backed by Physical Memory]
  12. Monsters in the Orchestra, AMD CES 2014
      A 360° x 90° immersive gesture-enabled experience, enabled by OpenCL (and OpenCV)
  13. Putting OpenCL™ 2.0 to Work
      VISION: Dense to sparse transition
      • Data changed by a kernel can be visible to the CPU before the kernel returns (requires fine-grain SVM).
      • CPU consumes keypoints “as they come”, and updates a “shape model”
      [Diagram labels: GPU, keypoints]
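The "consume keypoints as they come" pattern can be sketched on the CPU alone with standard C++ atomics. This is only an analogy, not OpenCL code: the producer stands in for the GPU kernel, the consumer for the CPU, and an atomically published counter plays the role that atomics on fine-grain shared virtual memory would play. `KeypointStream` and `ShapeModel` are hypothetical names, and the running centroid is a toy stand-in for the slide's "shape model".

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Keypoint {
    float x;
    float y;
};

// Append-only buffer shared by exactly one producer and one consumer.
class KeypointStream {
public:
    explicit KeypointStream(std::size_t capacity)
        : slots_(capacity), published_(0) {}

    // Producer side (stands in for the GPU kernel). Returns false when full.
    bool publish(const Keypoint& kp) {
        const std::size_t n = published_.load(std::memory_order_relaxed);
        if (n == slots_.size()) return false;
        slots_[n] = kp;                                       // write the payload first
        published_.store(n + 1, std::memory_order_release);  // then make it visible
        return true;
    }

    // Consumer side (stands in for the CPU). Visits every keypoint published
    // since the previous poll and returns how many were new.
    template <typename Visitor>
    std::size_t poll(Visitor&& visit) {
        const std::size_t n = published_.load(std::memory_order_acquire);
        const std::size_t fresh = n - consumed_;
        for (; consumed_ < n; ++consumed_) visit(slots_[consumed_]);
        return fresh;
    }

private:
    std::vector<Keypoint> slots_;
    std::atomic<std::size_t> published_;
    std::size_t consumed_ = 0;  // touched only by the consumer
};

// Toy "shape model": a running centroid updated one keypoint at a time.
struct ShapeModel {
    float sumX = 0.0f;
    float sumY = 0.0f;
    std::size_t count = 0;

    void update(const Keypoint& kp) {
        sumX += kp.x;
        sumY += kp.y;
        ++count;
    }
    float centroidX() const { return count ? sumX / static_cast<float>(count) : 0.0f; }
    float centroidY() const { return count ? sumY / static_cast<float>(count) : 0.0f; }
};
```

The release store paired with the acquire load guarantees that a consumer which sees the new count also sees the keypoint written before it. Without shared, coherent memory the consumer would instead have to wait for the kernel to finish and for the results to be copied back.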
  14. The OpenCL Execution Model
      • Host: Device Setup, Compile Kernels, Allocate Memory, Further Processing, Clean up, Other Tasks…
      • Memory: Memory Transfer (discrete GPU), or Zero Copy (APU, OpenCL 1.2), or Shared Virtual Memory (APU, OpenCL 2.0)
      • Compute Device: Kernel 1, Kernel 2, Kernel 2_1, Kernel 2_2, Kernel
      • New in OpenCL 2.0: Dynamic parallelism! A kernel can enqueue another kernel
  15. Open Computer Vision Library
      2,500+ algorithms and functions; cross-platform; BSD license; high performance; professionally developed; 7M+ downloads
      • OpenCL is fully integrated in OpenCV
      • ~100 most commonly used algorithms optimized with OpenCL
      • Can be built without OpenCL SDK installed. Dynamic OpenCL runtime loading
      • OpenCL enabled on the official Windows bin pack
      • OpenCV pre-commit check includes OpenCL tests
      • Very easy to plug in your own kernels using OpenCV plumbing
      • In 2.4.x, OpenCL acceleration is a distinct code path
      Timeline:
      • 2000: First public release
      • 2009: v2.0, C++ API
      • 2013: v2.4.3, OpenCL™
      • ~10/2014: v3.0 alpha, Transparent API
  16. The Need for a Transparent API

      OCV 2.4: Face detect on CPU

          // initialization
          VideoCapture vcap(...);
          CascadeClassifier fd("haar_ff.xml");
          Mat frame, frameGray;
          vector<Rect> faces;
          for (;;) {
              vcap >> frame;                               // grab the next frame
              cvtColor(frame, frameGray, CV_BGR2GRAY);     // convert to grayscale
              equalizeHist(frameGray, frameGray);          // normalize contrast
              fd.detectMultiScale(frameGray, faces);       // run the Haar cascade
          }

      OCV 2.4: Face detect using OpenCL™

          // initialization
          VideoCapture vcap(...);
          ocl::OclCascadeClassifier fd("haar_ff.xml");
          ocl::oclMat frame, frameGray;
          Mat frameCpu;
          vector<Rect> faces;
          for (;;) {
              vcap >> frameCpu;                            // capture into host memory
              frame = frameCpu;                            // upload to the OpenCL device
              ocl::cvtColor(frame, frameGray, CV_BGR2GRAY);
              ocl::equalizeHist(frameGray, frameGray);
              fd.detectMultiScale(frameGray, faces);       // member call, no ocl:: prefix
          }

      OpenCV 2.4: Similar, but not identical code paths. You will need to write code explicitly for both CPU and OpenCL.

      OCV 3.0: Face detect Anywhere!

          // initialization
          VideoCapture vcap(...);
          CascadeClassifier fd("haar_ff.xml");
          UMat frame, frameGray;                           // UMat instead of Mat
          vector<Rect> faces;
          for (;;) {
              vcap >> frame;
              cvtColor(frame, frameGray, COLOR_BGR2GRAY);
              equalizeHist(frameGray, frameGray);
              fd.detectMultiScale(frameGray, faces);
          }

      This code will run, and configure itself differently on different platforms!
  17. How does OpenCV 3.0 T-API work?
      • Easy transition path from 2.x to 3.x. Code that used to work in 2.x should still work. Therefore, cv::Mat is still around.
      • Both Mat and UMat are views into UMatData, which does the heavy lifting.
      • UMatData: reference counts, dirty bits, opaque handles (e.g. clBuffer), CPU data, GPU data. Handles data synchronization efficiently.
      • Conversion between the Mat and UMat views: getMat(…), getUMat(…)
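The idea of two views over one shared data block with dirty bits can be sketched in plain C++. This is not OpenCV's implementation: `SharedData`, `HostView` and `DeviceView` are hypothetical stand-ins for UMatData, Mat and UMat, the "device" is simulated by a second vector, and `std::shared_ptr` supplies the reference counting.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the shared data block behind both views.
struct SharedData {
    std::vector<float> hostCopy;    // CPU data
    std::vector<float> deviceCopy;  // simulates what an opaque device handle would refer to
    bool hostDirty = false;         // host copy is newer than the device copy
    bool deviceDirty = false;       // device copy is newer than the host copy
    std::size_t transfers = 0;      // number of simulated host/device copies

    explicit SharedData(std::size_t n) : hostCopy(n, 0.0f), deviceCopy(n, 0.0f) {}

    // Copy only when the other side actually holds newer data.
    void syncToHost() {
        if (deviceDirty) { hostCopy = deviceCopy; deviceDirty = false; ++transfers; }
    }
    void syncToDevice() {
        if (hostDirty) { deviceCopy = hostCopy; hostDirty = false; ++transfers; }
    }
};

// CPU-side view (plays the role of Mat in this sketch).
class HostView {
public:
    explicit HostView(std::shared_ptr<SharedData> d) : data_(std::move(d)) {}
    float read(std::size_t i) {
        data_->syncToHost();
        return data_->hostCopy[i];
    }
    void write(std::size_t i, float v) {
        data_->syncToHost();
        data_->hostCopy[i] = v;
        data_->hostDirty = true;
    }
private:
    std::shared_ptr<SharedData> data_;
};

// Device-side view (plays the role of UMat in this sketch).
class DeviceView {
public:
    explicit DeviceView(std::shared_ptr<SharedData> d) : data_(std::move(d)) {}
    // Simulated kernel: scales every element "on the device".
    void scale(float factor) {
        data_->syncToDevice();
        for (float& v : data_->deviceCopy) v *= factor;
        data_->deviceDirty = true;
    }
private:
    std::shared_ptr<SharedData> data_;
};
```

Back-to-back device operations trigger no transfer in this sketch; a copy happens only when one side touches data that the other side modified last, which is the kind of lazy synchronization the dirty bits make possible.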
  18. Concluding Thoughts
      • OpenCL™ provides a non-proprietary API, suitable for image processing and vision applications, that works well on multiple platforms.
      • OpenCL 2.0 and HSA enable efficient collaboration between CPU and GPU cores, on equal footing. An evolution that can only be compared to the one from single-core to multi-core CPUs!
      • OpenCV contains lots of OpenCL examples that can be a great starting point for your own projects.
      Join the Open Standards evolution!
  19. Disclaimer & Attribution
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
      AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
      ATTRIBUTION © 2014 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. OpenCL is a trademark of Apple Inc. used by permission by Khronos. Other names are for informational purposes only and may be trademarks of their respective owners.
