MulticoreWare Inc - Accelerating Video and Imaging Applications

MCW currently works in a number of compute-intensive application areas such as HEVC, VP9, OpenCV, imaging, video, codecs, H.264, H.265, broadcast video, and ADAS.

Languages include OpenCL, CUDA, RenderScript, C/C++, and assembly (Intel, ARM).

Supported platforms include Intel, AMD, ARM, Qualcomm, and Imagination, across the CPUs, GPUs, and DSPs of these platforms.

A deep bench in LLVM technology with significant compiler optimization is also available for licensing and customization.

Published in: Technology, Art & Photos

Transcript

  • 1. Global Team
      • Largest Independent OpenCL Team
      • Founded in 2008
      • 225 Strong and Growing
      • High Ratio of PhDs, Masters
      • Parallel Processing Leaders
      • Locations: Chennai, St. Louis, Champaign, Sunnyvale, Changchun, Beijing
      • Dr. Wen-mei Hwu, MCW CTO and PI for the UIUC Blue Waters Supercomputer, accepts the Second Annual Achievement Award at GTC 2013
    Copyrights 2014, Confidential, MulticoreWare Inc., February 3, 2014
  • 2. Industry Leadership
      • Tools leadership role in the HSA Foundation
      • Khronos Contributor Member
      • Strategic Relationship with University of Illinois at Urbana-Champaign, USA
      • Partnerships with CPU/GPU/FPGA Vendors
  • 3. Capabilities
      • Complete Tools: Exploration, Analysis, Performance Tuning, Source-to-Source Translation, Renderscript
      • World Class Algorithms: Image Processing, Video Processing, Video Transcoding, Cryptography
      • Professional Services: Client-specific Customized Solutions
      • Leading R&D Team
  • 4. Professional Services
      • Clients globally have used MulticoreWare to maximize performance and portability of their software
      • Parallel Processing tools
          • Complete OpenCL stack for AMD Fusion
          • C++ AMP
          • Renderscript
      • Video Encoding
      • Video Processing
          • Scaling, color space conversion
          • Resizing and rate-conversion
          • De-interlacing and re-interlacing
      • Video Game Engine Acceleration
      • Image Processing
          • Semiconductor wafer defect inspection
          • Raster Image Processor engine parallelization
      • Bioinformatics
          • Accelerated BLAST algorithm for gene sequencing
          • 3500X faster than NIH reference model
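Slide 4 lists color space conversion among the video processing services. As a rough illustration of why such a step accelerates well, the sketch below converts RGB pixels to YCbCr in plain Python. The coefficients are the widely used full-range BT.601 (JFIF) ones, chosen here as an assumption; the slides do not say which conversions MCW implemented, and the function names are hypothetical.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def convert_frame(frame):
    """Convert a frame given as rows of (r, g, b) tuples.

    Every pixel is independent of its neighbours, which is what lets a
    GPU kernel assign one work-item per pixel.
    """
    return [[rgb_to_ycbcr(*pixel) for pixel in row] for row in frame]


if __name__ == "__main__":
    frame = [[(0, 0, 0), (255, 255, 255)],
             [(128, 128, 128), (255, 0, 0)]]
    for row in convert_frame(frame):
        print(row)
```

An OpenCL or CUDA version would keep the same arithmetic and replace the two nested loops with a 2D grid of work-items.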
  • 5. Domain Expertise
      • Video Processing
      • Video Transcoding
      • Video Game Engines
      • Image Processing
      • Medical Imaging
      • Seismic data analysis
      • Compression
      • Encryption
      • Fluid Dynamics
      • Compilers (LLVM)
      • Device drivers
  • 6. Platform Expertise
      • Video and Imaging implementations done across many platforms
      • Experience across heterogeneous compute platforms
          • Mobile device platforms to workstations and cloud based platforms
          • x86 Assembly Code optimization
          • ARM Mali and NEON optimization
      • Experience across heterogeneous programming models
          • CUDA
          • OpenCL
          • Renderscript
          • C++AMP
          • MARE
          • HSA
  • 7. Video Expertise
      • x264 – Open Source H.264 Encoder accelerated for Telestream’s Vantage Encoder
      • MulticoreWare’s H.265 Encoder
      • MulticoreWare’s H.265 Decoder
      • VP9 Acceleration
      • Accelerated Video Processing Library – Super-resolution, image stabilization, detection and recognition
      • HandBrake
      • FFmpeg
      • VLC
  • 8. HEVC – Commercially Supported Open Source
      • Compute intensive
          • Larger block size: 64x64 vs. 16x16 in H.264
          • More transform sizes
          • New intra prediction modes
          • Quad-tree structure in processing Coding Units (CUs)
          • Sample Adaptive Offset (SAO) filter in addition to the deblocking filter
      • New ideas to facilitate parallel processing of data: Tiles, Wavefront Parallel Processing (WPP)
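Slide 8 names WPP as one of HEVC's parallelism tools. In WPP, a row of coding tree units (CTUs) may start once the first two CTUs of the row above are finished, so rows advance as a diagonal wavefront. The sketch below computes that schedule under two simplifying assumptions that are not from the slides: every CTU takes one time unit, and there is a free thread for every row. The function names are hypothetical.

```python
def wpp_start_times(rows, cols):
    """Earliest start step of each CTU under the WPP dependency rule.

    A CTU waits for its left neighbour and for the top-right neighbour
    in the row above (clamped to the last column).
    Assumes unit cost per CTU and one thread per row.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = []
            if c > 0:
                ready.append(start[r][c - 1] + 1)  # left CTU finished
            if r > 0:
                top_right = min(c + 1, cols - 1)
                ready.append(start[r - 1][top_right] + 1)  # upper row is two CTUs ahead
            start[r][c] = max(ready, default=0)
    return start


def wpp_total_steps(rows, cols):
    """Time steps needed to finish the whole frame with WPP."""
    return wpp_start_times(rows, cols)[-1][-1] + 1


if __name__ == "__main__":
    # 1920x1080 with 64x64 CTUs gives 30 columns and 17 rows.
    rows, cols = 17, 30
    print("sequential steps:", rows * cols)
    print("WPP steps:", wpp_total_steps(rows, cols))
```

Under these assumptions a 1080p frame needs 62 steps instead of 510, which shows both the benefit and the limit: the wavefront needs time to ramp up and drain, so the speedup stays below the number of rows.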
  • 9. Renderscript
      • MCW
          • Developed Renderscript infrastructure for ARM Mali
          • Developed 2 marquee APKs using ARM A15 & Mali
              • Photo processing: 2-15x speedup over ARM core
              • Video transcoder with filtering and motion stabilization
      • Working closely with Google
          • Enabling VP9 video codec
  • 10. Image Processing Expertise
      • CinemaDNG (debayering, noise reduction, etc.)
      • GIMP/GEGL – open source Photoshop alternative
          • MCW parallelized GIMP
          • Accelerated kernels for color space conversion
          • Improved calling and data transfer mode between GIMP and GEGL
          • Streamlined redundant operations for improved efficiency of image processing
          • More than 20 algorithms (e.g. image scaling, brightness/contrast control, gamma correction, edge enhancement, color correction, etc.) implemented
      • JPEG in browser
          • Implemented accelerated libjpeg-turbo for JPEG decoding
          • Integration of libjpeg-turbo in Chromium
          • Implementation of parallel progressive mode JPEG decoding
          • Implementation of Huffman decoding algorithm
      • OpenCV
          • Performance optimized and author of many functions
          • Key contributor
  • 11. OpenCV Functions in OpenCL
      • lut, Exp, Log, Add, Mul, Div, Absdiff, CartToPolar, PolarToCart, magnitude, Transpose, Flip, minMax, minMaxLoc
      • Sum, countNonZero, Phase, bitwise_and, bitwise_not, compare, pow, MagnitudeSqr, AddWeighted, blend, BruteForceMatcher, StereoMatchBM, Canny, cvtColor, Blur, Laplacian, Erode
      • Sobel, Scharr, GaussianBlur, filter2D, gemm, Haar, HOG, equalizeHist, CopyMakeBorder, cornerMinEigenVal, cornerHarris, integral, WarpAffine, WarpPerspective, resize, threshold, meanShiftFiltering
      • meanShiftProc, remap, CLAHE, columnSum, matchTemplate, ConvertTo, copyTo, setTo, Moments, norm, PyrLKOpticalFlow, tvl1flow, pyrDown, pyrUp, Merge, Split
  • 12. Automotive Algorithms in OpenCV
      • Lane keeping
          • Canny
      • AEB
          • HOG
          • Haar
          • Optical flow
      • Traffic sign recognition
          • Hough transform
          • Haar
          • SURF
      • Driver monitor
          • Face/eye detect/tracking
      • Pedestrian detection and avoidance
          • HOG
          • StereoMatch
  • 13. Other OpenCV Algorithms
      • MCW - lead contributor of OpenCL-accelerated OpenCV
          • Face detection
          • HOG pedestrian detection
          • PyrLK/TVL1 optical flow show
          • Square detection
          • SURF matcher
          • Stereo matcher
          • CLAHE
      • Extensive optimizations are applied to these algorithms
  • 14. Mobile Application Acceleration
      • MulticoreWare MobileComputeMark Android benchmark App
      • Parallel Path Analyzer for Android
      • Renderscript / OpenCL Stack Development
      • Photo editing App for ARM
      • Video transcode App for ARM
  • 15. Accelerated Libraries
      • OpenCL
          • VPL
              • Video Processing Library
              • Nearly 80 video kernels for broadcast standards-conversion
          • VFL
              • Video pre-processing Filters Library
          • IPL
              • Image Processing Library
          • XAL
              • H.264 Acceleration Library
          • Crypto
              • Crypto++
              • AES
          • Compression
              • XXX_Zip
      • Other Languages
          • VP9 Codec for Google
          • RenderScript
          • OpenCL
  • 16. PPA – Parallel Path Analyzer
      • A performance-visualization tool to identify performance bottlenecks, application critical paths and system-wide dependencies.
      • Provides flexible, globally time-stamped, runtime data collection and post-processing procedures to generate meaningful performance analysis results and display them in intuitive graphical and textual ways.
  • 17. MxPA Source-2-Source Translator
      • OpenCL to C on Intel X86, OpenMP, others *…
          • Maintain a common code base in OpenCL
          • Support OpenCL enabled devices or go direct to other compilers as needed
      • Generates C source code for vendor specific compiler tools
          • Integrated code sequencing and resource utilization for highest performance
          • Highest performance automated code generation method available today
          • Takes advantage of Intel SSE, TBB Translation
      • Close to ASM code performance out of box
          • No need for OpenCL driver support
          • Leverages silicon vendor tool optimizations
      • * = NDA needed for more details
      • upcrc.illinois.edu
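Slide 17 describes translating OpenCL kernels into plain C that runs without an OpenCL driver. The slides give no detail on how MxPA does this, so the following is only a sketch of one commonly described approach to such translation: split the kernel body at each barrier into regions, then run every region as an ordinary loop over the work-items of a work-group. All names are hypothetical, and Python stands in for the generated C.

```python
def load_region(group_id, lid, local_size, local_mem, src, dst):
    """Kernel code before the barrier: copy one element into local memory."""
    gid = group_id * local_size + lid
    local_mem[lid] = src[gid]


def store_region(group_id, lid, local_size, local_mem, src, dst):
    """Kernel code after the barrier: write the group's data reversed."""
    gid = group_id * local_size + lid
    dst[gid] = local_mem[local_size - 1 - lid]


def run_kernel(regions, global_size, local_size, src, dst):
    """Execute a kernel that was split at barriers into `regions`.

    Finishing one region for all work-items before starting the next
    gives the same ordering guarantee as a work-group barrier.
    """
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    for group_id in range(global_size // local_size):
        local_mem = [None] * local_size       # per-group local memory
        for region in regions:                # one region per barrier interval
            for lid in range(local_size):     # serialized work-items
                region(group_id, lid, local_size, local_mem, src, dst)


if __name__ == "__main__":
    src = list(range(8))
    dst = [None] * 8
    run_kernel([load_region, store_region], 8, 4, src, dst)
    print(dst)
```

The inner work-item loop is a regular counted loop, which is the kind of code a vendor C compiler can vectorize, and the outer work-group loop can be spread across CPU threads.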
  • 18. Thank You!
      • Contact: info@techrevllc.com, +1-844-TECHREV
