MulticoreWare Inc - Accelerating Video and Imaging Applications

MCW currently works in a number of compute-intensive application areas, such as HEVC, VP9, OpenCV, imaging, video, codecs, H.264, H.265, broadcast video, and ADAS.

Languages include OpenCL, CUDA, RenderScript, C/C++, and Assembly (Intel, ARM).

Supported platforms include Intel, AMD, ARM, Qualcomm, and Imagination, across the CPU, GPU, and DSP on these platforms.

A deep bench in LLVM technology with significant compiler optimization is also available for licensing and customization.

Published in: Technology, Art & Photos

  1. Global Team
     • Largest independent OpenCL team
     • Founded in 2008
     • 225 strong and growing
     • High ratio of PhDs and Masters
     • Parallel processing leaders
     • Locations: Chennai, St. Louis, Champaign, Sunnyvale, Changchun, Beijing
     • Dr. Wen-mei Hwu, MCW CTO and PI for the UIUC Blue Waters Supercomputer, accepts the Second Annual Achievement Award at GTC 2013
     Copyrights 2014, Confidential, MulticoreWare Inc., February 3, 2014
  2. Industry Leadership
     • Tools leadership role on HSA Foundation
     • Khronos Contributor Member
     • Strategic relationship with University of Illinois at Urbana-Champaign, USA
     • Partnerships with CPU/GPU/FPGA vendors
  3. Capabilities
     • Complete tools
         • Exploration
         • Analysis
         • Performance Tuning
         • Source-to-Source Translation
         • Renderscript
     • World class algorithms
         • Image Processing
         • Video Processing
         • Video Transcoding
         • Cryptography
     • Leading R&D team
     • Professional services
         • Client-specific customized solutions
  4. Professional Services
     • Parallel processing tools
         • Complete OpenCL stack for AMD Fusion
         • C++ AMP
         • Renderscript
     • Clients globally have used MulticoreWare to maximize performance and portability of their software
     • Video Encoding
     • Video Processing
         • Scaling, color space conversion
         • Resizing and rate-conversion
         • De-interlacing and re-interlacing
     • Video Game Engine Acceleration
     • Image Processing
         • Semiconductor wafer defect inspection
         • Raster Image Processor engine parallelization
     • Bioinformatics
         • Accelerated BLAST algorithm for gene sequencing
         • 3500X faster than NIH reference model
  5. Domain Expertise
     • Video Processing
     • Video Transcoding
     • Video Game Engines
     • Image Processing
     • Medical Imaging
     • Seismic data analysis
     • Compression
     • Encryption
     • Fluid Dynamics
     • Compilers (LLVM)
     • Device drivers
  6. Platform Expertise
     • Video and imaging implementations done across many platforms
     • Experience across heterogeneous compute platforms
         • Mobile device platforms to workstations and cloud-based platforms
         • x86 assembly code optimization
         • ARM Mali and NEON optimization
     • Experience across heterogeneous programming models
         • CUDA
         • OpenCL
         • Renderscript
         • C++ AMP
         • MARE
         • HSA
  7. Video Expertise
     • x264 – open source H.264 encoder accelerated for Telestream’s Vantage Encoder
     • MulticoreWare’s H.265 Encoder
     • MulticoreWare’s H.265 Decoder
     • VP9 acceleration
     • Accelerated Video Processing Library: super-resolution, image stabilization, detection and recognition
     • HandBrake
     • FFmpeg
     • VLC
  8. HEVC – Commercially Supported Open Source
     • Compute intensive
         • Larger block size: 64x64 vs. 16x16 in H.264
         • More transform sizes
         • New intra prediction modes
         • Quad-tree structure in processing the Coding Unit (CU)
         • Sample Adaptive Offset (SAO) filter in addition to the deblocking filter
     • New ideas to facilitate parallel processing of data: Tiles, WPP
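The block-size and WPP (wavefront parallel processing) points on the HEVC slide can be made concrete with a small arithmetic sketch. It is only an illustration, not MulticoreWare's encoder logic. It assumes a 1920x1080 frame, one unit of time per 64x64 block (CTU), enough threads for every row, and the usual WPP dependency in which a block waits for its left neighbour and for the block above and to the right. The function names are hypothetical.

```python
import math


def block_count(width, height, block):
    """Number of block x block units needed to cover a width x height frame."""
    return math.ceil(width / block) * math.ceil(height / block)


def wpp_start_step(row, col):
    """Earliest step at which CTU (row, col) can start under WPP.

    Assumes unit time per CTU: each CTU waits for its left neighbour and
    for the top-right CTU of the row above, so every row lags the row
    above it by two CTUs.
    """
    return col + 2 * row


def wpp_total_steps(ctu_cols, ctu_rows):
    """Steps needed to finish a frame when every row has its own thread."""
    return wpp_start_step(ctu_rows - 1, ctu_cols - 1) + 1


if __name__ == "__main__":
    ctus = block_count(1920, 1080, 64)          # 30 x 17 = 510 CTUs
    macroblocks = block_count(1920, 1080, 16)   # 120 x 68 = 8160 macroblocks
    print(ctus, macroblocks)
    print(wpp_total_steps(30, 17))              # 62 steps instead of 510 serial
```

Under these assumptions a 1080p frame has 510 HEVC blocks against 8160 H.264 macroblocks, and an idealized wavefront finishes in 62 steps rather than 510, which is why WPP is attractive on multicore hardware.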
  9. Renderscript
     • MCW
         • Developed Renderscript infrastructure for ARM Mali
         • Developed 2 marquee APKs using ARM A15 & Mali
             • Photo processing: 2-15x speedup over ARM core
             • Video transcoder with filtering and motion stabilization
     • Working closely with Google
         • Enabling VP9 video codec
  10. Image Processing Expertise
     • Cinema DNG (debayering, noise reduction, etc.)
     • GIMP/GEGL – open source Photoshop alternative
         • MCW parallelized GIMP
         • Accelerated kernels for color space conversion
         • Improved calling and data transfer mode between GIMP and GEGL
         • Streamlined redundant operations for improved efficiency of image processing
         • More than 20 algorithms implemented (e.g. image scaling, brightness/contrast control, gamma correction, edge enhancement, color correction)
     • JPEG in browser
         • Implemented accelerated libjpeg-turbo for JPEG decoding
         • Integration of libjpeg-turbo in Chromium
         • Implementation of parallel progressive mode JPEG decoding
         • Implementation of Huffman decoding algorithm
     • OpenCV
         • Performance optimized and author of many functions
         • Key contributor
  11. OpenCV Functions in OpenCL
     • lut, Exp, Log, Add, Mul, Div, Absdiff, CartToPolar, PolarToCart, magnitude, Transpose, Flip, minMax, minMaxLoc
     • Sum, countNonZero, Phase, bitwise_and, bitwise_not, compare, pow, MagnitudeSqr, AddWeighted, blend, BruteForceMatcher, StereoMatchBM, Canny, cvtColor, Blur, Laplacian, Erode
     • Sobel, Scharr, GaussianBlur, filter2D, gemm, Haar, HOG, equalizeHist, CopyMakeBorder, cornerMinEigenVal, cornerHarris, integral, WarpAffine, WarpPerspective, resize, threshold, meanShiftFiltering
     • meanShiftProc, remap, CLAHE, columnSum, matchTemplate, ConvertTo, copyTo, setTo, Moments, norm, PyrLKOpticalFlow, tvl1flow, pyrDown, pyrUp, Merge, Split
  12. Automotive Algorithms in OpenCV
     • Lane keeping
         • Canny
     • AEB
         • HOG
         • Haar
         • Optical flow
     • Traffic sign recognition
         • Hough transform
         • Haar
         • SURF
     • Driver monitor
         • Face/eye detect/tracking
     • Pedestrian detection and avoidance
         • HOG
         • StereoMatch
  13. Other OpenCV Algorithms
     • MCW - lead contributor of OpenCL-accelerated OpenCV
         • Face detection
         • HOG pedestrian detection
         • PyrLK/TVL1 optical flow show
         • Square detection
         • SURF matcher
         • Stereo matcher
         • CLAHE
     • Extensive optimizations are applied to these algorithms
  14. Mobile Application Acceleration
     • MulticoreWare MobileComputeMark Android benchmark app
     • Parallel Path Analyzer for Android
     • Renderscript / OpenCL stack development
     • Photo editing app for ARM
     • Video transcode app for ARM
  15. Accelerated Libraries
     OpenCL
     • VPL
         • Video Processing Library
         • Nearly 80 video kernels for broadcast standards-conversion
     Other Languages
     • VP9 Codec for Google
     • RenderScript
     OpenCL
     • VFL
         • Video pre-processing Filters Library
     • IPL
         • Image Processing Library
     • XAL
         • H.264 Acceleration Library
     • Crypto
         • Crypto++
         • AES
     • Compression
         • XXX_Zip
  16. PPA – Parallel Path Analyzer
     • A performance-visualization tool to identify performance bottlenecks, application critical paths, and system-wide dependencies.
     • Provides flexible, globally time-stamped runtime data collection and post-processing procedures to generate meaningful performance analysis results and display them in intuitive graphical and textual ways.
  17. MxPA Source-2-Source Translator
     • OpenCL to C on Intel X86, OpenMP, others *…
         • Maintain a common code base in OpenCL
         • Support OpenCL-enabled devices or go direct to other compilers as needed
     • Generates C source code for vendor-specific compiler tools
         • Integrated code sequencing and resource utilization for highest performance
         • Highest performance automated code generation method available today
         • Takes advantage of Intel SSE, TBB Translation
     • Close to ASM code performance out of box
         • No need for OpenCL driver support
         • Leverages silicon vendor tool optimizations
     * = NDA needed for more details
     upcrc.illinois.edu
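The MxPA slide does not describe how the translation works (details are under NDA), so the following is only a generic sketch of one well-known way to run a data-parallel kernel without an OpenCL driver: wrap each region between barriers in an ordinary loop over work-items. It is written in Python for brevity rather than as generated C. The kernel (reversing each work-group's slice through local memory) and all names are hypothetical.

```python
def reverse_in_groups(data, local_size):
    """Serial equivalent of a hypothetical kernel that reverses each
    work-group's slice of `data` using work-group local memory.

    The original kernel would read: local[lid] = data[gid]; barrier();
    out[gid] = local[local_size - 1 - lid]. Here each region between
    barriers becomes its own loop over the work-items of a group.
    """
    if len(data) % local_size != 0:
        raise ValueError("global size must be a multiple of local size")

    out = [None] * len(data)
    for group in range(len(data) // local_size):
        base = group * local_size
        local = [None] * local_size  # stands in for __local memory

        # Region before the barrier, executed for every work-item.
        for lid in range(local_size):
            local[lid] = data[base + lid]

        # The barrier is satisfied by the loop boundary: every work-item
        # has finished the first region before any starts the second.

        # Region after the barrier, executed for every work-item.
        for lid in range(local_size):
            out[base + lid] = local[local_size - 1 - lid]
    return out


if __name__ == "__main__":
    print(reverse_in_groups([1, 2, 3, 4, 5, 6], 3))  # [3, 2, 1, 6, 5, 4]
```

Splitting the loop at the barrier keeps the kernel's semantics while producing plain loops that a vendor C compiler can vectorize or thread, which is the general idea behind targeting SSE or OpenMP from a single OpenCL code base.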
  18. Thank You!
     Contact: info@techrevllc.com
     +1-844-TECHREV