
ONNC - 0.9.1 release


ONNC design principle and community operation

Published in: Technology


  1. Connect ONNX to every deep learning accelerator.
  2. Executability: CPU, GPU, DSP, DLA.
  3. Traditional compiler vs. heterogeneous system architecture (HSA):
     • target: single architecture vs. heterogeneous system
     • programming model: CFG and DFG vs. Petri net
     • IR type: three-address code (single output) vs. ONNX IR (multiple outputs)
     • physical feature: depends on opcode vs. depends on operand
  4. Assumptions about target systems:
     • Accelerators are more effective than processors.
     • Processors are more flexible than accelerators.
     • If the communication cost is less than the computation cost, then the task will reside in the accelerator.
     • All tasks start from the top-level processor.
     (Device hierarchy: CPU → DSP → DLA, from most flexible to most effective.)
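     The placement rule on slide 4 can be sketched as a cost comparison. This is an illustrative sketch only, not ONNC code; the function name and cost parameters are assumptions made for the example.

     ```python
     # Hypothetical sketch of slide 4's placement rule: a task moves from the
     # top-level processor down to an accelerator only when communicating its
     # data costs less than computing it on the processor. Names are
     # illustrative, not part of the ONNC API.

     def place_task(comm_cost: float, comp_cost: float) -> str:
         """Return the device class a task should reside on."""
         # Every task starts on the most flexible device, the CPU.
         if comm_cost < comp_cost:
             return "accelerator"   # effective: offloading pays off
         return "processor"         # flexible: keep it on the CPU

     print(place_task(comm_cost=2.0, comp_cost=10.0))  # a heavy CONV: offload it
     print(place_task(comm_cost=5.0, comp_cost=1.0))   # a tiny op: keep it on the CPU
     ```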
  5. Compulsory spill. [Diagram: a CONV moving down the CPU → DSP → DLA hierarchy needs a load/store pair at every device boundary.] Compulsory spill is easy to implement in other compiler frameworks.
  6. Memory spill. [Diagram: intermediate values X and Y are stored to and loaded back from memory around CONV operators; the compiler eliminates these store/load pairs.] Memory spill is what we already have in every compiler framework.
  7. Operator spill. [Diagram: when an operator cannot run on the target device, values W, X, Y, and Z must be spilled with store/load pairs across CPU, DSP, and DLA.] Operator spill is totally new, and required for every accelerator.
  8. What should a compiler do when an operator spill occurs? 1) Push the operator to the upper device. 2) Split the operator. 3) Give up this compilation and retry. In many cases, option 3 is the only possible solution.
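     The decision among the three options on slide 8 can be sketched as a small fallback chain. This is not ONNC code; the `Device`/`Op` types and the `splittable` flag are assumptions invented for the illustration.

     ```python
     # Illustrative sketch (not the ONNC implementation) of resolving an
     # operator spill: try pushing the operator to the upper device, then
     # splitting it, and fall back to retrying the whole compilation.
     from dataclasses import dataclass, field
     from typing import Optional

     @dataclass
     class Device:
         name: str
         ops: set                            # opcodes this device can execute
         upper: Optional["Device"] = None    # next device up the hierarchy
         def supports(self, op: "Op") -> bool:
             return op.opcode in self.ops

     @dataclass
     class Op:
         opcode: str
         splittable: bool = False

     def resolve_operator_spill(op: Op, device: Device) -> str:
         """Pick one of slide 8's three options when `op` spills on `device`."""
         if device.upper is not None and device.upper.supports(op):
             return "push to upper device"   # option 1
         if op.splittable:
             return "split the operator"     # option 2
         return "give up and retry"          # option 3: often the only choice

     cpu = Device("CPU", {"Conv", "Add", "Softmax"})
     dla = Device("DLA", {"Conv"}, upper=cpu)
     print(resolve_operator_spill(Op("Softmax"), dla))  # the CPU can take it
     ```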
  9. Target and compilation model: a traditional compiler targets a single architecture and compiles sequentially; for a heterogeneous system architecture (HSA) target, ONNC compiles ITERATIVELY. Passes form a dependency lattice (e.g. pass D depends on A and B): add D, and the PassManager will add A and B automatically, run the passes in topological-sort order, and retry when needed.
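     Slide 9's "add D, get A and B automatically" behavior can be sketched with a topological sort over the pass lattice. This is a minimal sketch, not the real ONNC PassManager; the dependency dict and `schedule` helper are assumptions for the example.

     ```python
     # Minimal sketch (not the ONNC PassManager) of iterative pass
     # scheduling: collect a target pass's transitive prerequisites, then
     # order them with a topological sort. The whole schedule can simply be
     # re-run when a pass reports failure ("retry").
     from graphlib import TopologicalSorter

     # Hypothetical pass lattice from slide 9: D depends on A and B.
     deps = {"D": {"A", "B"}, "C": {"A"}, "A": set(), "B": set()}

     def schedule(target: str, deps: dict) -> list:
         """Return the passes needed for `target`, prerequisites first."""
         needed, stack = set(), [target]
         while stack:                      # collect the transitive closure
             p = stack.pop()
             if p not in needed:
                 needed.add(p)
                 stack.extend(deps[p])
         order = TopologicalSorter({p: deps[p] & needed for p in needed})
         return list(order.static_order())

     print(schedule("D", deps))  # e.g. ['A', 'B', 'D'] — A and B were added automatically
     ```

     Note that C is not scheduled: only the transitive prerequisites of the requested pass are pulled in.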
  10. Under a limited DLA memory constraint, the paging system saves 377% on average compared with a traditional compiler. [Chart: ratio (origin size / new size) per model.]
      • randnet_manual/test2: 361.25%
      • CaffeNet: 263.58%
      • LeNet: 120.83%
      • yolo9000: 615.86%
      • AlexNet: 312.82%
      • R-CNN-ilsvrc13: 263.34%
      • yolov1: 1079.96%
      • FlickrStyleCaffeNet: 264.32%
      • VGG_ILSVRC_19_layer: 554.49%
      • VGG_ILSVRC_16_layer: 494.60%
      • yolov2-tiny: 443.97%
      • yolov1-tiny: 408.18%
  11. Connects to both LLVM and ASIC backends: no porting effort for the LLVM compiler, and support for complex ASIC designs.
  12. Projects for porters, for developers, and for testers reside in the Regression project and the Umbrella project.
  13. How to contribute: if you have a question, ask it in the mailing list. If you have a wish, ask whether it is specific and whether it is a long wish; a specific, short wish becomes an issue, otherwise bring it to the mailing list first.
  14. Current status: 0.9.1 released; 1.0.0 is the next release, expected around 8/24. Release often, iterate fast (a release every 3–4 weeks).
  15. Give me canned treats (罐罐) and stars, please.