
ChainerX and How to Take Part

Introduction to ChainerX: its architectural overview, its internals, and how it is incorporated into Chainer.


  1. ChainerX and How to Take Part. Hiroyuki Vincent Yamazaki, @hvy, Preferred Networks. Mar. 30, 2019. Chainer Meetup #09 @ Preferred Networks.
  2. What makes a modern deep learning framework?
  3. Chainer
     • Speed
       • Fast trial-and-error
       • Fast training and inference
     • Environment support
       • Quick adoption of new hardware/environments
     • Quick deployment
       • Quick application of research outcomes
  4. Chainer + ChainerX
     • Speed
       • Fast trial-and-error
       • Fast training and inference
     • Environment support
       • Quick adoption of new hardware/environments
     • Quick deployment
       • Quick application of research outcomes
  5. This talk is about ChainerX and...
     • how it makes Chainer a modern deep learning framework
     • how it started and where it is heading
     • how to contribute to it
  6. After this talk, you hopefully...
     • understand ChainerX and some of its internals
     • are ready to try ChainerX
     • are curious to modify it to your needs
  7. What is ChainerX? A NumPy-like ndarray library with autograd, built from scratch on the experience gained from Chainer.
  8. How it started
     • Subproject of Chainer started in late 2017
     • Developed with both internal and external Chainer developers
     • Merged into master as of v6.0.0b1 and will be included in v6
     https://github.com/chainer/chainer/tree/master/chainerx
     https://github.com/chainer/chainer/tree/master/chainerx_cc
     @beam2d @niboshi @asi1024 @hvy @sonots @takagi
  9.
     import chainerx as chx

     # Array creation; chx.ndarray, similar to NumPy
     x = chx.ones((2, 3), dtype=chx.float32, device='native')

     # Flag to record the computational graph
     x.require_grad()

     # Define-by-run/eager forward pass, again similar to NumPy
     y = chx.exp(x + 1).sum()

     # Backpropagation
     chx.backward(y)

     # The computed gradient is also a chx.ndarray
     gx = x.grad
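The API list on the next slide also includes chainerx.no_backprop_mode. Assuming it behaves like its Chainer namesake, i.e. as a context manager that disables graph recording, a minimal sketch:

     import chainerx as chx

     x = chx.ones((2, 3), dtype=chx.float32, device='native')
     x.require_grad()

     # Assumption: no_backprop_mode() suppresses graph recording, so no
     # backward pass is possible for arrays computed inside the scope.
     with chx.no_backprop_mode():
         y = chx.exp(x + 1).sum()  # forward only; no graph is recorded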
  10. chainerx.add chainerx.amax chainerx.arange chainerx.argmax chainerx.array chainerx.asanyarray chainerx.asarray chainerx.ascontiguousarray chainerx.average_pool chainerx.batch_norm chainerx.broadcast_to chainerx.clip chainerx.concatenate chainerx.conv chainerx.conv_transpose chainerx.copy chainerx.diag chainerx.diagflat chainerx.divide chainerx.dot chainerx.empty chainerx.empty_like chainerx.equal chainerx.exp chainerx.eye chainerx.fixed_batch_norm chainerx.floor_divide chainerx.frombuffer chainerx.fromfile chainerx.fromfunction chainerx.fromiter chainerx.fromstring chainerx.full chainerx.full_like chainerx.greater chainerx.greater_equal chainerx.hstack chainerx.identity chainerx.isfinite chainerx.isinf chainerx.isnan chainerx.less chainerx.less_equal chainerx.linear chainerx.linspace chainerx.loadtxt chainerx.log chainerx.log_softmax chainerx.logical_not chainerx.logsumexp chainerx.max chainerx.max_pool chainerx.maximum chainerx.minimum chainerx.multiply chainerx.ndarray chainerx.negative chainerx.not_equal chainerx.ones chainerx.ones_like chainerx.ravel chainerx.relu chainerx.reshape chainerx.sigmoid chainerx.split chainerx.sqrt chainerx.square chainerx.squeeze chainerx.stack chainerx.subtract chainerx.sum chainerx.take chainerx.tanh chainerx.to_numpy chainerx.transpose chainerx.true_divide chainerx.vstack chainerx.zeros chainerx.zeros_like
     chainerx.activation chainerx.creation chainerx.random chainerx.manipulation chainerx.math
     chainerx.dtype chainerx.bool chainerx.bool_ chainerx.float chainerx.float16 chainerx.float32 chainerx.float64 chainerx.int chainerx.int16 chainerx.int32 chainerx.int64 chainerx.int8 chainerx.uint8 chainerx.all_dtypes
     chainerx.Context chainerx.ContextScope chainerx.Backend chainerx.BackpropId chainerx.BackpropScope chainerx.Device chainerx.DeviceScope chainerx.ForceBackpropMode chainerx.NoBackpropMode
     chainerx.grad chainerx.backprop_scope chainerx.backward chainerx.check_backward chainerx.check_double_backward chainerx.context_scope chainerx.force_backprop_mode chainerx.get_backend chainerx.get_default_context chainerx.get_default_device chainerx.get_device chainerx.is_available chainerx.is_backprop_required chainerx.no_backprop_mode chainerx.set_default_context chainerx.using_device chainerx.newaxis …
  11. Why ChainerX? Speed, environment support and quick deployment.
  12.
     • Written in C++
       • Speed
       • No Python runtime required for deployment
     • Python binding on top
       • Lightweight
       • 1-to-1 C++ mappings
     • Pluggable backends
       • Extensible to new hardware/environments
     [Diagram: a Python binding on top of autograd and a backpropable ndarray, resting on a Backend/Device interface that hosts native, CUDA and custom backends/devices]
  13. C++ API:

     #include "chainerx.h"
     namespace chx = chainerx;

     chx::Array x = chx::Ones(
         {2, 3},
         chx::Dtype::kFloat32,
         chx::GetDevice("native"));
     x.RequireGrad();
     chx::Array y = chx::Exp(x + 1).Sum();
     chx::Backward(y);
     chx::Array gx = *x.GetGrad();

     Python API:

     import chainerx as chx

     x = chx.ones(
         (2, 3), dtype=chx.float32,
         device='native')
     x.require_grad()
     y = chx.exp(x + 1).sum()
     chx.backward(y)
     gx = x.grad
  14. ChainerX internals: explaining basic types and functions.
  15. (chainerx namespace omitted for clarity)

     // Create input ndarrays
     chx::Array x = ...
     chx::Array w = ...
     chx::Array b = ...

     // Flag to record the computational graph
     x.RequireGrad();
     w.RequireGrad();
     b.RequireGrad();

     // Call a routine to create a graph. Internally uses chx::BackwardBuilder to do so
     chx::Array y = chx::Conv(x, w, b, {1, 1}, {1, 1});

     [Diagram: Arrays x, w, b and y each own an ArrayBody holding an ArrayNode; the ArrayNodes of x, w and b feed the Conv OpNode, which produces the ArrayNode of y]
  16. chainerx::Array (chainerx::ArrayBody)
     • Core data type in ChainerX, an ndarray with autograd
     • Has ndarray properties such as
       • a pointer to the allocated data, shape, dtype, strides
     • Associated with a single device
       • Data resides on e.g. "native" or "cuda:2"
     • Holds references to its
       • gradients, also chainerx::Arrays
       • nodes in the computational graphs
     [Diagram: Array x points to an ArrayBody that holds the device, the data and an ArrayNode, plus a reference to its gradient Array gx with its own ArrayBody]
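To make these properties concrete, a short Python sketch using the binding's NumPy-like attributes (shape, dtype, device and grad; grad appeared on slide 9, and the printed device name "native:0" assumes the default device index):

     import chainerx as chx

     x = chx.ones((2, 3), dtype=chx.float32, device='native')
     x.require_grad()
     chx.backward(chx.exp(x + 1).sum())

     print(x.shape)   # (2, 3), an ndarray property
     print(x.dtype)   # float32
     print(x.device)  # native:0, the single associated device
     print(x.grad)    # the gradient, itself a chainerx.ndarray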
  17. chainerx::ArrayNode
     • A node representing an array in the computational graph
     • Owned by chainerx::ArrayBody
     [Diagram: the graph from slide 15; the ArrayNodes of x, w and b feed the Conv OpNode, which produces the ArrayNode of y]
  18. chainerx::OpNode
     • A node representing an operation in the computational graph
     • Referenced by chainerx::ArrayNode
     [Diagram: the same graph, highlighting the Conv OpNode between the input ArrayNodes and the output ArrayNode]
  19. chainerx::Device (1/2)
     • An array is constructed by specifying the allocating device:

       chainerx::Device& gpu = chainerx::GetDevice("cuda:0");
       chainerx::Array x = chainerx::Ones({2, 3}, chainerx::Dtype::kFloat32, gpu);

     • A device defines
       • how memory is allocated and freed
         • chainerx::Device::Allocate
       • operations on data
         • chainerx::Device::{Fill, Arange, Add, Subtract, Multiply, Divide, Sum, Dot, ...}
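From Python, the device can also be selected with a scope instead of being passed explicitly; chainerx.using_device and chainerx.get_default_device both appear in the API list on slide 10. A minimal sketch, assuming using_device accepts a device name and acts as a context manager:

     import chainerx as chx

     # Assumption: using_device temporarily switches the default device,
     # so arrays created inside the scope are allocated there.
     with chx.using_device('native:0'):
         x = chx.ones((2, 3), dtype=chx.float32)

     print(chx.get_default_device())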
  20. chainerx::Device (2/2)
     • chainerx::Device is an interface
     • Concrete implementations provided by ChainerX
       • chainerx::native::NativeDevice
       • chainerx::cuda::CudaDevice
     • Can be implemented for other devices and dynamically loaded as shared libraries
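Concrete devices are looked up by name through this interface; chainerx.get_device is in the API list on slide 10. A small sketch (the name and backend attributes are assumptions about the Python binding):

     import chainerx as chx

     # Assumption: device names take the form "<backend>:<index>".
     cpu = chx.get_device('native:0')
     print(cpu.name)     # 'native:0' (assumed attribute)
     print(cpu.backend)  # the owning backend object (assumed attribute)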
  21. Routines (1/2)
     • Backpropable autograd operations on chainerx::Arrays
     • chainerx::{Add, Subtract, Multiply, Divide, Sum, Transpose, Reshape, Dot, Conv, BatchNorm, MaxPool, ...}
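A minimal illustration that these routines record the graph as they run, using chainerx.dot from the API list on slide 10 (the expected gradient follows analytically: d(sum(a·b))/da is the matrix of row sums of b):

     import chainerx as chx

     a = chx.ones((2, 3), dtype=chx.float32)
     b = chx.ones((3, 4), dtype=chx.float32)
     a.require_grad()

     # dot and sum are backpropable routines; backward fills in a.grad.
     y = chx.dot(a, b).sum()
     chx.backward(y)
     print(a.grad)  # every element is 4.0: the row sums of b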
  22. Routines (2/2)
     • Define forward and backward logic using chainerx::BackwardBuilder
     • Delegate the actual computations to the device methods
       • chainerx::Dot calls chainerx::Device::Dot

     Array Dot(const Array& a, const Array& b, Dtype dtype) {
         int64_t m = a.shape()[0];
         int64_t k = a.shape()[1];  // inner dimension (used for shape checks, omitted here)
         int64_t n = b.shape()[1];
         Array out = Empty({m, n}, dtype, a.device());
         {
             // Run the device computation without recording a graph
             NoBackpropModeScope scope{};
             a.device().Dot(a, b, out);
         }
         {
             BackwardBuilder bb{"dot", {a, b}, out};
             // Backward definition w.r.t. the first input, a
             if (BackwardBuilder::Target bt = bb.CreateTarget(0)) {
                 bt.Define([b_tok = bb.RetainInput(1), a_dtype = a.dtype()](BackwardContext& bctx) {
                     const Array& b = bctx.GetRetainedInput(b_tok);
                     bctx.input_grad() = Dot(*bctx.output_grad(), b.Transpose(), a_dtype);
                 });
             }
             // Backward definition w.r.t. the second input, b
             if (BackwardBuilder::Target bt = bb.CreateTarget(1)) {
                 bt.Define([a_tok = bb.RetainInput(0), b_dtype = b.dtype()](BackwardContext& bctx) {
                     const Array& a = bctx.GetRetainedInput(a_tok);
                     bctx.input_grad() = Dot(a.Transpose(), *bctx.output_grad(), b_dtype);
                 });
             }
             bb.Finalize();
         }
         return out;
     }
  23. Chainer integration: how ChainerX can be used from Chainer.
  24. Architecture
     • Various APIs in Chainer v6 work with and utilize chainerx
     • Variable and FunctionNode delegate autograd computations to ChainerX
     [Diagram: Chainer's training/model APIs and Variable/function APIs run on either NumPy or CuPy with Chainer's own autograd, or on the ChainerX stack: Python binding, autograd, backpropable ndarray and the Backend/Device interface with native, CUDA and custom backends]
  25. Chainer

     import chainer as ch
     import cupy as cp

     class ResNet50(ch.Chain):
         …

     model = ResNet50()
     model.to_device(0)

     arr = cp.array(...)
     x = ch.Variable(arr)
     y = model(x)
     loss = …
     loss.backward()

     [Diagram: the architecture from slide 24 with the NumPy/CuPy + Chainer autograd path in use]
  26. Chainer on ChainerX

     import chainer as ch
     import chainerx as chx

     class ResNet50(ch.Chain):
         …

     model = ResNet50()
     model.to_device('cuda:0')

     arr = chx.array(...)
     x = ch.Variable(arr)
     y = model(x)
     loss = …
     loss.backward()

     [Diagram: the architecture from slide 24 with the ChainerX stack in use]
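For moving data back out of ChainerX, chainerx.to_numpy appears in the API list on slide 10. A minimal sketch, assuming the one-argument form returns a copy of the data as a numpy.ndarray:

     import numpy as np
     import chainerx as chx

     x = chx.ones((2, 3), dtype=chx.float32, device='native')

     # Assumption: to_numpy(x) copies x's data into a numpy.ndarray.
     x_np = chx.to_numpy(x)
     assert isinstance(x_np, np.ndarray)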
  27. How to take part in developing ChainerX: the contribution guide explained.
  28. It’s all documented
     • A section in the Chainer documentation: https://docs.chainer.org/en/latest/chainerx/index.html
     • On GitHub
       • Look for issues/PRs labeled "ChainerX" and "contribution-welcome"
     • ChainerX needs to support more routines
       • A list of unimplemented routines: https://github.com/chainer/chainer/issues/6423
  29. Future of ChainerX
  30. Future roadmap
     • Integrate into Chainer
     • Wider range of supported routines
     • Dynamic device operation registration
     • Concrete third-party backends
     • Stable C++ interface
     • Wider coverage of “compiled models”
  31. Summary
     ChainerX is implemented in C++ with far less host-side overhead. It is usable from Python-free deployments and lets third parties implement backends and devices for new hardware/environment support. Being accessible via Python and used by Chainer, it takes Chainer to the next level.
  32. ...and you can take part
     ChainerX is developed on GitHub; contributions, ideas and discussions are welcome.
     • Follow @ChainerOfficial on Twitter
     • Join chainer on Slack
     • We are hiring: apply at https://www.preferred-networks.jp/en/jobs
  33. Additional resources
     • ChainerX documentation
     • ChainerX Product Backlog
     • ChainerX examples (MLP, ResNet50)
     • ChainerX Python bindings
     • ChainerX C++ backpropagation
