Chainer Update v1.8.0 -> v1.10.0+

Updates in Chainer v1.8.0 through v1.10.0, and planned features in upcoming versions. These slides were used at Chainer Meetup #03 in Tokyo.


  1. Chainer Update v1.8.0 -> v1.10.0+
     Chainer Meetup #03 @ Dwango
     Seiya Tokui, Preferred Networks, Inc.
     2016/07/02
  2. Updates v1.8.0 -> v1.10.0
  3. Many updates since the last meetup
     • v1.8.0 (April 12)
     • v1.9.0 (May 31)
     • v1.10.0 (June 28)
     • Many contributions from the community. Thank you very much!
       (30 PRs from non-PFI/PFN members have been merged since the last
       meetup in March)
  4. New core features
     • CaffeFunction improved: supports Py3 and the ResNet models
     • Weight initializer
       – Most links now accept initializers for their parameters
       – Sample:
           import chainer.links as L, chainer.initializers as I
           L.Linear(784, 1000, initialW=I.Normal(0.01), initialb=I.Constant(0.1))
       – Many built-in initializers
       – You can also write your own initializer easily (it is just a
         function/callable that initializes the elements of a given CPU array)
     • Support for float16 and float64 in many Functions
     • CuPy Profiling API
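The point that a custom initializer "is just a function/callable that initializes the elements of a given CPU array" can be sketched without Chainer at all. The `uniform_init` name and its `scale` parameter below are illustrative choices, not part of Chainer's API:

```python
import numpy as np

def uniform_init(array, scale=0.05):
    # An initializer is just a callable that fills the given
    # (CPU) array in place -- here with uniform noise.
    array[...] = np.random.uniform(-scale, scale, array.shape).astype(array.dtype)

# Allocate an uninitialized weight matrix and initialize it.
W = np.empty((1000, 784), dtype=np.float32)
uniform_init(W)
```

A callable like this could then be passed wherever a link expects an initializer.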
  5. New core features (cont.)
     • ndarray arguments for Functions
       – Functions now accept NumPy/CuPy ndarrays as arguments
       – The arrays are automatically wrapped in Variables
       – Users no longer have to wrap arrays in Variable manually
  6. Many Functions/Links added
     • Variable.__getitem__ (F.get_item)
       – Variable now supports basic indexing (cf. NumPy) with backprop
       – Supports integer indexing, slice indexing, newaxis, and Ellipsis
       – E.g., slice indexing can be used to crop feature maps
     • Array manipulation: cast, clip, log1p, expm1, logsumexp, minimum, permutate
     • NN elements: huber_loss, hard_sigmoid, roi_pooling_2d, StatelessLSTM, Bias, Scale
     • There are also many updates to existing Functions/Links (new options, bug fixes, etc.)
     • See the recent release notes for the full list of new Functions/Links:
       https://github.com/pfnet/chainer/releases
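The feature-map cropping use case mentioned above can be illustrated with NumPy slicing; on a Chainer Variable the same slice syntax would additionally propagate gradients through the crop:

```python
import numpy as np

# A batch of feature maps: (batch, channels, height, width).
x = np.arange(2 * 3 * 8 * 8, dtype=np.float32).reshape(2, 3, 8, 8)

# Crop a 4x4 window from every map with basic slice indexing;
# on a chainer Variable this would also support backprop.
crop = x[:, :, 2:6, 2:6]
```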
  7. New CuPy functions
     • cupy.nonzero
       – Enumerates the indices of non-zero elements
       – Implemented with an inclusive-scan kernel in CUDA
       – We plan to support a wider range of routines that require the scan
         kernel (like cumsum)
     • In parallel, some routines built on nonzero were added: cupy.ix_,
       cupy.flatnonzero
     • Profiling API (cupy.cuda.profile, cupy.cuda.profiler) to enable CUDA
       profile collection only in a specified range of code
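Since CuPy mirrors the NumPy API, the behavior of these routines can be sketched with NumPy on the CPU; the same calls with `cupy` arrays would run the scan kernel on the GPU:

```python
import numpy as np  # cupy mirrors this part of the NumPy API on the GPU

a = np.array([[0, 2, 0],
              [3, 0, 4]])
rows, cols = np.nonzero(a)   # per-axis indices of non-zero elements
flat = np.flatnonzero(a)     # indices into the flattened array
```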
  8. Issue: slow merging of PRs. Solution: more frequent minor releases
     • As noted, many PRs are now coming into the Chainer repository
     • The current cycle of one minor release every six weeks is too slow
     • We decided to make minor releases more frequently
     Until v1.9.0:
     • Revision release (without new features) every 2 weeks
     • Minor release (with new features) every 6 weeks
     From v1.10.0:
     • A release every 2 weeks
     • Any release can contain new features (we increment the minor version
       in that case)
  9. Planned updates after v1.10.0
  10. Planned big features for upcoming releases (v1.11-12)
      • Dataset and Trainer (explained below)
      • cuDNN RNN support
      • Theano function support (use a Theano function as a Chainer Function)
      • Asynchronous to_gpu
  11. Dataset and Trainer
      • Dataset: abstraction of iterations over datasets
      • Trainer: abstraction of training loops
      [Diagram: a Trainer holds an Updater and several Extensions; the
      Updater drives one or more Optimizers, each with a target Link and an
      Iterator over a Dataset. We often use only one optimizer and one
      dataset; the diagram shows the general case.]
  12. Trainer
      • Calls the Updater and Extensions on every iteration
      • Updater
        – Fetches a mini-batch using an Iterator and updates parameters
          using an Optimizer
        – You can customize the update routine
        – Built-in updaters: StandardUpdater, ParallelUpdater (under review)
          (ParallelUpdater provides an easy way to do data-parallel learning)
      • Extension
        – Adds an extra routine to the training loop
        – Basic extensions are built in: Evaluator, LogReport, PrintReport,
          ProgressBar, snapshot, snapshot_object, ExponentialDecay,
          LinearShift, dump_graph
        – You can write your own extensions
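The control flow described above (updater once per iteration, then extensions) can be sketched in plain Python. This is a conceptual sketch only; `run_training_loop` and the trigger/action pairs are hypothetical names, not Chainer's Trainer API:

```python
def run_training_loop(update, extensions, n_iterations):
    # Minimal sketch of the Trainer control flow: run the updater once
    # per iteration, then invoke every extension whose trigger fires.
    log = []
    for it in range(1, n_iterations + 1):
        update(it)
        for trigger, action in extensions:
            if trigger(it):
                action(it, log)
    return log

# A toy "report" extension that fires every 2 iterations.
report_every_2 = (lambda it: it % 2 == 0, lambda it, log: log.append(it))
history = run_training_loop(lambda it: None, [report_every_2], 5)
```

In Chainer, the trigger (e.g. every N iterations or epochs) and the action are likewise attached per extension.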
  13. Dataset / Iterator
      • A Dataset is just a sequence of data points (a.k.a. examples)
      • An Iterator defines how to iterate over the dataset
      • Built-in iterators:
        – SequentialIterator
        – ShuffledIterator
        – MultiprocessIterator (you can easily add multiprocess
          preprocessing with it)
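What a shuffling iterator does each epoch can be sketched in a few lines of plain Python (`shuffled_batches` is an illustrative helper, not Chainer's iterator class): permute the indices, then yield fixed-size mini-batches.

```python
import random

def shuffled_batches(dataset, batch_size, seed=0):
    # Sketch of one epoch of a shuffling iterator: permute the
    # indices, then yield mini-batches of examples.
    order = list(range(len(dataset)))
    random.Random(seed).shuffle(order)
    for i in range(0, len(order), batch_size):
        yield [dataset[j] for j in order[i:i + batch_size]]

batches = list(shuffled_batches(list(range(10)), batch_size=4))
```

A dataset here is literally just a sequence, which matches the slide's definition.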
  14. Reporter: an easy way to report observations
      Trainer uses Reporter to collect observations (e.g. loss value,
      accuracy, activation statistics, etc.)
      Example (a simple Classifier):
        class Classifier(chainer.Chain):
            def __init__(self, predictor):
                super(Classifier, self).__init__(predictor=predictor)

            def __call__(self, x, t):
                y = self.predictor(x)
                loss = F.softmax_cross_entropy(y, t)
                accuracy = F.accuracy(y, t)
                chainer.report({'loss': loss, 'accuracy': accuracy}, observer=self)
                return loss
  15. MNIST Example
      [Screenshot of the MNIST example script, annotated with its parts:
      dataset and iterators, model and optimizer, updater and trainer,
      extensions, and launching the training loop]
  16. Note on Trainer
      • If your training workflow is very different from standard ones, you
        can still write your own training loop
      • We recommend using Trainer for newly written training scripts
        – Most use cases are covered by Trainer
        – You can flexibly customize each component of Trainer
  17. We are planning the first major version upgrade!
      • Planned for release this autumn (Oct.-Nov.)
      • It will break backward compatibility
      Planned features:
      • Split CuPy into a separate repository/package
      • Plugin system and service (we need to discuss this with you!)
        (share Functions/Links/etc. easily with other users without
        sending PRs)
      • Backprop as a graph (i.e., gradients of expressions that themselves
        include gradients)
      • Parameter shape inference (without specifying the "input size")
      • Make CPU/GPU transfer asynchronous by default
      • Parameter annotation
