Chainer v3
Chainer Meetup #06 @ PFN, Sep. 30, 2017
Seiya Tokui @ Preferred Networks
Recent/coming releases
• Chainer v3.0.0 RC, v2.1.0: Sep. 12
• v3 RC was the 50th release!
• CuPy v2.0.0 RC, v1.0.3 on the same day
• Next release: Chainer v3.0.0 and v4.0.0α on Oct. 17
• CuPy v2.0.0 and v3.0.0α on the same day
• Today I will mainly talk about the features of CuPy v2.0.0 RC and
Chainer v3.0.0 RC
Chainer v3.0.0rc1
• For most users, backward compatibility is maintained
• See the release notes of v3.0.0rc1 for some small breaking changes that
do not affect most users
• The internals have changed substantially
• Existing code that directly touches the computational graph may
break
• Thanks to this change, we now support double backprop
(a.k.a. gradient of gradients) as announced
Double backprop
• Automatic backpropagation through gradients
• When is it needed?
• Consider a loss function that includes a gradient computation as a
term/factor
• E.g. the loss function for WGAN-GP:
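$$\mathbb{E}_{\tilde{x}\sim\mathbb{P}_g}\left[D(\tilde{x})\right] - \mathbb{E}_{x\sim\mathbb{P}_r}\left[D(x)\right] + \lambda\,\mathbb{E}_{\hat{x}\sim\mathbb{P}_{\hat{x}}}\left[\left(\lVert\nabla_{\hat{x}} D(\hat{x})\rVert_2 - 1\right)^2\right]$$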
• To take the gradient of this loss function, we need to backprop
through $\nabla_{\hat{x}} D(\hat{x})$, which we also want to compute with backprop!
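A small numerical check may make the point concrete. The sketch below uses a toy scalar critic D(x) = tanh(w·x), which is an assumption for illustration only and not part of the talk. It shows that the derivative of the penalty with respect to the parameter w involves a second derivative of D, and compares the hand-derived value with a finite difference.

```python
import math


def critic(w, x):
    """Toy scalar critic D(x) = tanh(w * x); w plays the role of a parameter."""
    return math.tanh(w * x)


def grad_x(w, x):
    """First derivative dD/dx = w * (1 - tanh(w x)^2)."""
    t = math.tanh(w * x)
    return w * (1.0 - t * t)


def penalty(w, x):
    """Gradient penalty (|dD/dx| - 1)^2 for a single sample."""
    return (abs(grad_x(w, x)) - 1.0) ** 2


def penalty_grad_w(w, x):
    """d(penalty)/dw, which needs the mixed second derivative of D."""
    t = math.tanh(w * x)
    g = w * (1.0 - t * t)
    # d/dw of dD/dx: this is the "gradient of a gradient" term.
    dg_dw = (1.0 - t * t) * (1.0 - 2.0 * w * x * t)
    return 2.0 * (abs(g) - 1.0) * math.copysign(1.0, g) * dg_dw


w, x, h = 0.7, 1.3, 1e-5
analytic = penalty_grad_w(w, x)
# Central finite difference of the penalty with respect to w.
numeric = (penalty(w + h, x) - penalty(w - h, x)) / (2.0 * h)
print(analytic, numeric)
```

Double backprop automates the hand derivation in `penalty_grad_w`, which quickly becomes impractical for a real network.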
Double backprop in Chainer v3
• Many functions now support double backprop
• Those functions are rewritten to implement a new interface named
FunctionNode (such functions are called new-style Functions)
• backward() takes Variables instead of ndarrays as grad_outputs
and also returns Variables, which means backward() itself can be
differentiated
• Variable now has an attribute grad_var, which represents
the gradient as a Variable (so that we can use it in the
computational graph)
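The idea that a backward pass built from graph operations can itself be differentiated can be illustrated with a toy reverse-mode autodiff. This is a minimal pure-Python sketch, not Chainer's implementation; the names `Var` and `grad` are hypothetical and only addition and multiplication are supported.

```python
class Var:
    """Scalar node in a toy computational graph."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        # Each parent is (input Var, function mapping an upstream-gradient
        # Var to this input's gradient contribution, also a Var).
        self.parents = parents

    def __add__(self, other):
        other = _as_var(other)
        return Var(self.value + other.value,
                   ((self, lambda g: g), (other, lambda g: g)))

    def __mul__(self, other):
        other = _as_var(other)
        # The backward rules use Var arithmetic, so they extend the graph.
        return Var(self.value * other.value,
                   ((self, lambda g: g * other), (other, lambda g: g * self)))


def _as_var(x):
    return x if isinstance(x, Var) else Var(x)


def grad(y, x):
    """Return dy/dx as a Var that can be differentiated again."""
    order, seen = [], set()

    def visit(v):
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)  # post-order: inputs come before their consumers

    visit(y)
    grads = {id(y): Var(1.0)}
    for v in reversed(order):  # consumers first, inputs last
        g = grads.get(id(v))
        if g is None:
            continue
        for parent, backward_fn in v.parents:
            contrib = backward_fn(g)
            if id(parent) in grads:
                grads[id(parent)] = grads[id(parent)] + contrib
            else:
                grads[id(parent)] = contrib
    return grads.get(id(x), Var(0.0))


x = Var(3.0)
y = x * x * x        # y = x^3
gx = grad(y, x)      # 3 x^2, still a graph node
ggx = grad(gx, x)    # 6 x, obtained by differentiating the gradient
print(gx.value, ggx.value)
```

Because every backward rule returns a `Var`, the first gradient `gx` carries its own graph, which is the property that `grad_var` exposes in the slides above.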
How to implement WGAN-GP
1. Using Variable.backward()
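x_tilde = generator(z)
x_hat = x + u * (x_tilde - x)
# 1st diff: keep the graph of the gradient computation
D(x_hat).backward(enable_double_backprop=True)
# lam is the penalty weight (lambda is a reserved word in Python).
# Schematic: the full penalty takes the per-sample L2 norm of the gradient.
gp = lam * (x_hat.grad_var - 1) ** 2
loss = D(x_tilde) - D(x) + gp
model.cleargrads()  # clear the 1st-diff gradients of the parameters
loss.backward()  # 2nd diff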
How to implement WGAN-GP
2. Using grad()
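import chainer

x_tilde = generator(z)
x_hat = x + u * (x_tilde - x)
gx_hat, = chainer.grad([D(x_hat)], [x_hat],
                       enable_double_backprop=True)  # 1st diff
# lam is the penalty weight (lambda is a reserved word in Python).
# Schematic: the full penalty takes the per-sample L2 norm of the gradient.
gp = lam * (gx_hat - 1) ** 2
loss = D(x_tilde) - D(x) + gp
loss.backward()  # 2nd diff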
This version is more efficient because grad() can skip the gradient
computation for parameters (so cleargrads() is no longer needed).
New-style Function support
• Most “standard” functions are now ported to the new-style
interface:
+, -, *, Convolution2D, Deconvolution2D, EmbedID, Linear,
LSTM, BatchNormalization, sigmoid, relu, leaky_relu, softmax,
log_softmax, tanh, exp, mean_squared_error,
softmax_cross_entropy, dropout, layer_normalization,
transpose, reshape, broadcast_to, sum, concat, __getitem__,
etc…
• We are still working on widening the double backprop
support. Contributions are also welcome!!
Other features
• Functions: layer_normalization, selu, arctan2, prod,
NumPy-compatible matmul
• Links: ChildSumTreeLSTM, NaryTreeLSTM,
BatchRenormalization
• Other new features: LeCunNormal, as_variable(),
Variable.array, strict option of load_npz(), etc.
CuPy v2.0.0rc1
• Sparse matrix support
• Complex number support
• Improved memory allocator
• Many new functions, esp. linear algebra routines
Sparse matrix support
• cupy.sparse: sparse matrix support with APIs
compatible with scipy.sparse
• CSR/CSC/COO and diagonal formats
• Basic arithmetic, matrix product, element indexing
• Slicing along the major axis
• Dense <-> Sparse conversion
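To show what the CSR format stores and why slicing along the major axis (rows, for CSR) is cheap, here is a minimal pure-Python sketch. It does not use cupy.sparse; the helper names `dense_to_csr`, `csr_matvec`, and `csr_row_slice` are hypothetical.

```python
def dense_to_csr(dense):
    """Convert a list-of-lists matrix to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        indptr.append(len(data))  # row i occupies data[indptr[i]:indptr[i + 1]]
    return data, indices, indptr


def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR matrix by a dense vector, touching only stored entries."""
    y = []
    for i in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


def csr_row_slice(data, indices, indptr, start, stop):
    """Take rows start..stop-1; only contiguous ranges need to be copied."""
    lo, hi = indptr[start], indptr[stop]
    new_indptr = [p - lo for p in indptr[start:stop + 1]]
    return data[lo:hi], indices[lo:hi], new_indptr


dense = [[1, 0, 2],
         [0, 0, 3],
         [4, 5, 0]]
data, indices, indptr = dense_to_csr(dense)
product = csr_matvec(data, indices, indptr, [1, 1, 1])
tail = csr_row_slice(data, indices, indptr, 1, 3)
print(data, indices, indptr, product, tail)
```

A column slice of a CSR matrix would have to inspect every stored entry, which is one plausible reason to support the major axis first.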
Complex number support
• CuPy now supports complex numbers!
• Dtypes complex64 and complex128 are now available
• Routines related to complex numbers:
angle, conj, imag, real
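The four routines behave as in NumPy. The sketch below runs on NumPy so that it works without a GPU; on a machine with CuPy installed, replacing `np` with `cupy` is expected to give the same results on the device, though that substitution is an assumption and is not tested here.

```python
import numpy as np

# Two complex numbers: 1 + i and -2i.
z = np.array([1 + 1j, -2j], dtype=np.complex64)

phase = np.angle(z)        # argument of each element, in radians
conjugate = np.conj(z)     # flips the sign of the imaginary part
real_part = np.real(z)
imag_part = np.imag(z)

print(phase, conjugate, real_part, imag_part)
```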
Linear algebra routines
• Solvers, matrix inversion, determinant, eigenvalues, etc.:
solve, tensorsolve, inv, pinv, det, slogdet, eigh,
eigvalsh, matrix_rank
• All under cupy.linalg namespace
• einsum is also supported (thanks, @fukatani!)
• Flexible tensor product/reduction based on Einstein convention
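A few einsum subscript strings illustrate the Einstein convention: an index that appears in the inputs but not in the output is summed over. The sketch uses NumPy as a stand-in so that it runs without a GPU; the assumption (untested here) is that the CuPy function accepts the same subscripts.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)

# j is repeated and absent from the output, so it is summed: matrix product.
matmul = np.einsum('ij,jk->ik', a, b)

# Repeated index on one operand with empty output: trace.
trace = np.einsum('ii->', np.eye(3))

# b is kept, i is summed: row-wise dot product of a with itself.
batch_dot = np.einsum('bi,bi->b', a, a)

print(matmul.shape, trace, batch_dot)
```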
Improved memory allocator
• The memory pool is greatly improved
• It now uses a “best-fit with coalescing” algorithm
• The memory region is reused even if the size does not exactly match
• It may also contribute to the speed improvement, thanks to the
reduced number of reallocations
• Example: the new seq2seq example originally used all the
memory of a 12GB GPU; usage is now reduced to 3GB, and
the execution time is also reduced by approx. 25%.
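The phrase "best-fit with coalescing" can be sketched with a toy pool over a single arena. This is an illustration of the general technique only, not CuPy's allocator; the class `ToyPool` and its sizes are hypothetical, and real pools add details such as size rounding.

```python
class ToyPool:
    """Toy memory pool: best-fit allocation with coalescing of free chunks."""

    def __init__(self, capacity):
        self.free_chunks = [(0, capacity)]  # (offset, size), sorted by offset
        self.in_use = {}                    # offset -> size

    def malloc(self, size):
        # Best fit: the smallest free chunk that is large enough.
        fits = [c for c in self.free_chunks if c[1] >= size]
        if not fits:
            raise MemoryError("no free chunk is large enough")
        offset, chunk_size = min(fits, key=lambda c: c[1])
        self.free_chunks.remove((offset, chunk_size))
        if chunk_size > size:
            # Reuse a chunk whose size does not match exactly: split it
            # and keep the remainder in the free list.
            self.free_chunks.append((offset + size, chunk_size - size))
            self.free_chunks.sort()
        self.in_use[offset] = size
        return offset

    def free(self, offset):
        size = self.in_use.pop(offset)
        self.free_chunks.append((offset, size))
        self.free_chunks.sort()
        # Coalescing: merge free chunks that touch each other.
        merged = [self.free_chunks[0]]
        for off, sz in self.free_chunks[1:]:
            last_off, last_sz = merged[-1]
            if last_off + last_sz == off:
                merged[-1] = (last_off, last_sz + sz)
            else:
                merged.append((off, sz))
        self.free_chunks = merged


pool = ToyPool(100)
a = pool.malloc(50)
b = pool.malloc(10)
c = pool.malloc(20)
d = pool.malloc(20)
pool.free(a)          # free chunks: 50 units at offset 0
pool.free(c)          # plus 20 units at offset 60
e = pool.malloc(15)   # best fit picks the 20-unit chunk, not the 50-unit one
pool.free(b)          # b touches the chunk at offset 0, so the two merge
print(e, pool.free_chunks)
```

A first-fit policy would have carved the 15-unit request out of the large chunk at offset 0, leaving less room for a later large request.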
Next versions
• As you may know, we slightly changed the release policy
again; the stable releases may now include some new
features (thus v2.1.0 instead of v2.0.3).
• v4 is scheduled based on our release policy: v4.0.0 will be
three months after v3.0.0 (which will be mid Jan. if there is no
delay).
• The core features of v4 are not determined yet; let’s have
discussions!
Chainer v3

More Related Content

What's hot

Chainer v2 alpha
Chainer v2 alphaChainer v2 alpha
Chainer v2 alpha
Seiya Tokui
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
Shunta Saito
 
GTC Japan 2016 Chainer feature introduction
GTC Japan 2016 Chainer feature introductionGTC Japan 2016 Chainer feature introduction
GTC Japan 2016 Chainer feature introduction
Kenta Oono
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
Preferred Networks
 
Chainer v2 and future dev plan
Chainer v2 and future dev planChainer v2 and future dev plan
Chainer v2 and future dev plan
Seiya Tokui
 
CuPy v4 and v5 roadmap
CuPy v4 and v5 roadmapCuPy v4 and v5 roadmap
CuPy v4 and v5 roadmap
Preferred Networks
 
IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」
Preferred Networks
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
Shohei Hido
 
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
Preferred Networks
 
Automatically Fusing Functions on CuPy
Automatically Fusing Functions on CuPyAutomatically Fusing Functions on CuPy
Automatically Fusing Functions on CuPy
Preferred Networks
 
Deep Learning with PyTorch
Deep Learning with PyTorchDeep Learning with PyTorch
Deep Learning with PyTorch
Mayur Bhangale
 
Introduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep LearningIntroduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep Learning
Seiya Tokui
 
CUDA and Caffe for deep learning
CUDA and Caffe for deep learningCUDA and Caffe for deep learning
CUDA and Caffe for deep learning
Amgad Muhammad
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
PeterAndreasEntschev
 
PyTorch Tutorial for NTU Machine Learing Course 2017
PyTorch Tutorial for NTU Machine Learing Course 2017PyTorch Tutorial for NTU Machine Learing Course 2017
PyTorch Tutorial for NTU Machine Learing Course 2017
Yu-Hsun (lymanblue) Lin
 
PyTorch crash course
PyTorch crash coursePyTorch crash course
PyTorch crash course
Nader Karimi
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1
Kenta Oono
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
MLconf
 
[Update] PyTorch Tutorial for NTU Machine Learing Course 2017
[Update] PyTorch Tutorial for NTU Machine Learing Course 2017[Update] PyTorch Tutorial for NTU Machine Learing Course 2017
[Update] PyTorch Tutorial for NTU Machine Learing Course 2017
Yu-Hsun (lymanblue) Lin
 
第11回 配信講義 計算科学技術特論A(2021)
第11回 配信講義 計算科学技術特論A(2021)第11回 配信講義 計算科学技術特論A(2021)
第11回 配信講義 計算科学技術特論A(2021)
RCCSRENKEI
 

What's hot (20)

Chainer v2 alpha
Chainer v2 alphaChainer v2 alpha
Chainer v2 alpha
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
 
GTC Japan 2016 Chainer feature introduction
GTC Japan 2016 Chainer feature introductionGTC Japan 2016 Chainer feature introduction
GTC Japan 2016 Chainer feature introduction
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
 
Chainer v2 and future dev plan
Chainer v2 and future dev planChainer v2 and future dev plan
Chainer v2 and future dev plan
 
CuPy v4 and v5 roadmap
CuPy v4 and v5 roadmapCuPy v4 and v5 roadmap
CuPy v4 and v5 roadmap
 
IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
 
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
 
Automatically Fusing Functions on CuPy
Automatically Fusing Functions on CuPyAutomatically Fusing Functions on CuPy
Automatically Fusing Functions on CuPy
 
Deep Learning with PyTorch
Deep Learning with PyTorchDeep Learning with PyTorch
Deep Learning with PyTorch
 
Introduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep LearningIntroduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep Learning
 
CUDA and Caffe for deep learning
CUDA and Caffe for deep learningCUDA and Caffe for deep learning
CUDA and Caffe for deep learning
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
 
PyTorch Tutorial for NTU Machine Learing Course 2017
PyTorch Tutorial for NTU Machine Learing Course 2017PyTorch Tutorial for NTU Machine Learing Course 2017
PyTorch Tutorial for NTU Machine Learing Course 2017
 
PyTorch crash course
PyTorch crash coursePyTorch crash course
PyTorch crash course
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
 
[Update] PyTorch Tutorial for NTU Machine Learing Course 2017
[Update] PyTorch Tutorial for NTU Machine Learing Course 2017[Update] PyTorch Tutorial for NTU Machine Learing Course 2017
[Update] PyTorch Tutorial for NTU Machine Learing Course 2017
 
第11回 配信講義 計算科学技術特論A(2021)
第11回 配信講義 計算科学技術特論A(2021)第11回 配信講義 計算科学技術特論A(2021)
第11回 配信講義 計算科学技術特論A(2021)
 

Viewers also liked

「ChainerCVとOpenCVではじめる物体検出」のための事前準備
「ChainerCVとOpenCVではじめる物体検出」のための事前準備「ChainerCVとOpenCVではじめる物体検出」のための事前準備
「ChainerCVとOpenCVではじめる物体検出」のための事前準備
shinozaki_takashi
 
[Dl輪読会]video pixel networks
[Dl輪読会]video pixel networks[Dl輪読会]video pixel networks
[Dl輪読会]video pixel networks
Deep Learning JP
 
Variational AutoEncoder
Variational AutoEncoderVariational AutoEncoder
Variational AutoEncoder
Kazuki Nitta
 
More modern gpu
More modern gpuMore modern gpu
More modern gpu
Preferred Networks
 
Chainerの使い方と自然言語処理への応用
Chainerの使い方と自然言語処理への応用Chainerの使い方と自然言語処理への応用
Chainerの使い方と自然言語処理への応用
Seiya Tokui
 
A Neural Attention Model for Sentence Summarization [Rush+2015]
A Neural Attention Model for Sentence Summarization [Rush+2015]A Neural Attention Model for Sentence Summarization [Rush+2015]
A Neural Attention Model for Sentence Summarization [Rush+2015]
Yuta Kikuchi
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
Seiya Tokui
 
サルでもわかるディープラーニング入門 (2017年) (In Japanese)
サルでもわかるディープラーニング入門 (2017年) (In Japanese)サルでもわかるディープラーニング入門 (2017年) (In Japanese)
サルでもわかるディープラーニング入門 (2017年) (In Japanese)
Toshihiko Yamakami
 
再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会
再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会
再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会
Shotaro Sano
 
最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情
Yuta Kikuchi
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks
Seiya Tokui
 
生成モデルの Deep Learning
生成モデルの Deep Learning生成モデルの Deep Learning
生成モデルの Deep Learning
Seiya Tokui
 
IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習
Preferred Networks
 
猫でも分かるVariational AutoEncoder
猫でも分かるVariational AutoEncoder猫でも分かるVariational AutoEncoder
猫でも分かるVariational AutoEncoder
Sho Tatsuno
 

Viewers also liked (14)

「ChainerCVとOpenCVではじめる物体検出」のための事前準備
「ChainerCVとOpenCVではじめる物体検出」のための事前準備「ChainerCVとOpenCVではじめる物体検出」のための事前準備
「ChainerCVとOpenCVではじめる物体検出」のための事前準備
 
[Dl輪読会]video pixel networks
[Dl輪読会]video pixel networks[Dl輪読会]video pixel networks
[Dl輪読会]video pixel networks
 
Variational AutoEncoder
Variational AutoEncoderVariational AutoEncoder
Variational AutoEncoder
 
More modern gpu
More modern gpuMore modern gpu
More modern gpu
 
Chainerの使い方と自然言語処理への応用
Chainerの使い方と自然言語処理への応用Chainerの使い方と自然言語処理への応用
Chainerの使い方と自然言語処理への応用
 
A Neural Attention Model for Sentence Summarization [Rush+2015]
A Neural Attention Model for Sentence Summarization [Rush+2015]A Neural Attention Model for Sentence Summarization [Rush+2015]
A Neural Attention Model for Sentence Summarization [Rush+2015]
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
サルでもわかるディープラーニング入門 (2017年) (In Japanese)
サルでもわかるディープラーニング入門 (2017年) (In Japanese)サルでもわかるディープラーニング入門 (2017年) (In Japanese)
サルでもわかるディープラーニング入門 (2017年) (In Japanese)
 
再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会
再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会
再帰型ニューラルネット in 機械学習プロフェッショナルシリーズ輪読会
 
最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks
 
生成モデルの Deep Learning
生成モデルの Deep Learning生成モデルの Deep Learning
生成モデルの Deep Learning
 
IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習
 
猫でも分かるVariational AutoEncoder
猫でも分かるVariational AutoEncoder猫でも分かるVariational AutoEncoder
猫でも分かるVariational AutoEncoder
 

Similar to Chainer v3

Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
ScyllaDB
 
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Jen Aman
 
Large-Scale Training with GPUs at Facebook
Large-Scale Training with GPUs at FacebookLarge-Scale Training with GPUs at Facebook
Large-Scale Training with GPUs at Facebook
Faisal Siddiqi
 
OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard
OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/HardOPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard
OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard
Paul Brebner
 
Intro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupIntro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data Meetup
Gwen (Chen) Shapira
 
Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20
Qiming Teng
 
Mining quasi bicliques using giraph
Mining quasi bicliques using giraphMining quasi bicliques using giraph
Mining quasi bicliques using giraph
Hsiao-Fei Liu
 
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the CloudsGreg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Flink Forward
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
Databricks
 
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLabAdvanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability
Mydbops
 
running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on android
Koan-Sin Tan
 
MAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsxMAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsx
BharathiLakshmiAAssi
 
matrixmultiplicationparallel.ppsx
matrixmultiplicationparallel.ppsxmatrixmultiplicationparallel.ppsx
matrixmultiplicationparallel.ppsx
Bharathi Lakshmi Pon
 
JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011
Kris Mok
 
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
VMware Tanzu
 
Kubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewKubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical View
Lei (Harry) Zhang
 
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Alexey Zinoviev
 
Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)
Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)
Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)
Ontico
 
2013.09.10 Giraph at London Hadoop Users Group
2013.09.10 Giraph at London Hadoop Users Group2013.09.10 Giraph at London Hadoop Users Group
2013.09.10 Giraph at London Hadoop Users Group
Nitay Joffe
 

Similar to Chainer v3 (20)

Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
 
Large-Scale Training with GPUs at Facebook
Large-Scale Training with GPUs at FacebookLarge-Scale Training with GPUs at Facebook
Large-Scale Training with GPUs at Facebook
 
OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard
OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/HardOPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard
OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard
 
Intro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupIntro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data Meetup
 
Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20
 
Mining quasi bicliques using giraph
Mining quasi bicliques using giraphMining quasi bicliques using giraph
Mining quasi bicliques using giraph
 
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the CloudsGreg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLabAdvanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability
 
running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on android
 
MAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsxMAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsx
 
matrixmultiplicationparallel.ppsx
matrixmultiplicationparallel.ppsxmatrixmultiplicationparallel.ppsx
matrixmultiplicationparallel.ppsx
 
JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011
 
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
 
Kubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewKubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical View
 
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
Thorny path to the Large-Scale Graph Processing (Highload++, 2014)
 
Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)
Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)
Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)
 
2013.09.10 Giraph at London Hadoop Users Group
2013.09.10 Giraph at London Hadoop Users Group2013.09.10 Giraph at London Hadoop Users Group
2013.09.10 Giraph at London Hadoop Users Group
 

More from Seiya Tokui

Chainer/CuPy v5 and Future (Japanese)
Chainer/CuPy v5 and Future (Japanese)Chainer/CuPy v5 and Future (Japanese)
Chainer/CuPy v5 and Future (Japanese)
Seiya Tokui
 
Learning stochastic neural networks with Chainer
Learning stochastic neural networks with ChainerLearning stochastic neural networks with Chainer
Learning stochastic neural networks with Chainer
Seiya Tokui
 
深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開
Seiya Tokui
 
Differences of Deep Learning Frameworks
Differences of Deep Learning FrameworksDifferences of Deep Learning Frameworks
Differences of Deep Learning Frameworks
Seiya Tokui
 
Chainer Development Plan 2015/12
Chainer Development Plan 2015/12Chainer Development Plan 2015/12
Chainer Development Plan 2015/12
Seiya Tokui
 
Towards Chainer v1.5
Towards Chainer v1.5Towards Chainer v1.5
Towards Chainer v1.5
Seiya Tokui
 
Deep Learningの基礎と応用
Deep Learningの基礎と応用Deep Learningの基礎と応用
Deep Learningの基礎と応用
Seiya Tokui
 
論文紹介 Compressing Neural Networks with the Hashing Trick
論文紹介 Compressing Neural Networks with the Hashing Trick論文紹介 Compressing Neural Networks with the Hashing Trick
論文紹介 Compressing Neural Networks with the Hashing Trick
Seiya Tokui
 
深層学習フレームワークChainerの紹介とFPGAへの期待
深層学習フレームワークChainerの紹介とFPGAへの期待深層学習フレームワークChainerの紹介とFPGAへの期待
深層学習フレームワークChainerの紹介とFPGAへの期待
Seiya Tokui
 
論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models
Seiya Tokui
 
Deep learning実装の基礎と実践
Deep learning実装の基礎と実践Deep learning実装の基礎と実践
Deep learning実装の基礎と実践
Seiya Tokui
 
Deep Learning技術の今
Deep Learning技術の今Deep Learning技術の今
Deep Learning技術の今
Seiya Tokui
 
NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model
NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding ModelNIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model
NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model
Seiya Tokui
 
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM PredictionICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
Seiya Tokui
 
Deep Learningの技術と未来
Deep Learningの技術と未来Deep Learningの技術と未来
Deep Learningの技術と未来
Seiya Tokui
 
Tprimal agh
Tprimal aghTprimal agh
Tprimal agh
Seiya Tokui
 
rinko2011-agh
rinko2011-aghrinko2011-agh
rinko2011-agh
Seiya Tokui
 
rinko2010
rinko2010rinko2010
rinko2010
Seiya Tokui
 

More from Seiya Tokui (19)

Chainer/CuPy v5 and Future (Japanese)
Chainer/CuPy v5 and Future (Japanese)Chainer/CuPy v5 and Future (Japanese)
Chainer/CuPy v5 and Future (Japanese)
 
Learning stochastic neural networks with Chainer
Learning stochastic neural networks with ChainerLearning stochastic neural networks with Chainer
Learning stochastic neural networks with Chainer
 
深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開
 
Differences of Deep Learning Frameworks
Differences of Deep Learning FrameworksDifferences of Deep Learning Frameworks
Differences of Deep Learning Frameworks
 
Chainer Development Plan 2015/12
Chainer Development Plan 2015/12Chainer Development Plan 2015/12
Chainer Development Plan 2015/12
 
Towards Chainer v1.5
Towards Chainer v1.5Towards Chainer v1.5
Towards Chainer v1.5
 
Deep Learningの基礎と応用
Deep Learningの基礎と応用Deep Learningの基礎と応用
Deep Learningの基礎と応用
 
論文紹介 Compressing Neural Networks with the Hashing Trick
論文紹介 Compressing Neural Networks with the Hashing Trick論文紹介 Compressing Neural Networks with the Hashing Trick
論文紹介 Compressing Neural Networks with the Hashing Trick
 
深層学習フレームワークChainerの紹介とFPGAへの期待
深層学習フレームワークChainerの紹介とFPGAへの期待深層学習フレームワークChainerの紹介とFPGAへの期待
深層学習フレームワークChainerの紹介とFPGAへの期待
 
論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models論文紹介 Semi-supervised Learning with Deep Generative Models
論文紹介 Semi-supervised Learning with Deep Generative Models
 
Deep learning実装の基礎と実践
Deep learning実装の基礎と実践Deep learning実装の基礎と実践
Deep learning実装の基礎と実践
 
Deep Learning技術の今
Deep Learning技術の今Deep Learning技術の今
Deep Learning技術の今
 
NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model
NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding ModelNIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model
NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model
 
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM PredictionICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
 
Deep Learningの技術と未来
Deep Learningの技術と未来Deep Learningの技術と未来
Deep Learningの技術と未来
 
Tprimal agh
Tprimal aghTprimal agh
Tprimal agh
 
rinko2011-agh
rinko2011-aghrinko2011-agh
rinko2011-agh
 
rinko2010
rinko2010rinko2010
rinko2010
 
Ml4nlp 4 2
Ml4nlp 4 2Ml4nlp 4 2
Ml4nlp 4 2
 

Recently uploaded

June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Programming Foundation Models with DSPy - Meetup Slides

Chainer v3

  • 1. Chainer v3
      • Chainer Meetup #06 @ PFN, Sep. 30, 2017
      • Seiya Tokui @ Preferred Networks
  • 2. Recent/coming releases
      • Chainer v3.0.0 RC, v2.1.0: Sep. 12
      • v3 RC was the 50th release!
      • CuPy v2.0.0 RC, v1.0.3 on the same day
      • Next release: Chainer v3.0.0 and v4.0.0α on Oct. 17
      • CuPy v2.0.0 and v3.0.0α on the same day
      • Today I will mainly talk about the features of CuPy v2.0.0 RC and Chainer v3.0.0 RC
  • 3. Chainer v3.0.0rc1
      • For most users, backward compatibility is maintained
      • See the release notes of v3.0.0rc1 for some small breaking changes that do not affect most users
      • The internals have changed substantially
      • This may break existing code that directly touches the computational graph
      • Thanks to this change, we now support double backprop (a.k.a. gradient of gradients) as announced
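  • 4. Double backprop
      • Automatic backpropagation through gradients
      • When is it needed?
      • Consider a loss function that includes a gradient computation as a term/factor
      • E.g. the loss function for WGAN-GP:
          $\mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}[D(\tilde{x})] - \mathbb{E}_{x \sim \mathbb{P}_r}[D(x)] + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$
      • To take the gradient of this loss function, we need to do backprop through the gradient $\nabla_{\hat{x}} D(\hat{x})$, which itself we want to compute with backprop!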
  • 5. Double backprop in Chainer v3
      • Many functions now support double backprop
      • Those functions are rewritten to implement a new interface named FunctionNode (such functions are called new-style Functions)
      • backward() takes Variable instead of ndarray as grad_outputs and return values, which means backward() itself can be differentiated
      • Variable now has an attribute grad_var, which represents the gradient as a Variable (so that we can use it in the computational graph)
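The point of the slide above, that a backward pass built from graph operations can itself be differentiated, can be sketched in a few lines of pure Python. This is not Chainer's implementation: the `Var` class, `as_var()` and `grad()` below are hypothetical names invented for the illustration, and only scalar `+` and `*` are supported. The key assumption mirrored from the slide is that each backward function receives and returns graph nodes rather than raw numbers.

```python
class Var:
    """A scalar node in a toy computational graph (hypothetical, not chainer.Variable)."""

    def __init__(self, value, parents=(), grad_fns=()):
        self.value = float(value)
        self.parents = parents    # input nodes of the operation that produced this node
        self.grad_fns = grad_fns  # one function per parent: Var gy -> Var gx

    def __add__(self, other):
        other = as_var(other)
        return Var(self.value + other.value, (self, other),
                   (lambda gy: gy, lambda gy: gy))

    __radd__ = __add__

    def __mul__(self, other):
        other = as_var(other)
        # The backward functions use Var arithmetic, so their results stay in the graph.
        return Var(self.value * other.value, (self, other),
                   (lambda gy: gy * other, lambda gy: gy * self))

    __rmul__ = __mul__


def as_var(x):
    return x if isinstance(x, Var) else Var(x)


def grad(y, x):
    """Return dy/dx as a Var that is itself part of a differentiable graph."""
    order, seen = [], set()

    def visit(v):  # depth-first post-order: inputs are appended before outputs
        if id(v) in seen:
            return
        seen.add(id(v))
        for p in v.parents:
            visit(p)
        order.append(v)

    visit(y)
    grads = {id(y): Var(1.0)}
    for v in reversed(order):  # outputs first, so each gradient is complete when used
        gy = grads.get(id(v))
        if gy is None:
            continue
        for p, fn in zip(v.parents, v.grad_fns):
            g = fn(gy)
            grads[id(p)] = grads[id(p)] + g if id(p) in grads else g
    return grads.get(id(x), Var(0.0))


x = Var(3.0)
y = x * x * x      # y = x^3
gx = grad(y, x)    # first derivative 3 * x^2, returned as a Var
ggx = grad(gx, x)  # second derivative 6 * x, obtained by backprop through gx
print(gx.value, ggx.value)
```

Because `gx` is an ordinary node, it can be combined with other nodes into a loss and differentiated again, which is the role `grad_var` plays in the slide.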
  • 6. How to implement WGAN-GP: 1. Using Variable.backward()
        x_tilde = generator(z)
        x_hat = x + u * (x_tilde - x)
        D(x_hat).backward(enable_double_backprop=True)  # 1st diff
        # lam is the penalty weight (lambda is a reserved word in Python)
        gp = lam * (x_hat.grad_var - 1) ** 2  # schematic: the formula takes the L2 norm of the gradient here
        loss = D(x_tilde) - D(x) + gp
        model.cleargrads()  # to clear the 1st diff of params
        loss.backward()  # 2nd diff
  • 7. How to implement WGAN-GP: 2. Using grad()
        x_tilde = generator(z)
        x_hat = x + u * (x_tilde - x)
        gx_hat, = chainer.grad([D(x_hat)], [x_hat], enable_double_backprop=True)  # 1st diff
        # lam is the penalty weight (lambda is a reserved word in Python)
        gp = lam * (gx_hat - 1) ** 2  # schematic: the formula takes the L2 norm of the gradient here
        loss = D(x_tilde) - D(x) + gp
        loss.backward()  # 2nd diff
      • This version is more efficient because grad() can skip the gradient computation for parameters (so we can also drop cleargrads()).
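As a small worked check of why the second differentiation is needed, consider a hypothetical linear critic D(x) = w · x (an assumption made only for this example, not a model from the talk). Its input gradient is w at every x, so the penalty term depends on the parameters solely through that gradient. The sketch below computes the penalty, its analytic gradient with respect to w, and a finite-difference check, using only the standard library.

```python
import math


def gradient_penalty(w, lam=10.0):
    """lam * (||grad_x D(x)||_2 - 1)^2 for the linear critic D(x) = w . x.

    For this critic grad_x D(x) equals w, so the penalty is a function of w only.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return lam * (norm - 1.0) ** 2


def gradient_penalty_grad(w, lam=10.0):
    """Analytic d(penalty)/dw: the quantity a second backward pass has to produce."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [2.0 * lam * (norm - 1.0) * wi / norm for wi in w]


def numerical_grad(f, w, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(w)):
        hi = list(w)
        lo = list(w)
        hi[i] += eps
        lo[i] -= eps
        grads.append((f(hi) - f(lo)) / (2.0 * eps))
    return grads


w = [3.0, 4.0]                        # ||w||_2 = 5
penalty = gradient_penalty(w)         # 10 * (5 - 1)^2
analytic = gradient_penalty_grad(w)   # 2 * 10 * (5 - 1) * w / 5
numeric = numerical_grad(gradient_penalty, w)
print(penalty, analytic, numeric)
```

For a real network the input gradient has no closed form, which is why the framework has to differentiate through its own backward pass.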
  • 8. New-style Function support
      • Most “standard” functions are now ported to the new-style interface: +, -, *, Convolution2D, Deconvolution2D, EmbedID, Linear, LSTM, BatchNormalization, sigmoid, relu, leaky_relu, softmax, log_softmax, tanh, exp, mean_squared_error, softmax_cross_entropy, dropout, layer_normalization, transpose, reshape, broadcast_to, sum, concat, __getitem__, etc.
      • We are still working on widening the double backprop support. Contributions are also welcome!
  • 9. Other features
      • Functions: layer_normalization, selu, arctan2, prod, NumPy-compatible matmul
      • Links: ChildSumTreeLSTM, NaryTreeLSTM, BatchRenormalization
      • Other new features: LeCunNormal, as_variable(), Variable.array, strict option of load_npz(), etc.
  • 10. CuPy v2.0.0rc1
      • Sparse matrix support
      • Complex number support
      • Improved memory allocator
      • Many new functions, esp. linear algebra routines
  • 11. Sparse matrix support
      • cupy.sparse: sparse matrix support with APIs compatible with scipy.sparse
      • CSR/CSC/COO and diagonal formats
      • Basic arithmetic, matrix product, element indexing
      • Slicing along the major axis
      • Dense <-> Sparse conversion
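To illustrate what the CSR format stores and why slicing along the major axis (rows, for CSR) is cheap, here is a pure-Python sketch of the layout. It is not cupy.sparse code; the helper functions are hypothetical, and only the three array names `data`, `indices` and `indptr` follow the attribute names used by scipy.sparse CSR matrices.

```python
def dense_to_csr(dense):
    """Convert a list-of-rows matrix into the three CSR arrays."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # nonzero values, row by row
                indices.append(col)   # column index of each stored value
        indptr.append(len(data))      # where the next row starts in data/indices
    return data, indices, indptr


def csr_matvec(data, indices, indptr, x):
    """Matrix-vector product that touches only the stored nonzeros."""
    y = []
    for r in range(len(indptr) - 1):
        start, end = indptr[r], indptr[r + 1]
        y.append(sum(data[k] * x[indices[k]] for k in range(start, end)))
    return y


def csr_row_slice(data, indices, indptr, start, stop):
    """Rows start..stop-1: one contiguous cut of data/indices plus a shifted indptr."""
    lo, hi = indptr[start], indptr[stop]
    return data[lo:hi], indices[lo:hi], [p - lo for p in indptr[start:stop + 1]]


dense = [[1, 0, 2],
         [0, 0, 3],
         [4, 5, 0]]
data, indices, indptr = dense_to_csr(dense)
product = csr_matvec(data, indices, indptr, [1, 1, 1])
sliced = csr_row_slice(data, indices, indptr, 1, 3)
print(data, indices, indptr)
print(product)
print(sliced)
```

A column slice of the same matrix would have to scan every row's entries, which is one plausible reason the slide limits slicing to the major axis.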
  • 12. Complex number support
      • CuPy now supports complex numbers!
      • Dtypes complex64 and complex128 are now available
      • Routines related to complex numbers: angle, conj, imag, real
  • 13. Linear algebra routines
      • Solvers, matrix inversion, determinant, eigenvalues, etc.: solve, tensorsolve, inv, pinv, det, slogdet, eigh, eigvalsh, matrix_rank
      • All under the cupy.linalg namespace
      • einsum is also supported (thanks, @fukatani!)
      • Flexible tensor product/reduction based on the Einstein convention
  • 14. Improved memory allocator
      • The memory pool is greatly improved
      • It now uses a “best-fit with coalescing” algorithm
      • A memory region is reused even if the size does not exactly match
      • It may also contribute to the speed improvement, thanks to the reduced number of reallocations
      • Example: the new seq2seq example originally used all the memory of a 12 GB GPU; its usage is now reduced to 3 GB, and the execution time is reduced by approx. 25%.
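The “best-fit with coalescing” idea can be sketched with a toy allocator over a single arena of byte offsets. This is an illustration under simplifying assumptions, not CuPy's memory pool: the `ToyPool` class and its methods are hypothetical, and a real pool also deals with alignment, streams and requesting fresh memory from the device.

```python
class ToyPool:
    """Toy best-fit allocator with coalescing (hypothetical, not cupy's pool)."""

    def __init__(self, size):
        self.free_chunks = [(0, size)]  # sorted list of (offset, length)
        self.used = {}                  # offset -> length of live allocations

    def malloc(self, n):
        # Best fit: pick the smallest free chunk that is large enough.
        candidates = [c for c in self.free_chunks if c[1] >= n]
        if not candidates:
            raise MemoryError("no free chunk of %d bytes" % n)
        offset, length = min(candidates, key=lambda c: c[1])
        self.free_chunks.remove((offset, length))
        if length > n:
            # Split: the unused tail stays available for later requests.
            self.free_chunks.append((offset + n, length - n))
            self.free_chunks.sort()
        self.used[offset] = n
        return offset

    def free(self, offset):
        length = self.used.pop(offset)
        self.free_chunks.append((offset, length))
        self.free_chunks.sort()
        # Coalesce: merge neighbouring free chunks into one larger chunk.
        merged = [self.free_chunks[0]]
        for off, ln in self.free_chunks[1:]:
            last_off, last_ln = merged[-1]
            if last_off + last_ln == off:
                merged[-1] = (last_off, last_ln + ln)
            else:
                merged.append((off, ln))
        self.free_chunks = merged


pool = ToyPool(1024)
a = pool.malloc(256)
b = pool.malloc(256)
c = pool.malloc(256)
pool.free(a)
pool.free(b)                         # a and b are adjacent, so they merge
after_free = list(pool.free_chunks)
d = pool.malloc(200)                 # best fit: the 256-byte tail, not the 512-byte chunk
e = pool.malloc(400)                 # served from the coalesced 512-byte chunk
print(after_free, d, e, pool.free_chunks)
```

Without coalescing, the 400-byte request could not reuse the two freed 256-byte chunks, which is the kind of reuse the slide credits for the lower memory usage.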
  • 15. Next versions
      • As you may know, we slightly changed the release policy again; stable releases may now include some new features (thus v2.1.0 instead of v2.0.3).
      • v4 is scheduled based on our release policy: v4.0.0 will be released three months after v3.0.0 (which will be mid-January if there is no delay).
      • The core features of v4 are not determined yet; let’s have discussions!