2. Objective
● A Scientific library is memory intensive and speed counts. How to use JRuby
effectively to create a great tool/gem?
● A General Purpose GPU library for Ruby that can be used by industry in
production and academia for research.
3. ● Ruby Science Foundation
● SciRuby has been trying to push Ruby for scientific computing.
● Popular Rubygems:
1. NMatrix
2. Daru
3. Mixed_models
4. NMatrix
● NMatrix is SciRuby’s numerical matrix core, implementing dense matrices as
well as two types of sparse (linked-list-based and Yale/CSR).
● It currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for
several of its linear algebra operations.
9. SciRuby vs SciPy
● We love Ruby.
● We love Rails.
● Expressiveness of Ruby.
10. ● Known for performance JRuby is 10 times faster than CRuby.
● With truffle it’s around 40 times faster than CRuby. Truffle is supported by
Oracle.
12. NMatrix for JRuby
● Parallelism=> No Global Interpreter Lock as in case of MRI
● Easy Deployment(Warbler gem)
● Auto Garbage collection.
● Speed
● NMatrix for JRuby relies on Apache Commons Math
13. MDArray
● Not a unified interface for Sciruby gems=> Why not build a wrapper around
MDArray ?
● MDArray is a great gem for Linear Algebra.
● MdArray used Parallel colt that was depreceated.
● However, every gem that used NMatrix as dependency needed to be
reimplemented with MDArray.
26. 2 - dimensional Matrix Operations
● [:dot, :det, :factorize_lu]
● In NMatrix-MRI, BLAS-III and LAPACK routines are implemented using their
respective libraries.
● NMatrix-JRuby depends on Java functions.
27. Challenges
● Converting a 1-D array to 2-D array
● Array Size and Accessing elements
● Speed and Memory Required
29. Ruby Code
index =0
puts Benchmark.measure{
(0...15000).each do |i|
(0...15000).each do |j|
c[i][j] = b[i][j]
index+=1
end
end
}
#67.790000 0.070000 67.860000 ( 65.126546)
#RAM consumed => 5.4GB
b = Java::double[15_000,15_000].new
c = Java::double[15_000,15_000].new
index=0
puts Benchmark.measure{
(0...15000).each do |i|
(0...15000).each do |j|
b[i][j] = index
index+=1
end
end
}
#43.260000 3.250000 46.510000 ( 39.606356)
31. Java Code
public class MatrixGenerator{
public static void test2(){
for (int index=0, i=0; i < row ; i++){
for (int j=0; j < col; j++){
c[i][j]= b[i][j];
index++;
}
}
}
puts Benchmark.measure{MatrixGenerator.test2}
#0.034000 0.001000 00.034000 ( 00.03300)
#RAM consumed => 300MB
public class MatrixGenerator{
public static void test1(){
double[][] b = new double[15000][15000];
double[][] c = new double[15000][15000];
for (int index=0, i=0; i < row ; i++){
for (int j=0; j < col; j++){
b[i][j]= index;
index++;
}
}
}
puts Benchmark.measure{MatrixGenerator.test1}
#0.032000 0.001000 00.032000 ( 00.03100)
42. Benchmark conclusion
● NMatrix-JRuby is incredibly faster for N-dimensional matrices when
elementwise operations are concerned.
● NMatrix-MRI is faster for 2-dimensional matrix when calculating matrix
multiplication, determinant calculation and factorization.
49. A General-Purpose GPU library
● Combine the beauty of Ruby with transparent GPU processing
● This will work both on client computers and on servers that make use of
TESLA's and Intel Xeon Phi solutions.
● Developer activity and support for the current projects is mixed at best, and
they are tough to use as they involve writing kernels and require a lot of effort
to be put in buffer/RAM optimisation.
51. ArrayFire
● ArrayFire is an open-source GPGPU library written in C++ and uses JIT.
● ArrayFire supports CUDA-capable NVIDIA GPUs, OpenCL devices, and a C-
programming backend.
● It abstracts away from the difficult task of writing kernels for multiple
architectures; handling memory management, and performing tuning and
optimisation.
53. MRI
● C extension
● Architecture is inspired by NMatrix and NArray
● The C++ function is placed in a namespace (e.g., namespace af { }) or is
declared static if possible. The C function receives the prefix af_, e.g.,
arf_multiply() (this function also happens to be static).
● C macros are capitalized and generally have the prefix ARF_, as with
ARF_DTYPE().
● C functions (and macros, for consistency) are placed within extern "C" { }
blocks to turn off C++ mangling.
● C macros (in extern blocks) may represent C++ constants (which are always
56. JRuby
● The approach is same as NMatrix JRuby.
● Java Native Interface( JNI )
● Work on ArrayFire-Java.
57. ● Place 'libaf.so' in the Load path.
require 'ext/vendor/ArrayFire.jar'
class Af_Array
attr_accessor :dims, :elements
def matmul(other)
Blas.matmul(self.arr, other)
end
end