- 1. Scientific Computing on JRuby github.com/prasunanand
- 2. Objective ● A scientific library is memory-intensive, and speed counts. How can JRuby be used effectively to create a great tool/gem? ● A general-purpose GPU library for Ruby that can be used by industry in production and by academia for research.
- 3. ● Ruby Science Foundation ● SciRuby has been trying to push Ruby for scientific computing. ● Popular Rubygems: 1. NMatrix 2. Daru 3. Mixed_models
- 4. NMatrix NMatrix is SciRuby’s numerical matrix core, implementing dense matrices as well as two types of sparse (linked-list-based and Yale/CSR). It currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for several of its linear algebra operations.
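
To make the Yale/CSR idea concrete, here is a minimal sketch of compressed sparse row storage in plain Ruby. The helper names `to_csr` and `csr_get` are hypothetical and are not part of the NMatrix API; NMatrix's actual Yale layout may differ in its details.

```ruby
# Convert a dense array-of-rows into three flat arrays:
#   values  - the non-zero entries, row by row
#   col_idx - the column of each stored value
#   row_ptr - where each row starts inside values (length = rows + 1)
def to_csr(rows)
  values = []
  col_idx = []
  row_ptr = [0]
  rows.each do |row|
    row.each_with_index do |v, j|
      next if v == 0
      values << v
      col_idx << j
    end
    row_ptr << values.size
  end
  { values: values, col_idx: col_idx, row_ptr: row_ptr }
end

# Look up element (i, j); anything not stored is an implicit zero.
def csr_get(csr, i, j)
  (csr[:row_ptr][i]...csr[:row_ptr][i + 1]).each do |k|
    return csr[:values][k] if csr[:col_idx][k] == j
  end
  0
end

dense = [[5, 0, 0],
         [0, 0, 8],
         [0, 3, 0]]
csr = to_csr(dense)
puts csr.inspect
puts csr_get(csr, 1, 2)
```

Only the three non-zero entries are stored, which is why sparse formats pay off for matrices that are mostly zeros.
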
- 5. Daru
- 6. Mixed_models
- 7. Nyaplot
- 8. Why nya?
- 9. Contributors wanted ● IRC #sciruby ● Slack-channel #sciruby ● Google-group #sciruby
- 10. Known for performance ● JRuby is 10 times faster than CRuby. ● With Truffle, it’s around 40 times faster than CRuby.
- 11. Say hello
- 12. NMatrix for JRuby ● MDArray does not offer a unified interface for SciRuby gems. ● MDArray is a great gem for linear algebra. ● However, every gem that uses NMatrix as a dependency would need to be reimplemented with MDArray. ● Hence the effort is going into optimizing NMatrix on JRuby instead.
- 13. NMatrix for JRuby ● Parallelism: no Global Interpreter Lock, unlike MRI ● Easy deployment (Warbler gem)
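
As a rough illustration of why the missing interpreter lock matters, the sketch below splits an elementwise operation across plain Ruby threads. The helper `parallel_map` is hypothetical and is not part of NMatrix. On JRuby the threads can run on separate cores; on MRI the same code runs correctly but the lock serialises the work.

```ruby
# Apply a block to every element, with the work divided among threads.
# Results come back in the original order because Thread#value is
# collected chunk by chunk.
def parallel_map(data, n_threads = 4, &block)
  raise ArgumentError, 'block required' unless block
  return [] if data.empty?

  slice_size = (data.size.to_f / n_threads).ceil
  data.each_slice(slice_size)
      .map { |chunk| Thread.new { chunk.map(&block) } }
      .flat_map(&:value)
end

squares = parallel_map((1..10).to_a, 4) { |x| x * x }
puts squares.inspect
```
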
- 14. How NMatrix works ● N-Dimensional ● 2-Dimensional NMatrix
- 15. N-dimensional NMatrix N-dimensional matrices are stored as a one-dimensional Array.
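
A minimal sketch of how N-dimensional coordinates can map onto a one-dimensional array. It assumes row-major ordering, which the slide does not state, and the helper name `flat_index` is hypothetical.

```ruby
# Compute the position of `coords` inside a flat array that stores an
# N-dimensional matrix of the given `shape` in row-major order (assumed).
def flat_index(shape, coords)
  raise ArgumentError, 'rank mismatch' unless shape.size == coords.size

  coords.each_with_index.reduce(0) do |idx, (c, dim)|
    raise IndexError, "coordinate #{c} out of range" unless (0...shape[dim]).cover?(c)
    idx * shape[dim] + c
  end
end

shape = [2, 3, 4]                                    # a 2 x 3 x 4 matrix
storage = Array.new(shape.reduce(:*)) { |i| i.to_f } # 24 elements, flat
puts storage[flat_index(shape, [1, 2, 3])]
```

With this layout the last coordinate varies fastest, so neighbouring elements of the last dimension sit next to each other in memory.
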
- 16. Elementwise Operation ● Iterate through the elements ● Access the array; do the operation, return it ● [:add, :subtract, :sin, :gamma]
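
The three steps above can be sketched over a flat Ruby array as follows. The `elementwise` helper and the two lookup tables are hypothetical names used for illustration only; they are not NMatrix's implementation.

```ruby
# Operation tables keyed by the symbols listed on the slide.
UNARY_OPS = {
  sin:   ->(x) { Math.sin(x) },
  gamma: ->(x) { Math.gamma(x) }
}.freeze

BINARY_OPS = {
  add:      ->(a, b) { a + b },
  subtract: ->(a, b) { a - b }
}.freeze

# Iterate through the elements, apply the operation, return a new array.
def elementwise(op, lhs, rhs = nil)
  if rhs.nil?
    f = UNARY_OPS.fetch(op)
    lhs.map { |x| f.call(x) }
  else
    raise ArgumentError, 'size mismatch' unless lhs.size == rhs.size
    f = BINARY_OPS.fetch(op)
    lhs.zip(rhs).map { |a, b| f.call(a, b) }
  end
end

puts elementwise(:add, [1.0, 2.0], [3.0, 4.0]).inspect
puts elementwise(:sin, [0.0, Math::PI / 2]).inspect
```

Because the storage is one-dimensional, the same loop serves a matrix of any rank.
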
- 17. Determinants and Factorization ● Two-dimensional matrix operations ● In NMatrix-MRI, BLAS level-3 and LAPACK routines are implemented using their respective libraries ● NMatrix-JRuby depends on Java functions.
- 18. Mixed models ● After NMatrix for doubles was ready, I tested it with mixed_models.
- 19. Challenges ● Autoboxing and multiple data types ● Minimising copying of data ● Handling large arrays
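
For readers unfamiliar with what such a routine computes, here is a plain-Ruby sketch of a determinant obtained by Gaussian elimination with partial pivoting. It only illustrates the idea; it is not how NMatrix does it, since NMatrix delegates this work to LAPACK on MRI and to Java functions on JRuby. The name `determinant` is hypothetical.

```ruby
# Determinant of a square matrix given as an array of rows.
# The input is copied, so the caller's data is left untouched.
def determinant(matrix)
  n = matrix.size
  raise ArgumentError, 'matrix must be square' unless matrix.all? { |r| r.size == n }

  a = matrix.map { |row| row.map(&:to_f) }
  det = 1.0
  n.times do |k|
    # Partial pivoting: pick the row with the largest entry in column k.
    pivot = (k...n).max_by { |i| a[i][k].abs }
    return 0.0 if a[pivot][k] == 0.0

    if pivot != k
      a[k], a[pivot] = a[pivot], a[k]
      det = -det # each row swap flips the sign
    end
    det *= a[k][k]

    # Eliminate column k from the rows below.
    ((k + 1)...n).each do |i|
      factor = a[i][k] / a[k][k]
      (k...n).each { |j| a[i][j] -= factor * a[k][j] }
    end
  end
  det
end

puts determinant([[1, 2], [3, 4]])
```
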
- 20. Autoboxing ● :float64 => double only ● Strict dtypes => creating data type in Java: not guessing ● Errors => that can’t be reproduced :P ● [0.11, 0.05, 0.34, 0.14] + [0.21, 0.05, 0.14, 0.14] = [0, 0, 0, 0] ● ([0.11, 0.05, 0.34, 0.14] + 5) + ([0.21, 0.05, 0.14, 0.14] + 5) - 10 = [0.32, 0.1, 0.48, 0.28]
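
One way to picture "strict dtypes, not guessing" is to coerce every element to the declared type up front and reject anything else. The sketch below is a plain-Ruby illustration under that assumption; `coerce_elements` and `SUPPORTED_DTYPES` are hypothetical names, not NMatrix code.

```ruby
# Only :float64 is supported, mirroring the "double only" bullet.
SUPPORTED_DTYPES = [:float64].freeze

# Convert every element to Float explicitly instead of letting the
# runtime decide how to box each value.
def coerce_elements(elements, dtype)
  unless SUPPORTED_DTYPES.include?(dtype)
    raise ArgumentError, "unsupported dtype: #{dtype.inspect}"
  end

  elements.map do |e|
    unless e.is_a?(Integer) || e.is_a?(Float)
      raise TypeError, "cannot store #{e.inspect} as #{dtype}"
    end
    Float(e)
  end
end

puts coerce_elements([1, 2.5, 3], :float64).inspect
```
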
- 21. Minimise copying of data ● Make sure you do not make unnecessary copies of data
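
The difference between an operation that allocates a fresh array and one that reuses the existing storage can be seen in a few lines of plain Ruby. Both helper names are hypothetical.

```ruby
# Allocates and returns a new array; the input stays as it was.
def scale_copy(data, factor)
  data.map { |x| x * factor }
end

# Overwrites the input array and returns the same object.
def scale_in_place!(data, factor)
  data.map! { |x| x * factor }
end

original = [1.0, 2.0, 3.0]
copied = scale_copy(original, 2.0)
puts copied.equal?(original)   # false: a second array now exists

same = scale_in_place!(original, 2.0)
puts same.equal?(original)     # true: no extra storage was allocated
```

For matrices with hundreds of millions of elements, each avoided copy saves a full matrix worth of memory.
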
- 22. Handling large arrays ● Array size ● Accessing elements ● Chaining to Java methods ● Speed and memory required
- 23. Ruby Code

  ```ruby
  require 'java'
  require 'benchmark'

  # Allocate two 15,000 x 15,000 Java double arrays from JRuby.
  b = Java::double[15_000, 15_000].new
  c = Java::double[15_000, 15_000].new

  # Fill b element by element from Ruby.
  index = 0
  puts Benchmark.measure {
    (0...15000).each do |i|
      (0...15000).each do |j|
        b[i][j] = index
        index += 1
      end
    end
  }
  # 43.260000 3.250000 46.510000 ( 39.606356)

  # Copy b into c element by element from Ruby.
  index = 0
  puts Benchmark.measure {
    (0...15000).each do |i|
      (0...15000).each do |j|
        c[i][j] = b[i][j]
        index += 1
      end
    end
  }
  # 67.790000 0.070000 67.860000 ( 65.126546)
  # RAM consumed => 5.4GB
  ```
- 24. Java Code

  ```java
  public class MatrixGenerator {
      static final int ROW = 15000;
      static final int COL = 15000;
      // Held as static fields so that test2 can copy what test1 filled.
      static double[][] b;
      static double[][] c;

      // Allocate both matrices and fill b with consecutive values.
      public static void test1() {
          b = new double[ROW][COL];
          c = new double[ROW][COL];
          for (int index = 0, i = 0; i < ROW; i++) {
              for (int j = 0; j < COL; j++) {
                  b[i][j] = index;
                  index++;
              }
          }
      }

      // Copy b into c element by element.
      public static void test2() {
          for (int i = 0; i < ROW; i++) {
              for (int j = 0; j < COL; j++) {
                  c[i][j] = b[i][j];
              }
          }
      }
  }
  ```

  Called from JRuby:

  ```ruby
  puts Benchmark.measure { MatrixGenerator.test1 }
  # 0.032000 0.001000 0.032000 ( 0.03100)
  puts Benchmark.measure { MatrixGenerator.test2 }
  # 0.034000 0.001000 0.034000 ( 0.03300)
  # RAM consumed => 300MB
  ```
- 25. Results Moving the loops from Ruby into Java improves: ● Speed: 1000 times faster ● Memory: 10 times less
- 26. Benchmarking NMatrix functionalities
- 27. System Specifications ● CPU: AMD FX8350 Octa-core 4.2GHz ● RAM: 16GB
- 28. Addition
- 29. Subtraction
- 30. Gamma
- 31. Matrix Multiplication
- 32. Determinant
- 33. Factorization
- 34. Benchmark conclusion ● NMatrix-JRuby is much faster for elementwise operations on N-dimensional matrices. ● NMatrix-MRI is faster for 2-dimensional matrices when computing matrix multiplication, determinants and factorization.
- 35. Improvements ● Make NMatrix-JRuby faster than NMatrix-MRI using BLAS level-3 and LAPACK routines. ● How? ● Why not JBlas?
- 36. Future Work ● Add support for complex dtype. ● Convert NMatrix-JRuby Enumerators to Java code. ● Add sparse support.
- 37. Am I done?
- 38. Nope!
- 39. Enter GPU
- 40. A General-Purpose GPU library ● Combine the beauty of Ruby with transparent GPU processing ● This will work both on client computers and on servers that use Tesla GPUs and Intel Xeon Phi solutions. ● Developer activity and support for the current projects are mixed at best, and the projects are tough to use because they involve writing kernels and demand a lot of effort on buffer/RAM optimisation.
- 41. ArrayFire-rb ● Wraps ArrayFire library
- 42. Using ArrayFire
- 43. MRI ● C extension ● Architecture is inspired by NMatrix and NArray ● The C++ function is placed in a namespace (e.g., namespace af { }) or is declared static if possible. The C function receives the prefix af_, e.g., af_multiply() (this function also happens to be static). ● C macros are capitalized and generally have the prefix AF_, as with AF_DTYPE(). ● C functions (and macros, for consistency) are placed within extern "C" { } blocks to turn off C++ mangling.
- 44. JRuby ● The approach is the same as for NMatrix-JRuby. ● Java Native Interface (JNI) ● Work on ArrayFire-Java
- 45. Benchmarking ArrayFire
- 46. System Specification ● CPU: AMD FX Octa-core 4.2GHz ● RAM: 16GB ● GPU: Nvidia GTX 750Ti ● GPU RAM: 4GB GDDR5
- 47. Matrix Addition
- 48. Matrix Multiplication
- 49. Matrix Determinant
- 50. Factorization
- 51. Transparency ● Integrate with NArray ● Integrate with NMatrix ● Integrate with Rails
- 52. Applications ● Endless possibilities ;) ● Bioinformatics ● Integrate Tensorflow ● Image Processing ● Computational Fluid Dynamics
- 53. Conclusion
- 54. Useful Links ● https://github.com/sciruby/nmatrix ● https://github.com/arrayfire/arrayfire-rb ● https://github.com/prasunanand/arrayfire-rb/tree/temp
- 55. Acknowledgements 1. Pjotr Prins 2. Charles Nutter 3. John Woods 4. Alexej Gossmann 5. Sameer Deshmukh 6. Pradeep Garigipati
- 56. Thank You Github: prasunanand Twitter: @prasun_anand Blog: prasunanand.com

