Successfully reported this slideshow.
Upcoming SlideShare
×

of

12

Share

What's wrong with this Julia?

サンプルコード等
https://github.com/bicycle1885/JuliaTokyo3LT/tree/master

See all

See all

What's wrong with this Julia?

1. 1. What's wrong with this Kenta Sato (佐藤建太) Twitter/GitHub: @bicycle1885 1 / 15
2. 2. A Sad Story Is Julia as fast as C/C++? Where is the download link!? Oh, the syntax looks really neat! Ok, it works. Now let's compare it with my Python script. Huh? not so fast as he said, maybe comparable with Python? Julia? Yes, I tried it last month but blahblah... 2 / 15
3. 3. A Sad Story Is Julia as fast as C/C++? Where is the download link!? Oh, the syntax looks really neat! Ok, it works. Now let's compare it with my Python script. Huh? not so fast as he said, maybe comparable with Python? Julia? Yes, I tried it last month but blahblah... Prob(their code does something wrong | newbies complain about performance) > 0.95 We are going to learn very basic skills to write (nearly) optimal Julia code from three cases. 3 / 15
4. 4. Case 1 Is Julia slower than Python? Really?? s=0 n=50000000 foriin1:n s+=i end println(s) Julia \$timejuliacase1.jl real 0m9.859s ... s=0 n=50000000 foriinxrange(1,n+1): s+=i print(s) Python \$timepythoncase1.py real 0m5.998s ... 4 / 15
5. 5. Case 1 Do Not Write a Heavy Loop at the Top Level! The top-level code is interpreted, not compiled. Top level is not intended to run numeric computation. Should be written in a function/ letblock. s=0 n=50000000 foriin1:n s+=i end println(s) Julia \$timejuliacase1.jl real 0m9.859s ... let s=0 n=50000000 foriin1:n s+=i end println(s) end Julia \$timejuliacase1.fix.jl real 0m0.494s ... 5 / 15
6. 6. Case 1 Do Not Write a Heavy Loop at the Top Level! The top-level code is interpreted, not compiled. Top level is not intended to run numeric computation. Should be written in a function/ letblock. 9.859s / 0.494s = 20x faster 6 / 15
7. 7. Case 2 functiondot(x,y) s=0 foriin1:length(x) s+=x[i]*y[i] end s end n=10000000 x=zeros(n) y=zeros(n) dot(x[1:3],y[1:3]) #compile Julia \$julia-Lcase2.jl-e'@timedot(x,y)' elapsedtime:0.814667538seconds( defdot(x,y): s=0 foriinxrange(len(x)): s+=x[i]*y[i] returns n=10000000 x=[0]*n y=[0]*n Python %timeitdot(x,y) 1loops,bestof3:1.1sperloop 7 / 15
8. 8. Case 2 Make Variable Types Stable! Literal 0is typed as Int. Hence Julia assumes the type of the variable sis Int. The element type of xand yis Float64. In Julia, Int+Float64*Float64is Float64. s+=x[i]*y[i]disturbs the type stability of s. functiondot(x,y) s=0 foriin1:length(x) s+=x[i]*y[i] end s end ... Julia functiondot(x,y) s=zero(eltype(x)) foriin1:length(x) s+=x[i]*y[i] end s end ... Julia 8 / 15
9. 9. Case 2 Type stability of variables dramatically improves the performance. Much less memory allocation and GC time 0.814667538s / 0.017035491s = 48x faster \$julia-Lcase2.fix.jl-e'@timedot(x,y)' elapsedtime:0.017035491seconds(13896bytesallocated) \$julia-Lcase2.jl-e'@timedot(x,y)' elapsedtime:0.814667538seconds(320013880bytesallocated,17.52 9 / 15
10. 10. Case 2 The emitted LLVM code is cleaner when types are stable. Type-unstable code Type-stable code 10 / 15
11. 11. Case 3 ... functionoptimize(x0,n_iter) η=0.1 x=copy(x0) g=zeros(length(x0)) foriin1:n_iter grad!(g,x) x-=η*g end x end Julia ... defoptimize(x0,n_iter): eta=0.1 x=np.copy(x0) g=np.zeros(len(x0)) foriinxrange(n_iter): set_grad(g,x) x-=eta*g returnx Python 11 / 15
12. 12. Case 3 -=for array types is not an in-place operator! x-=η*gworks like: y=η*g: product z=x-y: subtract x=z: bind For each iteration: Create two arrays ( yand z) and Discard two arrays ( xand y) Lots of memories are wasted! 53% of time is consumed in GC! \$julia-Lcase3.jl-e'@timeoptimize(x0,50000)' elapsedtime:0.527045846seconds (812429992bytesallocated,53.60%gctime) 12 / 15
13. 13. Case 3 ... functionoptimize(x0,n_iter) ... foriin1:n_iter grad!(g,x) x-=η*g end x end Julia ... functionoptimize(x0,n_iter) ... foriin1:n_iter grad!(g,x) forjin1:n x[j]-=η*g[j] end end x end Julia 13 / 15
14. 14. Case 3 ... functionoptimize(x0,n_iter) ... foriin1:n_iter grad!(g,x) forjin1:n x[j]-=η*g[j] end end x end Julia ... functionoptimize(x0,n_iter) ... foriin1:n_iter grad!(g,x) BLAS.scal!(n,η,g,1) end x end Julia 14 / 15
15. 15. Take­home Points Julia is definitely fast unless you do it wrong. You should know what code is smooth in Julia. Manual/Performance Tips is a definitive guide to know the smoothness of Julia. Utilize standard tools to analyze bottlenecks: @timemacro @profilemacro --track-allocationoption @code_typed/ @code_lowered/ @code_llvm/ @code_native macros (from higher level to lower) @code_warntype(v0.4) 15 / 15
• mikepham12

Aug. 30, 2018
• SmrutiranjanSahu

Feb. 10, 2017
• AlexToldaiev

Jun. 11, 2016
• takumahr

Jul. 27, 2015
• Mary522

Jul. 27, 2015
• HaoCheng5

Jul. 17, 2015
• ssuser2c7d86

Jun. 26, 2015
• dhisign

Jun. 15, 2015
• HisashiSuga

May. 31, 2015
• abenben

Apr. 29, 2015
• umejam

Apr. 26, 2015
• KanSakamoto

Apr. 25, 2015

サンプルコード等 https://github.com/bicycle1885/JuliaTokyo3LT/tree/master

Total views

6,019

On Slideshare

0

From embeds

0

Number of embeds

2,960

31

Shares

0