Introduction to Julia
杜岳華, founder of Julia Taiwan
Why Julia?
In scientific computing and data science…
Other users
Avoid the two-language problem
Rapid development vs. performance
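The performance of itertools
• An article describes the trade-off between the two:
• "Generally speaking, we do not optimize all of our code, because optimization has a steep price: generality and readability. Running fast and being quick to write are usually a trade-off. The example here is easy to picture: just compare R code with Rcpp code."
http://wush.ghost.io/itertools-performance/
With Julia, you don't have to make that trade-off!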
Features of Julia
• Write like Python, run like C.
• The readability of Python
• The performance of C
• Easy to parallelize
• Built-in package manager
• ……
Julia code
a = [1, 2, 3, 4, 5]

function square(x)
    return x^2
end

for x in a
    println(square(x))
end
https://julialang.org/benchmarks/
Julia performance
Who uses Julia?
• Thomas Sargent, Nobel laureate in economic sciences
• Founder of QuantEcon
• "His team at NYU uses Julia for macroeconomic modeling and contributes to the Julia ecosystem."
https://juliacomputing.com/case-studies/thomas-sargent.html
• In 2015, economists at the Federal Reserve Bank of New York (FRBNY) published FRBNY's most comprehensive and complex macroeconomic models, known as Dynamic Stochastic General Equilibrium, or DSGE models, in Julia.
https://juliacomputing.com/case-studies/ny-fed.html
• UK cancer researchers turned to Julia to run simulations of tumor growth (Nature Genetics, 2016).
• Approximate Bayesian Computation (ABC) algorithms require potentially millions of simulations, so each one must be fast
• BioJulia project for analyzing biological data in Julia
• Bayesian MCMC methods: Lora.jl and Mamba.jl
https://juliacomputing.com/case-studies/nature.html
• IBM and Julia Computing analyzed eye fundus images provided by Drishti Eye Hospitals.
• Timely screening for changes in the retina can help patients get treatment and prevent vision loss. Julia Computing's work using deep learning makes retinal screening an activity that can be performed by a trained technician using a low-cost fundus camera.
https://juliacomputing.com/case-studies/ibm.html
• Path BioAnalytics is a computational biotech company developing novel precision medicine assays to support drug discovery and development, and treatment of disease.
https://juliacomputing.com/case-studies/pathbio.html
• The Sloan Digital Sky Survey contains nearly 5 million telescopic images of 12 megabytes each, a dataset of 55 terabytes.
• In order to analyze this massive dataset, researchers at UC Berkeley and Lawrence Berkeley National Laboratory created a new code named Celeste.
https://juliacomputing.com/case-studies/intel-astro.html
http://pkg.julialang.org/pulse.html
Julia Package Ecosystem Pulse
Julia CI/CD
IDE
Juno
Supported IDEs
Vim, Emacs, VS Code, Sublime Text
Introduction to Julia
It all starts with numbers…
• Numbers in Julia come in the following forms:
• Integers
• Floating-point numbers
• Rational numbers
• Complex numbers
Julia's integers and floating-point numbers come in several bit widths
Integer: Int8, Int16, Int32, Int64, Int128
Unsigned: UInt8, UInt16, UInt32, UInt64, UInt128
Float: Float16, Float32, Float64
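Rational numbers
• Written with the // operator
• Automatically reduced to lowest terms
• The sign is moved to the numerator automatically
• A zero denominator is accepted
2//3 # 2//3
-6//12 # -1//2
5//-20 # -1//4
5//0 # 1//0
numerator(2//10) # 1
denominator(7//14) # 2
2//4 + 1//7 # 9//14
3//10 * 6//9 # 1//5
10//15 == 8//12 # true
float(3//4) # 0.75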
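The bullet points above can be tried out directly. The following is a small sketch, assuming Julia 1.x (where the accessors are named numerator and denominator); it contrasts exact rational arithmetic with floating-point rounding.

```julia
# Rationals are stored in lowest terms and stay exact.
r = 2//4 + 1//7
println(r)                  # 9//14
println(numerator(r))       # 9
println(denominator(r))     # 14

# Floating-point addition rounds; rational addition does not.
println(0.1 + 0.2 == 0.3)            # false
println(1//10 + 2//10 == 3//10)      # true
```

Rationals are a reasonable choice when exactness matters more than speed; convert with float(r) once an approximate value is acceptable.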
Complex numbers
1 + 2im
(1 + 2im) + (3 - 4im) # 4 - 2im
(1 + 2im)*(3 - 4im) # 11 + 2im
(-4 + 3im)^(2 + 1im) # 1.950 + 0.651im
real(1 + 2im) # 1
imag(3 + 4im) # 4
conj(1 + 2im) # 1 - 2im
abs(3 + 4im) # 5.0
angle(3 + 3im)/pi*180 # 45.0
Let's declare some variables!
• With or without a type annotation
x = 5
y = 4::Int64   # asserts that the value 4 is an Int64
z = x + y
println(z) # 9
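Variables are easygoing
• A trait of dynamically typed languages
• Values are immutable
x = 5
println(x) # 5
println(typeof(x)) # Int64
x = 6.0
println(x) # 6.0
println(typeof(x)) # Float64
[Figure: the name x, first bound to the value 5, is rebound to the value 6.0]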
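Static typing vs. dynamic typing
• The biggest difference between static and dynamic typing is whether the type belongs to the variable or to the value.
[Figure: the variable x and the value 5, drawn once for each typing discipline]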
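However you play, operators are the most fun
• +x: x itself
• -x: negation
• x + y, x - y, x * y, x / y: the four basic arithmetic operations
• div(x, y): quotient
• x % y: remainder, also available as rem(x, y)
• x \ y: inverse division, equivalent to y / x
• x ^ y: exponentiation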
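The machinery behind numbers
• ~x: bitwise not
• x & y: bitwise and
• x | y: bitwise or
• x ⊻ y, xor(x, y): bitwise xor (written x $ y before it was deprecated in Julia 0.6)
• x >>> y: logical shift, moves the bits of x right by y places ignoring the sign
• x >> y: arithmetic shift, moves the bits of x right by y places preserving the sign
• x << y: moves the bits of x left by y places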
https://www.technologyuk.net/mathematics/number-systems/images/binary_number.gif
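To make the difference between the two right shifts concrete, here is a short sketch assuming Julia 1.x, where xor is spelled xor(x, y) or x ⊻ y. The 8-bit values are chosen only to keep the bit patterns easy to read.

```julia
x = 0b1100   # UInt8, decimal 12
y = 0b1010   # UInt8, decimal 10

println(Int(x & y))        # 8   (0b1000)
println(Int(x | y))        # 14  (0b1110)
println(Int(xor(x, y)))    # 6   (0b0110), same as x ⊻ y
println(Int(~x))           # 243 (all eight bits flipped)

n = Int8(-8)               # bit pattern 11111000
println(n >> 1)            # -4:  arithmetic shift copies the sign bit
println(n >>> 1)           # 124: logical shift fills with zero
println(n << 1)            # -16
```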
Convenient updating operators
• +=, -=, *=, /=, \=, %=, ^=, &=, |=, ⊻= (formerly $=), >>>=, >>=, <<=
x += 5
is equivalent to
x = x + 5
The great comparison
• x == y: equal
• x != y, x ≠ y: not equal
• x < y: less than
• x > y: greater than
• x <= y, x ≤ y: less than or equal
• x >= y, x ≥ y: greater than or equal
a, b, c = (1, 3, 5)
a < b < c # true
Operations and conversions between types
• Arithmetic operations promote their arguments automatically
• Strongly typed
3.14 * 4 # 12.56
parse(Int, "5") # 5
string(5) # "5"
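A brief sketch of the two bullet points, assuming Julia 1.x: numeric arguments are promoted to a common type automatically, while mixing a number with a string is rejected unless the conversion is written out.

```julia
println(typeof(3.14 * 4))            # Float64: the Int is promoted
println(convert(Float64, 5))         # 5.0
println(parse(Int, "5") + 1)         # 6
println(parse(Float64, "3.14"))      # 3.14
println(string(5) * "!")             # 5!

# Strong typing: there is no implicit number/string conversion.
try
    5 + "5"
catch err
    println(typeof(err))             # MethodError
end

# tryparse returns nothing instead of throwing on bad input.
println(tryparse(Int, "abc") === nothing)   # true
```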
Strong typing vs. weak typing
[Figure: 5 + "5" shown under strong typing and under weak typing, where the operand is converted implicitly]
This is getting a bit dry
• Let's write a small game
Let's write rock-paper-scissors
paper = 1 # paper
scissor = 2 # scissors
stone = 3 # stone
Deciding who wins
• if statements
• Short-circuit logic
if scissor > paper
    println("scissor wins!!")
end

if <condition>
    <statements>
end

if 3 > 5 && 10 > 0
    …
end
User input
println("Enter your move")
println("1 for paper, 2 for scissors, 3 for stone:")
s = readline()       # reads one line from standard input
x = parse(Int, s)
Putting it together
if x == paper
    println("You play paper")
elseif x == scissor
    println("You play scissors")
elseif x == stone
    println("You play stone")
end

if <condition 1>
    <statements 1>
elseif <condition 2>
    <statements 2>
else
    <statements 3>
end
How the computer picks a move
• rand(): a random number between 0 and 1
• rand([]): picks one element from the collection
y = rand([1, 2, 3])
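The calls above can be checked with a short sketch, assuming Julia 1.x. The fixed seed is an addition for repeatable runs only; a real game would leave it out.

```julia
using Random

Random.seed!(2018)           # hypothetical seed, only to make runs repeatable

u = rand()                   # Float64 drawn from [0, 1)
y = rand(1:3)                # one of 1, 2, 3; a range works like a collection
hand = rand(["paper", "scissors", "stone"])

println(0 <= u < 1)          # true
println(y in 1:3)            # true
println(hand)
```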
Nested comparisons
if x == y
    println("Draw")
elseif x == paper
    println("You play paper")
    if y == scissor
        println("Computer plays scissors")
        println("Computer wins")
    elseif y == stone
        println("Computer plays stone")
        println("You win")
    end
...
My spaghetti code
if x == y
    println("Draw")
elseif x == paper
    println("You play paper")
    if y == scissor
        println("Computer plays scissors")
        println("Computer wins")
    elseif y == stone
        println("Computer plays stone")
        println("You win")
    end
elseif x == scissor
    println("You play scissors")
    if y == paper
        println("Computer plays paper")
        println("You win")
    elseif y == stone
        println("Computer plays stone")
        println("Computer wins")
    end
elseif x == stone
    println("You play stone")
    if y == scissor
        println("Computer plays scissors")
        println("You win")
    elseif y == paper
        println("Computer plays paper")
        println("Computer wins")
    end
end
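I see repetition
• Functions are a great tool for eliminating repetition!
• The many conditionals we wrote earlier are highly repetitive, which feels clumsy. We can factor the logic that names each move out on its own.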
Functions to the rescue
function add(a, b)
    c = a + b
    return c
end
How functions communicate
• pass-by-sharing
[Figure: the caller's variable x and the parameter a of function foo(a) refer to the same value 5]
Removing the repetition
function shape(x)
    if x == paper
        return "paper"
    elseif x == scissor
        return "scissors"
    elseif x == stone
        return "stone"
    end
end
How do we decide who wins?
• The repetition is gone
• But deciding the winner is still not handled
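What you need is a matrix
• Suddenly a voice from above spoke and rescued this mere mortal. XD
• Yes, perhaps what you need is a lookup table. Rows are your move, columns are the computer's move; 1 means you win, -1 means the computer wins, 0 is a draw.
         | paper  scissors  stone
---------+------------------------
paper    |   0       -1       1
scissors |   1        0      -1
stone    |  -1        1       0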
Introducing Array
• Homogeneous
• Indexing starts from 1
• Mutable
A = [2, 3, 5]
A[2] # 3
Multidimensional arrays
A = [ 0 -1  1;
      1  0 -1;
     -1  1  0]
A[1, 2] # -1
Simple string operations
• Concatenate with *
• x must be a string
"You play " * x
Simplification complete
• This is called refactoring
x_shape = shape(x)
y_shape = shape(y)
println("You play " * x_shape)
println("Computer plays " * y_shape)
win_or_lose = A[x, y]
if win_or_lose == 0
    println("Draw")
elseif win_or_lose == 1
    println("You win")
else
    println("Computer wins")
end
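The refactored pieces can be collected into one self-contained sketch. The names NAMES, OUTCOME, judge and play_round are made up for this illustration and do not appear in the slides; the outcome matrix is the lookup table from the earlier slide, with rows for the player and columns for the computer.

```julia
const PAPER = 1
const SCISSORS = 2
const STONE = 3
const NAMES = ["paper", "scissors", "stone"]

# OUTCOME[x, y]: 1 = player wins, 0 = draw, -1 = computer wins
const OUTCOME = [ 0 -1  1;
                  1  0 -1;
                 -1  1  0]

# Translate the table entry into a message.
function judge(x, y)
    result = OUTCOME[x, y]
    if result == 0
        return "draw"
    elseif result == 1
        return "you win"
    else
        return "computer wins"
    end
end

# Play one round; the computer's move is random unless given.
function play_round(x, y = rand(1:3))
    println("You play ", NAMES[x], ", computer plays ", NAMES[y], ": ", judge(x, y))
    return OUTCOME[x, y]
end

play_round(PAPER, STONE)      # You play paper, computer plays stone: you win
```

Keeping the rules in a table means that changing the game only requires editing the matrix, not the control flow.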
I want to play many times
while <condition>
    <statements>
end

x = …
while <continue condition>
    ...
    x = …
end
Stop condition
s = readline()
x = parse(Int, s)
while x != -1
    ...
    s = readline()
    x = parse(Int, s)
end
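The read-then-loop pattern above can be exercised without a keyboard. This sketch assumes Julia 1.x and uses an IOBuffer as a stand-in for standard input; read_moves is a hypothetical helper name.

```julia
# Collect moves until the sentinel value -1 is read.
function read_moves(io::IO)
    moves = Int[]
    x = parse(Int, readline(io))
    while x != -1
        push!(moves, x)
        x = parse(Int, readline(io))   # read the next value before re-testing
    end
    return moves
end

input = IOBuffer("1\n3\n2\n-1\n")      # simulated user input
println(read_moves(input))             # [1, 3, 2]
```

Passing stdin instead of the buffer would read from the terminal.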
Other commonly used Julia syntax
• For loop
• Comprehension
• Collections
For loop
for i = 1:5 # a for loop runs a fixed number of times
    println(i)
end
Arrays with for loops
strings = ["foo","bar","baz"]
for s in strings
    println(s)
end
Numerical computing
• Functions for constructing arrays
zeros(Float64, 2, 2) # 2-by-2 matrix of zeros
ones(Float64, 3, 3) # 3-by-3 matrix of ones
trues(2, 2) # 2-by-2 matrix of true
eye(3) # 3-by-3 identity matrix
rand(2, 2) # 2-by-2 matrix of random numbers
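For readers on Julia 1.x, where eye is no longer available, the same constructors can be sketched as follows. Building the identity matrix from LinearAlgebra's I is one common replacement, offered here as an assumption about what the reader wants rather than as the slide author's choice.

```julia
using LinearAlgebra

Z = zeros(Float64, 2, 2)          # 2-by-2 matrix of zeros
O = ones(Float64, 3, 3)           # 3-by-3 matrix of ones
T = trues(2, 2)                   # 2-by-2 BitMatrix of true
E = Matrix{Float64}(I, 3, 3)      # 3-by-3 identity matrix
R = rand(2, 2)                    # 2-by-2 matrix of uniform random numbers

println(size(E))                  # (3, 3)
println(E[2, 2])                  # 1.0
println(E[1, 2])                  # 0.0
println(sum(O))                   # 9.0
println(all(T))                   # true
```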
Comprehension
[x for x = 1:3]                         # [1, 2, 3]
[x for x = 1:20 if x % 2 == 0]          # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
["$x * $y = $(x*y)" for x=1:9, y=1:9]   # 9-by-9 matrix whose first row is "1 * 1 = 1", "1 * 2 = 2", "1 * 3 = 3" ...
Tuple
• Immutable
tup = (1, 2, 3)
tup[1] # 1
tup[1:2] # (1, 2)
(a, b, c) = (1, 2, 3)
Set
• Mutable
filled = Set([1, 2, 2, 3, 4])
other = Set([3, 4, 5, 6])
push!(filled, 5)
intersect(filled, other)
union(filled, other)
setdiff(Set([1, 2, 3, 4]), Set([2, 3, 5]))
Set([i for i=1:10])
Dict
• Mutable
filled = Dict("one"=> 1, "two"=> 2, "three"=> 3)
keys(filled)
values(filled)
Dict(x=> i for (i, x) in enumerate(["one", "two", "three", "four"]))
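The Set and Dict operations listed above can be seen with concrete values. In this sketch (Julia 1.x assumed) the set named other and the dictionary named index are made up for illustration; results are sorted before printing because sets have no guaranteed order.

```julia
filled = Set([1, 2, 2, 3, 4])        # duplicates collapse
other = Set([3, 4, 5, 6])

println(length(filled))                              # 4
println(sort(collect(intersect(filled, other))))     # [3, 4]
println(sort(collect(setdiff(filled, other))))       # [1, 2]
println(length(union(filled, other)))                # 6

# Build a word => position lookup with a generator.
index = Dict(x => i for (i, x) in enumerate(["one", "two", "three"]))
println(index["two"])                # 2
println(haskey(index, "three"))      # true
println(get(index, "ten", 0))        # 0, the default for a missing key
```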
Julia special features
Unicode (UTF-8) symbols are supported
• Type `\alpha<tab>` to get α
α = 1 # used as a variable name
μ = 0
σ = 1
normal = Normal(μ, σ)
Easy to optimize
• Code can stay general and flexible and still be optimized.
• Hints:
• Avoid global variables
• Add type declarations
• Measure performance with @time and pay attention to memory allocation
• ……
Easy to profile
• Use @time
• ProfileView.view()
Easy to parallelize
for i = 1:100000
    do_something()
end

@parallel for i = 1:100000
    do_something()
end
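In Julia 0.7 and later the @parallel macro shown above was renamed @distributed and moved to the Distributed standard library. A minimal sketch under that assumption, using a (+) reduction so the loop returns a value; it also runs on a single process when no workers have been added.

```julia
using Distributed

# Each iteration contributes 1 for odd i and 0 for even i;
# the partial results are combined with +.
total = @distributed (+) for i = 1:100_000
    i % 2
end

println(total)      # 50000
```

Without a reducer, @distributed returns immediately and the loop runs asynchronously, so wrap it in @sync when the results are needed right away.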
Package manager
julia> Pkg.update()
julia> Pkg.add("Foo")
julia> Pkg.rm("Foo")
@code_native
function add(a, b)
    return a+b
end

julia> @code_native add(1, 2)
        .text
Filename: REPL[2]
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 2
        leaq    (%rcx,%rdx), %rax
        popq    %rbp
        retq
        nopw    (%rax,%rax)
@code_llvm
function add(a, b)
    return a+b
end

julia> @code_llvm add(1, 2.0)
; Function Attrs: uwtable
define double @julia_add_71636(i64, double) #0 {
top:
  %2 = sitofp i64 %0 to double
  %3 = fadd double %2, %1
  ret double %3
}
Type system
• Use type, not class
• Define methods outside the type
• Multiple dispatch on types
• Type hierarchy
• Traits for method interface
Use type, not class
• Type!
struct Dog
    name::String
    color::String
end
dog = Dog("Tom", "brown")
dog.name    # "Tom"
dog.color   # "brown"
Define methods outside the type
function color(a::Animal)
    return a.color
end
function voice(d::Dog)
    return "bark"
end
function voice(c::Cat)
    return "meow"
end
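The methods above refer to Animal, Dog and Cat without showing their definitions. A self-contained sketch, assuming an abstract Animal type with two concrete subtypes (the field layout of Cat and the sample pets are guesses for illustration):

```julia
abstract type Animal end

struct Dog <: Animal
    name::String
    color::String
end

struct Cat <: Animal
    name::String
    color::String
end

# Defined once for every Animal subtype that has a color field.
color(a::Animal) = a.color

# Defined separately per concrete type.
voice(d::Dog) = "bark"
voice(c::Cat) = "meow"

pets = [Dog("Tom", "brown"), Cat("Kitty", "white")]
for p in pets
    println(p.name, " is ", color(p), " and says ", voice(p))
end
# Tom is brown and says bark
# Kitty is white and says meow
```

Because methods live outside the types, new behaviour can be added for Dog or Cat without editing their definitions.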
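Multiple dispatch on types
function double(obj::Foo, x)
    return 2*x
end
function double(obj::Bar, x)
    return string(x)^2   # repeat the string twice
end
[Figure: a call double(args...) is routed to the method double(obj::Foo, x::Any) or double(obj::Bar, x::Any) according to the argument types]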
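A runnable sketch of the dispatch shown above, assuming Foo and Bar are empty marker types (the slides do not define them). The third method is an addition that shows dispatch looking at every argument, not only the first.

```julia
struct Foo end
struct Bar end

double(obj::Foo, x) = 2 * x
double(obj::Bar, x) = string(x)^2              # ^ repeats a string
double(obj::Foo, x::AbstractString) = x * x    # more specific second argument

println(double(Foo(), 21))        # 42
println(double(Bar(), 21))        # 2121
println(double(Foo(), "ab"))      # abab
println(length(methods(double)))  # 3
```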
Type Hierarchy
Ref: https://en.wikibooks.org/wiki/Introducing_Julia/Types
Traits for method interface
• Traits define a set of functions
• Implement a trait with types
• Independent of type hierarchy
https://github.com/mauro3/SimpleTraits.jl
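One way to see the idea without any package is the hand-written trait pattern sketched below. It does not use the SimpleTraits.jl API; the trait names Loud and Quiet, the function soundtrait and the sample types are all made up for this illustration.

```julia
# Trait values are ordinary types.
abstract type SoundTrait end
struct Loud <: SoundTrait end
struct Quiet <: SoundTrait end

# Unrelated types with no common supertype besides Any.
struct Dog end
struct Fish end
struct Car end

# Assign a trait per type; the fallback is Quiet.
soundtrait(::Type) = Quiet()
soundtrait(::Type{Dog}) = Loud()
soundtrait(::Type{Car}) = Loud()

# The public function looks up the trait, then dispatches on it.
describe(x::T) where {T} = describe(soundtrait(T), x)
describe(::Loud, x) = "makes noise"
describe(::Quiet, x) = "is silent"

println(describe(Dog()))     # makes noise
println(describe(Car()))     # makes noise
println(describe(Fish()))    # is silent
```

Dog and Car share behaviour here even though they sit in different places of the type hierarchy, which is the point of the last bullet.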
Data science in Julia
DataFrames.jl
julia> using DataFrames
julia> dt = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
4×2 DataFrames.DataFrame
│ Row │ A │ B │
├─────┼───┼───┤
│ 1 │ 1 │ M │
│ 2 │ 2 │ F │
│ 3 │ 3 │ F │
│ 4 │ 4 │ M │
DataFrames.jl
julia> dt[:A]
4-element NullableArrays.NullableArray{Int64,1}:
1
2
3
4
julia> dt[2, :A]
Nullable{Int64}(2)
DataFrames.jl
julia> dt = readtable("data.csv")
julia> dt = DataFrame(A = 1:10);
julia> writetable("output.csv", dt)
DataFrames.jl
julia> names = DataFrame(ID = [1, 2], Name = ["John Doe", "Jane Doe"])
julia> jobs = DataFrame(ID = [1, 2], Job = ["Lawyer", "Doctor"])
julia> full = join(names, jobs, on = :ID)
2×3 DataFrames.DataFrame
│ Row │ ID │ Name     │ Job    │
├─────┼────┼──────────┼────────┤
│ 1   │ 1  │ John Doe │ Lawyer │
│ 2   │ 2  │ Jane Doe │ Doctor │
Query.jl
julia> q1 = @from i in dt begin
           @where i.age > 40
           @select {number_of_children=i.children, i.name}
           @collect DataTable
       end
StatsBase.jl
• Mean Functions
  • mean(x, w)
  • geomean(x)
  • harmmean(x)
• Scalar Statistics
  • var(x, wv[; mean=...])
  • std(x, wv[; mean=...])
  • mean_and_var(x[, wv][, dim])
  • mean_and_std(x[, wv][, dim])
  • zscore(X, μ, σ)
  • entropy(p)
  • crossentropy(p, q)
  • kldivergence(p, q)
  • percentile(x, p)
  • nquantile(x, n)
  • quantile(x)
  • median(x, w)
  • mode(x)
StatsBase.jl
• Sampling from Population
  • sample(a)
• Correlation Analysis of Signals
  • autocov(x, lags[; demean=true])
  • autocor(x, lags[; demean=true])
  • corspearman(x, y)
  • corkendall(x, y)
Distributions.jl
• Continuous Distributions
  • Beta(α, β)
  • Chisq(ν)
  • Exponential(θ)
  • Gamma(α, θ)
  • LogNormal(μ, σ)
  • Normal(μ, σ)
  • Uniform(a, b)
• Discrete Distributions
  • Bernoulli(p)
  • Binomial(n, p)
  • DiscreteUniform(a, b)
  • Geometric(p)
  • Hypergeometric(s, f, n)
  • NegativeBinomial(r, p)
  • Poisson(λ)
GLM.jl
julia> data = DataFrame(X=[1,2,3], Y=[2,4,7])
3x2 DataFrame
| Row # | X | Y |
|-------|---|---|
| 1     | 1 | 2 |
| 2     | 2 | 4 |
| 3     | 3 | 7 |
GLM.jl
julia> OLS = glm(@formula(Y ~ X), data, Normal(), IdentityLink())
DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
Coefficients:
              Estimate  Std.Error   z value  Pr(>|z|)
(Intercept)  -0.666667    0.62361  -1.06904    0.2850
X                  2.5   0.288675   8.66025    <1e-17
GLM.jl
julia> newX = DataFrame(X=[2,3,4]);
julia> predict(OLS, newX, :confint)
3×3 Array{Float64,2}:
 4.33333  1.33845   7.32821
 6.83333  2.09801  11.5687
 9.33333  1.40962  17.257
# The columns of the matrix are the prediction and the 95% lower and upper confidence bounds
Gadfly.jl
Plots.jl
# initialize the attractor
n = 1500
dt = 0.02
σ, ρ, β = 10., 28., 8/3
x, y, z = 1., 1., 1.
# initialize a 3D plot with 1 empty series
plt = path3d(1, xlim=(-25,25), ylim=(-25,25), zlim=(0,50), xlab = "x", ylab = "y", zlab = "z", title = "Lorenz Attractor", marker = 1)
# build an animated gif, saving every 10th frame
@gif for i=1:n
    dx = σ*(y - x) ; x += dt * dx
    dy = x*(ρ - z) - y ; y += dt * dy
    dz = x*y - β*z ; z += dt * dz
    push!(plt, x, y, z)
end every 10
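The update rule inside the @gif loop can be run without Plots.jl. The sketch below (Julia 1.x assumed) copies the same step order and parameters into a hypothetical function lorenz_path that returns the trajectory instead of plotting it.

```julia
# Same step-by-step update as the plotting loop, collected into a vector.
function lorenz_path(n; dt = 0.02, σ = 10.0, ρ = 28.0, β = 8 / 3)
    x, y, z = 1.0, 1.0, 1.0
    path = Vector{NTuple{3,Float64}}(undef, n)
    for i in 1:n
        dx = σ * (y - x);      x += dt * dx
        dy = x * (ρ - z) - y;  y += dt * dy
        dz = x * y - β * z;    z += dt * dz
        path[i] = (x, y, z)
    end
    return path
end

path = lorenz_path(1500)
println(length(path))     # 1500
println(path[1])          # state after the first step
```

Note that each coordinate is updated before the next derivative is computed, exactly as in the slide; this differs slightly from a textbook explicit Euler step, which would evaluate all three derivatives first.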
Data
• JuliaData
  • DataFrames.jl
  • CSV.jl
  • DataStreams.jl
  • CategoricalArrays.jl
• JuliaDB
Machine learning in Julia
https://julialang.org/blog/2017/12/ml&pl-zh_tw
Flux.jl
• 100% pure Julia
• Automatic differentiation
• High-level abstractions and a low-level API
• Integrates smoothly with Julia
• CUDA support
Knet.jl
• 100% pure Julia
• Automatic differentiation
• Low-level API
• Integrates smoothly with Julia
Turing.jl
• Universal probabilistic programming with an intuitive modelling interface
• Hamiltonian Monte Carlo (HMC) sampling
• Gibbs sampling that combines particle MCMC, HMC and many other MCMC algorithms
Learn.jl
• General abstractions and algorithms for modeling and optimization
• Implementations of common models
• Tools for working with datasets
Others
• TensorFlow.jl
• MXNet.jl
• Mocha.jl
• Klara.jl: MCMC inference in Julia
• Mamba.jl: Markov chain Monte Carlo (MCMC) for Bayesian analysis in Julia
Science in Julia
Differential equations
• JuliaDiff
  • ForwardDiff.jl: Forward Mode Automatic Differentiation for Julia
  • ReverseDiff.jl: Reverse Mode Automatic Differentiation for Julia
  • TaylorSeries.jl
• JuliaDiffEq
  • DifferentialEquations.jl
    • Discrete Equations (function maps, discrete stochastic (Gillespie/Markov) simulations)
    • Ordinary Differential Equations (ODEs)
    • Stochastic Differential Equations (SDEs)
    • Differential-Algebraic Equations (DAEs)
    • Delay Differential Equations (DDEs)
    • (Stochastic) Partial Differential Equations ((S)PDEs)
DifferentialEquations.jl
• The most powerful differential equation package on Earth!
• A comparison of the differential equation suites of MATLAB, R, Julia, Python, C, Mathematica, Maple and Fortran
http://www.stochasticlifestyle.com/comparison-differential-equation-solver-suites-matlab-r-julia-python-c-fortran/
Optimization
• JuliaOpt
  • JuMP.jl
  • Convex.jl
Objective types
• Linear
• Convex Quadratic
• Nonlinear (convex and nonconvex)
Constraint types
• Linear
• Convex Quadratic
• Second-order Conic
• Semidefinite
• Nonlinear (convex and nonconvex)
Variable types
• Continuous
• Integer-valued
• Semicontinuous
• Semi-integer
Graph / Network
• JuliaGraphs
  • LightGraphs.jl
  • GraphPlot.jl
Glue language of Julia
Glue
• JuliaPy
• JuliaInterop
Web stack in Julia
Genie: full-stack MVC framework
Escher
Web
• JuliaWeb
  • Requests.jl
  • HttpServer.jl
  • WebSockets.jl
  • HTTPClient.jl
HPC from Intel Labs
Announced at JuliaCon 2016
https://www.slideshare.net/EhsanTotoni/hpat-presentation-at-juliacon-2016
HPC from Intel Labs
• Video: https://www.youtube.com/watch?v=Qa7nfaDacII
• Slides: https://www.slideshare.net/EhsanTotoni/hpat-presentation-at-juliacon-2016
• GitHub
  • 2015: IntelLabs/ParallelAccelerator.jl
  • 2016: IntelLabs/HPAT.jl. High Performance Analytics Toolkit (HPAT) is a Julia-based framework for big data analytics on clusters.
  • 2018: IntelLabs/Latte.jl. A high-performance DSL for deep neural networks in Julia.
JuliaCon Sponsors
Jobs
• Apple, Amazon, Facebook, BlackRock, Ford, Oracle
• Comcast, Massachusetts General Hospital
• Farmers Insurance
• Los Alamos National Laboratory and the National Renewable Energy Laboratory
https://juliacomputing.com/press/2017/01/18/jobs.html
Julia Taiwan
• Community: https://www.facebook.com/groups/JuliaTaiwan/
• News feed: https://www.facebook.com/juliannewstw/
Backup
File
• JuliaIO
  • FileIO.jl
  • JSON.jl
  • LightXML.jl
  • HDF5.jl
  • GZip.jl
Programming
• JuliaCollections
  • Iterators.jl
  • DataStructures.jl
  • SortingAlgorithms.jl
  • FunctionalCollections.jl
  • Combinatorics.jl
Type Hierarchy
Type System
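• Dynamic, yet with some of the advantages of a static type system
• A function argument without a type annotation is treated as Any; adding types can improve performance and robustness
• Parametric types make generic types possible
• The type system is built as a hierarchy that explicitly describes the relationships between types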
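Parametric types and the hierarchy can both be seen in a few lines. This sketch assumes Julia 1.x on a 64-bit machine; Point and norm1 are hypothetical names used only for illustration.

```julia
# A generic type: T can be any subtype of Real.
struct Point{T<:Real}
    x::T
    y::T
end

# One method serves every Point{T}.
norm1(p::Point) = abs(p.x) + abs(p.y)

p = Point(1, 2)
q = Point(1.5, -2.5)
println(typeof(p))                 # Point{Int64}
println(typeof(q))                 # Point{Float64}
println(norm1(q))                  # 4.0

# The hierarchy is explicit and can be queried.
println(supertype(Float64))        # AbstractFloat
println(Int64 <: Integer && Integer <: Real && Real <: Number)   # true
println(Point{Int64} <: Point)     # true
```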
Type System
Tuple
Union
OOP in Julia

COSCUP: Introduction to Julia

Editor's Notes

  • #15 The next generation of macroeconomic models is very computationally intensive, with large datasets and large numbers of variables.
  • #16 First, it is free software. Second, as the models that we use for forecasting and policy analysis grow more complicated, we need a language that can perform computations at high speed.
  • #17 Fast and easy to code.