
Introduction to Julia for Bioinformatics

RIKEN Bioinformatics Seminar

Published in: Science

  1. 1. Introduction to Julia for Bioinformatics
     Kenta Sato (佐藤建太) @ Bioinformatics Research Unit, RIKEN ACCC
     November 19, 2015
  2. 2. Topics
     • About Me
     • Julia
     • BioJulia
     • Julia Updates '15
  3. 3. About Me
     • Graduate student at the University of Tokyo.
     • About two years of experience programming in Julia.
     • Contributing to Julia and its ecosystem:
         https://github.com/docopt/DocOpt.jl
         https://github.com/bicycle1885/IntArrays.jl
         https://github.com/BioJulia/IndexableBitVectors.jl
         https://github.com/BioJulia/WaveletMatrices.jl
         https://github.com/BioJulia/FMIndexes.jl
         https://github.com/isagalaev/highlight.js (Julia support)
         etc.
     • Core developer of BioJulia - https://github.com/BioJulia/Bio.jl
     • Julia Summer of Code 2015 student - http://julialang.org/blog/2015/10/biojulia-sequence-analysis/
  4. 4. JuliaCon 2015 at MIT, Boston
     https://twitter.com/acidflask/status/633349038226690048
  5. 5. Julia is ...
     "Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia’s Base library, largely written in Julia itself, also integrates mature, best-of-breed open source C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing."
     Source: http://julialang.org/
  6. 6. Two-Language Problem
     In technical computing, users work in easier but slower scripting languages, while developers work in harder but faster compiled languages.
  7. 7. Two-Language Problem
     With Julia, both users and developers can use one handy language without sacrificing performance.
  8. 8. Three Virtues of the Julia Language
     • Simple
     • Fast
     • Dynamic
  9. 9. Simple
     Syntax with least astonishment:
     • no semicolons
     • no variable declarations
     • no argument types
     • Unicode support
     • 1-based index
     • blocks end with end
     • no implicit type conversion
     Quick sort in 24 lines:
         quicksort(xs) = quicksort!(copy(xs))
         quicksort!(xs) = quicksort!(xs, 1, endof(xs))

         function quicksort!(xs, lo, hi)
             if lo < hi
                 p = partition(xs, lo, hi)
                 quicksort!(xs, lo, p - 1)
                 quicksort!(xs, p + 1, hi)
             end
             return xs
         end

         function partition(xs, lo, hi)
             pivot = div(lo + hi, 2)
             pvalue = xs[pivot]
             xs[pivot], xs[hi] = xs[hi], xs[pivot]
             j = lo
             @inbounds for i in lo:hi-1
                 if xs[i] ≤ pvalue
                     xs[i], xs[j] = xs[j], xs[i]
                     j += 1
                 end
             end
             xs[j], xs[hi] = xs[hi], xs[j]
             return j
         end
  10. 10. Fast
      Comparable performance to compiled languages. http://julialang.org/
  11. 11. Fast
      The LLVM-backed JIT compiler emits machine code at runtime.
          julia> 4 >> 1  # bitwise right-shift function
          2

          julia> @code_native 4 >> 1
                  .section        __TEXT,__text,regular,pure_instructions
          Filename: int.jl
          Source line: 115
                  pushq   %rbp
                  movq    %rsp, %rbp
                  movl    $63, %ecx
                  cmpq    $63, %rsi
          Source line: 115
                  cmovbeq %rsi, %rcx
                  sarq    %cl, %rdi
                  movq    %rdi, %rax
                  popq    %rbp
                  ret
  12. 12. Dynamic
      No need to precompile your program.
      hello.jl:
          println("hello, world")
      Output:
          $ julia hello.jl
          hello, world
      In REPL:
          julia> include("hello.jl")
          hello, world
  13. 13. Dynamic
      High-level code generation at runtime (macros).
          julia> x = 5
          5

          julia> @assert x > 0 "x should be positive"

          julia> x = -2
          -2

          julia> @assert x > 0 "x should be positive"
          ERROR: AssertionError: x should be positive

          julia> macroexpand(:(@assert x > 0 "x should be positive"))
          :(if x > 0
                nothing
            else
                Base.throw(Base.Main.Base.AssertionError("x should be positive"))
            end)
  14. 14. Who Created Julia?
      Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and Alan Edelman.
      "Soon the team was building their dream language. MIT, where Bezanson is a graduate student, became an anchor for the project, with much of the work being done within computer scientist and mathematician Alan Edelman’s research group. But development of the language remained completely distributed. “Jeff and I didn’t actually meet until we’d been working on it for over a year, and Viral was in India the entire time,” Karpinski says. “So the whole language was designed over email.”"
      Source: "Out in the Open: Man Creates One Programming Language to Rule Them All" - http://www.wired.com/2014/02/julia/
  15. 15. Why Was Julia Created?
      "In short, because we are greedy."
      Source: "Why We Created Julia" - http://julialang.org/blog/2012/02/why-we-created-julia/
  16. 16. Why Was Julia Created?
      The creators wanted a language that has:
      • the speed of C
      • the dynamism of Ruby
      • macros like Lisp
      • mathematical notation like Matlab
      and that is:
      • as usable for general programming as Python
      • as easy for statistics as R
      • as natural for string processing as Perl
      • as powerful for linear algebra as Matlab
      • as good at gluing programs together as the shell
  17. 17. Batteries Included
      You can start technical computing without installing lots of libraries.
      • Numeric types: {8, 16, 32, 64, 128}-bit {signed, unsigned} integers; 16-, 32-, and 64-bit floating-point numbers; and arbitrary-precision numbers.
      • Numerical linear algebra: matrix multiplication, matrix decomposition/factorization, solvers for systems of linear equations, sparse matrices, and more.
      • Random number generator: Mersenne Twister accelerated by SIMD.
  18. 18. Batteries Included
      You can start technical computing without installing lots of libraries.
      • Unicode support
      • Perl-compatible regular expressions (PCRE)
      • Parallel computing
      • Dates and times
      • Unit tests
      • Profiler
      • Package manager
  19. 19. Language Design
  20. 20. Literals
          # Int64
          42
          10_000_000
          # UInt8
          0x1f
          # Float64
          3.14
          6.022e23
          # Bool
          true
          false
          # UnitRange{Int64}
          1:100
          # ASCIIString
          "ascii string"
          # UTF8String
          "UTF8文字列"
          # Regex
          r"^>[^\n]+\n[ACGTN]+"
          # Array{Float64,1} (Vector{Float64})
          [1.0, 1.1, 1.2]
          # Array{Float64,2} (Matrix{Float64})
          [1.0 1.1; 2.0 2.2]
          # Tuple{Int,Float64,ASCIIString}
          (42, 3.14, "ascii string")
          # Dict{ASCIIString,Int64}
          Dict("one" => 1, "two" => 2)
  21. 21. Functions
      All function definitions below are equivalent:
          function func(x, y)
              return x + y
          end

          function func(x, y)
              x + y
          end

          func(x, y) = return x + y

          func(x, y) = x + y
      Force inlining:
          @inline func(x, y) = x + y
      Note: a function this simple will be inlined automatically by the compiler.
  22. 22. Functions - Arguments
      Optional arguments:
          function increment(x, by=1)
              return x + by
          end
          increment(3)     # 4
          increment(3, 2)  # 5
      Keyword arguments (note the semicolon in the argument list):
          function increment(x; by=1)
              return x + by
          end
          increment(3)        # 4
          increment(3, by=2)  # 5
      Variable number of arguments:
          function pushback!(list, vals...)
              for val in vals
                  push!(list, val)
              end
              return list
          end
          pushback!([])        # []
          pushback!([], 1)     # [1]
          pushback!([], 1, 2)  # [1, 2]
  23. 23. Functions - Return Values
      You can return multiple values from a function as a tuple:
          function divrem64(n)
              return n >> 6, n & 0b111111
          end
      And you can receive the returned values with multiple assignment:
          julia> divrem64(1025)
          (16,1)

          julia> d, r = divrem64(1025)
          (16,1)

          julia> d
          16

          julia> r
          1
  24. 24. Functions - Document
      A document string can be attached to a function definition:
          """
          This function computes quotient and remainder
          divided by 64 for a non-negative integer.
          """
          function divrem64(n)
              return n >> 6, n & 0b111111
          end
      In the REPL, you can read the attached document with the ? command:
          help?> divrem64
          search: divrem64 divrem

            This function computes quotient and remainder
            divided by 64 for a non-negative integer.
  25. 25. Types
      Two kinds of types:
      • concrete types: instantiatable
      • abstract types: not instantiatable
  26. 26. Defining Types
      Abstract type:
          abstract AbstractFloat <: Real
      Composite type:
          # mutable
          type Point
              x::Float64
              y::Float64
          end
          # immutable
          immutable Point
              x::Float64
              y::Float64
          end
      Bits type:
          bitstype 64 Int64 <: Signed
      Type alias:
          typealias UInt UInt64
      Enum:
          @enum Vote positive negative
  27. 27. Parametric Types
      Types can take type parameters:
          type Point{T}
              x::T
              y::T
          end
      • Point: abstract type
      • Point{Int64}: concrete type
          • subtype of Point (Point{Int64} <: Point)
          • all of the members (i.e. x and y) are Int64s
          type NucleotideSequence{T<:Nucleotide} <: Sequence
              data::Vector{UInt64}
              ...
          end
  28. 28. Constructors
      Julia automatically generates default constructors.
      • Point(1, 2) creates an object of Point{Int} type.
      • Point(1.0, 2.0) creates an object of Point{Float64} type.
      • Point{Float64}(1, 2) creates an object of Point{Float64} type.
      Users can create custom constructors.
          type Point{T}
              x::T
              y::T
          end

          # outer constructor
          function Point(x)
              return Point(x, x)
          end

          p = Point(1)  #> Point{Int64}(1,1)
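
The parametric type and constructor rules above can be checked with a short sketch. It is written in Julia 1.x syntax (`mutable struct` and `where`), which is an assumption on my part: the slides use the Julia 0.4 keyword `type`. The name `Point2D` is used only to keep the sketch separate from the slide's `Point`.

```julia
# Julia 1.x spelling of the slide's `type Point{T} ... end`.
mutable struct Point2D{T}
    x::T
    y::T
end

# Outer constructor: one argument is used for both coordinates.
Point2D(x) = Point2D(x, x)

p = Point2D(1, 2)            # T is inferred from the arguments
q = Point2D{Float64}(1, 2)   # arguments are converted to Float64
r = Point2D(1)               # goes through the outer constructor

println(typeof(p))
println(Point2D{Int} <: Point2D)        # every instantiation is a subtype of Point2D
println(Point2D{Int} <: Point2D{Real})  # parametric types are invariant
println(isconcretetype(Point2D{Int}), " ", isconcretetype(Point2D))
```

Only the fully parameterized type is concrete; the unparameterized name behaves like an abstract supertype of all its instantiations, which matches the slide's description.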
  29. 29. Memory Layout
      • Compact memory layout like C's structs
      • C-compatible memory layout: you can pass Julia objects to C functions without copying.
      This is especially important in bioinformatics:
      • when defining data structures for efficient algorithms
      • when handling lots of small objects
          julia> @enum Strand forward reverse both unknown

          julia> immutable Exon
                     chrom::Int
                     start::Int
                     stop::Int
                     strand::Strand
                 end

          julia> sizeof(Exon(1, 12345, 12446, forward))
          32
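
A sketch of the same layout check in Julia 1.x, where `immutable` is spelled `struct`. The enum members are prefixed (`strand_forward`, ...) purely as an assumption of this sketch, to avoid shadowing `Base.reverse`; the fields use `Int64` explicitly so the arithmetic does not depend on the platform word size.

```julia
@enum Strand strand_forward strand_reverse strand_both strand_unknown

struct Exon
    chrom::Int64
    start::Int64
    stop::Int64
    strand::Strand   # enums are stored as Int32 by default
end

exon = Exon(1, 12345, 12446, strand_forward)

# 3 x 8 bytes + 4 bytes for the enum, padded to 8-byte alignment.
println(sizeof(exon))
println(isbitstype(Exon))   # plain bits: no pointers, no per-object header

# A vector of exons is one contiguous buffer, like a C array of structs.
exons = [Exon(1, 100i, 100i + 50, strand_forward) for i in 1:1000]
println(sizeof(exons) == length(exons) * sizeof(Exon))
```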
  30. 30. Multiple Dispatch
      The combination of all argument types determines which method is called.
      Single dispatch (e.g. Python): the first argument is special and determines the method.
          class Serializer:
              def write(self, val):
                  if isinstance(val, int):
                      # ...
                  elif isinstance(val, float):
                      # ...
                  # ...
      Multiple dispatch (e.g. Julia): all arguments are equally responsible for determining the method.
          function write(dst::Serializer, val::Int64)
              # ...
          end
          function write(dst::Serializer, val::Float64)
              # ...
          end
          # ...
  31. 31. Multiple Dispatch - Example (1)
      base/char.jl:
          -(x::Char, y::Char)    = Int(x) - Int(y)
          -(x::Char, y::Integer) = Char(Int32(x) - Int32(y))
          +(x::Char, y::Integer) = Char(Int32(x) + Int32(y))
          +(x::Integer, y::Char) = y + x

          julia> 'c' - 'a'
          2

          julia> 'c' - 1
          'b'

          julia> 'a' + 0x01
          'b'

          julia> 0x01 + 'a'
          'b'
  32. 32. Multiple Dispatch - Example (2)
          function has{T<:Integer}(range::UnitRange{Int}, target::T)
              return first(range) ≤ target ≤ last(range)
          end

          function has(iter, target)  # same as has(iter::Any, target::Any)
              for elm in iter
                  if elm == target
                      return true
                  end
              end
              return false
          end

          julia> has(1:10, 4)
          true

          julia> has(1:10, -2)
          false

          julia> has([1, 2, 3], 2)
          true
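
The Serializer comparison can be turned into a small runnable sketch. It assumes Julia 1.x, and the names `Serializer`, `encode`, and the `i...;` / `f...;` output format are hypothetical choices made for illustration (a separate function name is used so that `Base.write` is left untouched).

```julia
struct Serializer
    io::IOBuffer
end

# One generic function, several methods; the method is chosen from the
# types of *both* arguments, with no isinstance-style branching.
encode(dst::Serializer, val::Int64)   = print(dst.io, "i", val, ";")
encode(dst::Serializer, val::Float64) = print(dst.io, "f", val, ";")
encode(dst::Serializer, val)          = throw(ArgumentError("unsupported type $(typeof(val))"))

dst = Serializer(IOBuffer())
encode(dst, 42)
encode(dst, 1.5)
out = String(take!(dst.io))
println(out)
println(length(methods(encode)))
```

Supporting a new value type means adding one more method, possibly in another module, without editing the existing ones.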
  33. 33. Metaprogramming
      Julia can represent its own program code as a data structure (Expr).
      Three metaprogramming components in Julia:
      • Macros generate an expression from expressions. Expr ↦ Expr
      • Generated functions generate an expression from types. Types ↦ Expr
      • Non-standard string literals generate an expression from a string. String ↦ Expr
  34. 34. Metaprogramming - Macros
      Generate an expression from expressions. Expr ↦ Expr
      • Denoted as @<macro name>, so they are distinguishable from function calls.
      • We've already seen some macros.
          macro assert(ex)
              msg = string(ex)
              :($ex ? nothing : throw(AssertionError($msg)))
          end

          julia> x = -1
          -1

          julia> @assert x > 1
          ERROR: AssertionError: x > 1
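
A minimal sketch of the same idea in Julia 1.x. The macro is named `@check` (a hypothetical name) so it does not replace `Base.@assert`, and it wraps the user expression in `esc` so that `x` is looked up in the caller's scope; the slide's simplified version omits that step.

```julia
macro check(ex)
    msg = string(ex)   # the source text of the condition, captured at expansion time
    return :($(esc(ex)) ? nothing : throw(AssertionError($msg)))
end

x = 5
@check x > 0           # passes silently

err = try
    @check x > 10
    nothing
catch e
    e
end
println(err.msg)
```

`@macroexpand @check x > 0` shows the generated expression, in the same spirit as the `macroexpand` call on the earlier slide.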
  35. 35. Metaprogramming - Useful Macros (1)
      @show: print variables, useful for debugging:
          julia> x = -1
          -1

          julia> @show x
          x = -1
      @inbounds: skip bounds checking:
          @inbounds h[i,j] = h[i-1,j-1] + submat[a[i],b[j]]
      @which: show which method will be called:
          julia> @which max(1, 2)
          max{T<:Real}(x::T<:Real, y::T<:Real) at promotion.jl:239
  36. 36. Metaprogramming - Useful Macros (2)
      @time: measure the time taken to evaluate an expression (the first call includes JIT compilation):
          julia> xs = rand(1_000_000);

          julia> @time sum(xs)
            0.022633 seconds (27.24 k allocations: 1.155 MB)
          499795.2805424741

          julia> @time sum(xs)
            0.000574 seconds (5 allocations: 176 bytes)
          499795.2805424741
      @profile: profile the expression:
          julia> sort(xs); @profile sort(xs);

          julia> Profile.print()
          69 REPL.jl; anonymous; line: 92
           68 REPL.jl; eval_user_input; line: 62
          ...
  37. 37. Generated Functions
      Generate specialized program code for the argument types. Type(s) ↦ Expr
      • Called like an ordinary function: the syntax at the call site is indistinguishable.
          @generated function _sub2ind{N,M}(dims::NTuple{N,Integer},
                                            subs::NTuple{M,Integer})
              meta = Expr(:meta, :inline)
              ex = :(subs[$M] - 1)
              for i = M-1:-1:1
                  if i > N
                      ex = :(subs[$i] - 1 + $ex)
                  else
                      ex = :(subs[$i] - 1 + dims[$i] * $ex)
                  end
              end
              Expr(:block, meta, :($ex + 1))
          end
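
A smaller generated function may make the Type(s) ↦ Expr step easier to see. This sketch assumes Julia 1.x (`where` syntax); `tuple_sum` is a hypothetical name and is not part of Base.

```julia
# Inside the body, `t` is bound to the argument's *type*; N and T are known,
# so the loop below runs once per type signature and builds an unrolled sum.
@generated function tuple_sum(t::NTuple{N,T}) where {N,T}
    ex = :(zero($T))
    for i in 1:N
        ex = :($ex + t[$i])
    end
    return ex   # e.g. for N = 3: ((zero(T) + t[1]) + t[2]) + t[3]
end

println(tuple_sum((1, 2, 3)))
println(tuple_sum((1.5, 2.5)))
```

The caller writes an ordinary function call; the expression is generated and compiled the first time each tuple type is seen.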
  38. 38. Non-standard String Literals
      Generate an expression from a string. String ↦ Expr
      • Denoted as <literal name>"..."
      • The regular expression literal (e.g. r"^>[^\n]+\n[ACGTN]+") is an example.
      • In Bio.jl, dna"ACGT" is converted to a DNASequence object.
          macro r_str(s)
              Regex(s)
          end

          # Regex object
          r"^>[^\n]+\n[ACGTN]+"

          # DNASequence object
          dna"ACGT"
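
To illustrate how a literal such as `dna"..."` can be defined, here is a minimal sketch in Julia 1.x. The `nuc"..."` literal is hypothetical and simply produces a `Vector{Char}`; it is not how Bio.jl implements `dna"..."`.

```julia
# A macro named `nuc_str` defines the literal syntax nuc"...".
macro nuc_str(s)
    # Validation runs once, when the code is parsed and expanded.
    for c in s
        c in "ACGTN" || error("invalid nucleotide: $c")
    end
    return :(collect($s))
end

seq = nuc"ACGTN"
println(seq)
println(length(seq))
```

Because the check happens at expansion time, a typo inside a literal is reported before the surrounding code runs.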
  39. 39. Modules
      Modules are namespaces.
      • Names defined directly under a module are treated as global names.
      • The import/export system lets modules exchange names.
          module Foo

          export foo, gvar

          # function
          foo() = println("hello, foo")
          bar() = println("hello, bar")

          # global variable
          const gvar = 42

          end

          Foo.foo()
          Foo.bar()
          Foo.gvar

          import Foo: foo
          foo()

          import Foo: bar
          bar()

          using Foo
          foo()
          gvar
  40. 40. Packages
      A package manager is bundled with Julia.
      • There is no other package manager; this is the standard.
      • The package manager can build, install, and create packages.
      • Almost all packages are hosted on GitHub.
      Registered packages are public packages that can be installed by name.
      • List: http://pkg.julialang.org/
      • Repository: https://github.com/JuliaLang/METADATA.jl
  41. 41. Packages - Management
      The package manager is accessible from the REPL.
      • Pkg.update(): update registered package data and upgrade packages
      How a package is installed depends on whether it is registered:
      • Pkg.add(<package>): install a registered package
      • Pkg.clone(<url>): install a package from its git URL
          julia> Pkg.update()
          julia> Pkg.add("DocOpt")
          julia> Pkg.clone("git@github.com:docopt/DocOpt.jl.git")
  42. 42. Packages - Create a Package
      A package template can be generated with Pkg.generate(<package>).
      • This generates a disciplined scaffold for developing a new package.
      • Generated packages are located in ~/.julia/v0.4/.
      Pkg.tag(<package>, <version>) tags the current commit of the package with a version.
      • The tag is treated as a release of the package.
      • Developers should follow Semantic Versioning:
          • major: incompatible API changes
          • minor: backwards-compatible functionality additions
          • patch: backwards-compatible bug fixes
          julia> Pkg.generate("DocOpt")
          julia> Pkg.tag("DocOpt", :patch)  # patch update
  43. 43. BioJulia
  44. 44. BioJulia
      A collaborative project to build bioinformatics infrastructure for Julia.
      Packages:
      • Bio.jl - https://github.com/BioJulia/Bio.jl
      • Other packages - https://github.com/BioJulia
  45. 45. BioJulia - Basic Principles
      • BioJulia will be fast.
      • All contributions undergo code review.
      • We'll design it to suit modern bioinformatics and Julia, not just copy other Bio-projects.
      https://github.com/BioJulia/Bio.jl/wiki/roadmap
  46. 46. Bio.jl
      Major modules:
      • Bio.Seq: biological sequences
      • Bio.Intervals: genomic intervals
      • Bio.Align: sequence alignments (coming soon!)
      • Bio.Phylo: phylogenetics (coming soon!)
      Under (active!) development.
  48. 48. Sequences
      Sequence types are defined in the Bio.Seq module: DNASequence, RNASequence, AminoAcidSequence, Kmer
          julia> using Bio.Seq

          julia> dna"ACGTN"  # non-standard string literal
          5nt DNA Sequence
           ACGTN

          julia> rna"ACGUN"
          5nt RNA Sequence
           ACGUN

          julia> aa"ARNDCWYV"
          8aa Sequence:
           ARNDCWYV

          julia> kmer(dna"ACGT")
          DNA 4-mer:
           ACGT
  49. 49. Sequences - Packed Nucleotides
      A/C/G/T are packed into an array with 2-bit encoding (+1 bit for N).
          type NucleotideSequence{T<:Nucleotide} <: Sequence
              data::Vector{UInt64}  # 2-bit encoded sequence
              ns::BitVector         # 'N' mask
              ...
          end
      In Kmer, nucleotides are packed into a 64-bit type.
          bitstype 64 Kmer{T<:Nucleotide,K}
          typealias DNAKmer{K} Kmer{DNANucleotide,K}
          typealias RNAKmer{K} Kmer{RNANucleotide,K}
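
The 2-bit packing can be sketched in a few lines of Julia 1.x. The code table (A=0, C=1, G=2, T=3) and the function names `pack_kmer` / `unpack_kmer` are assumptions of this sketch; Bio.jl's internal bit layout may differ.

```julia
const NUC_CODE = Dict('A' => 0x00, 'C' => 0x01, 'G' => 0x02, 'T' => 0x03)
const NUC_CHAR = ('A', 'C', 'G', 'T')

# Pack up to 32 nucleotides into one UInt64, first base in the highest used bits.
function pack_kmer(s::AbstractString)
    length(s) <= 32 || throw(ArgumentError("at most 32 nucleotides fit in 64 bits"))
    x = UInt64(0)
    for c in s
        x = (x << 2) | NUC_CODE[c]
    end
    return x
end

# Recover the k nucleotides by peeling off two bits at a time from the low end.
function unpack_kmer(x::UInt64, k::Integer)
    chars = Vector{Char}(undef, k)
    for i in k:-1:1
        chars[i] = NUC_CHAR[Int(x & 0x03) + 1]
        x >>= 2
    end
    return String(chars)
end

packed = pack_kmer("ACGT")
println(bitstring(packed)[end-7:end])   # the low 8 bits hold the four 2-bit codes
println(unpack_kmer(packed, 4))
```

With this layout a k-mer is a plain integer, so comparison and hashing are single machine operations, and 32 bases take 8 bytes instead of 32.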
  50. 50. Sequences - Immutable by Convention
      Sequences are immutable by convention.
      • No copy is made when creating a subsequence from an existing sequence.
          julia> seq = dna"ACGTATG"
          7nt DNA Sequence
           ACGTATG

          julia> seq[2:4]
          3nt DNA Sequence
           CGT

          # internal data is shared between
          # the original and its subsequences
          julia> seq.data === seq[2:4].data
          true
  51. 51. Intervals
      Genomic interval types are defined in the Bio.Intervals module:
      • Interval{T}: T is the type of the metadata attached to the interval.
          type Interval{T} <: AbstractInterval{Int64}
              seqname::StringField
              first::Int64
              last::Int64
              strand::Strand
              metadata::T
          end
      This is useful when annotating a genomic range:
          julia> using Bio.Intervals

          julia> Interval("chr2", 5692667, 5701385, '+', "SOX11")
          chr2:5692667-5701385    +    SOX11
  52. 52. Intervals - Indexed Collections
      A set of intervals can be indexed with IntervalCollection:
          immutable CDS; gene::ASCIIString; index::Int; end

          ivals = IntervalCollection{CDS}()
          push!(ivals, Interval("chr6", 156777930, 156779471, '+',
                                CDS("ARID1B", 1)))
          push!(ivals, Interval("chr6", 156829227, 156829421, '+',
                                CDS("ARID1B", 2)))
          push!(ivals, Interval("chr6", 156901376, 156901525, '+',
                                CDS("ARID1B", 3)))
      intersect iterates over intersecting intervals:
          julia> query = Interval("chr6", 156829200, 156829300);

          julia> for i in intersect(ivals, query)
                     println(i)
                 end
          chr6:156829227-156829421    +    CDS("ARID1B",2)
  53. 53. Parsers
      Parsers are generated with the Ragel state machine compiler.
      • Finite state machines are described in a regular language.
      • The Ragel compiler generates pure Julia programs.
      • Actions can be injected into state transitions.
      The next Ragel release (v7) will ship with the Julia generator.
      http://www.colm.net/open-source/ragel/
  54. 54. Parsers - FASTA
          <name> = <expression> > <entering action> % <leaving action>;
      FASTA parser:
          newline     = '\r'? '\n'  >count_line;
          hspace      = [ \t\v];
          whitespace  = space | newline;

          identifier  = (any - space)+  >mark  %identifier;
          description = ((any - hspace) [^\r\n]*)  >mark  %description;
          letters     = (any - space - '>')+  >mark  %letters;
          sequence    = whitespace* letters? (whitespace+ letters)*;
          fasta_entry = '>' identifier (hspace+ description)? newline
                        sequence whitespace*;

          main := whitespace* (fasta_entry %finish_match)**;
      https://github.com/BioJulia/Bio.jl/blob/master/src/seq/fasta.rl
      https://github.com/BioJulia/Bio.jl/blob/master/src/seq/fasta.jl
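
For readers who do not know Ragel, the records that the grammar above recognizes can be sketched with a hand-written reader in Julia 1.x. This is not the generated Bio.jl parser and is far less strict or fast; `read_fasta` is a hypothetical helper that keeps the identifier and description together as one header string.

```julia
# Read FASTA records as (header, sequence) pairs from any IO stream.
function read_fasta(io::IO)
    records = Tuple{String,String}[]
    header = nothing
    seq = IOBuffer()
    for raw in eachline(io)
        line = strip(raw)
        isempty(line) && continue
        if startswith(line, ">")
            # A new header closes the previous record, if any.
            header === nothing || push!(records, (header, String(take!(seq))))
            header = String(line[2:end])
        else
            header === nothing && error("sequence data before the first header")
            print(seq, line)   # sequence lines are concatenated
        end
    end
    header === nothing || push!(records, (header, String(take!(seq))))
    return records
end

records = read_fasta(IOBuffer(">seq1 example\nACGT\nNN\n\n>seq2\nTTGA\n"))
for (header, sequence) in records
    println(header, " => ", sequence)
end
```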
  55. 55. Parsers - Fast
      Ragel can generate fast parsers.
          julia> @time for rec in open("hg38.fa", FASTA)
                     println(rec)
                 end
          >chr1
          248956422nt Mutable DNA Sequence
           NNNNNNNNNNNNNNNNNNNNNNN…NNNNNNNNNNNNNNNNNNNNNNNN
          >chr10
          133797422nt Mutable DNA Sequence
           NNNNNNNNNNNNNNNNNNNNNNN…NNNNNNNNNNNNNNNNNNNNNNNN
          # ...
          >chrY_KI270740v1_random
          37240nt Mutable DNA Sequence
           TAATAAATTTTGAAGAAAATGAA…GAATGAAGCTGCAGACATTTACGG
           32.198314 seconds (174.92 k allocations: 1.464 GB, 1.14% gc time)
  56. 56. Alignments
      The Bio.Align module supports various pairwise alignment types.
      Score maximization:
      • GlobalAlignment
      • SemiGlobalAlignment
      • OverlapAlignment
      • LocalAlignment
      Cost minimization:
      • EditDistance
      • LevenshteinDistance
      • HammingDistance
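
The two simplest cost-minimizing types can be sketched directly in Julia 1.x. These are textbook implementations written for illustration, not the Bio.Align code; `levenshtein` and `hamming` are hypothetical helper names.

```julia
# Levenshtein distance: minimum number of substitutions, insertions, and
# deletions, computed by dynamic programming with two rolling rows.
function levenshtein(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    prev = collect(0:length(t))          # distances from the empty prefix of `a`
    for i in 1:length(s)
        curr = similar(prev)
        curr[1] = i
        for j in 1:length(t)
            cost = s[i] == t[j] ? 0 : 1
            curr[j + 1] = min(prev[j + 1] + 1,   # deletion
                              curr[j] + 1,       # insertion
                              prev[j] + cost)    # match or substitution
        end
        prev = curr
    end
    return prev[end]
end

# Hamming distance: number of mismatching positions; no gaps allowed.
function hamming(a::AbstractString, b::AbstractString)
    length(a) == length(b) || throw(ArgumentError("sequences must have equal length"))
    return count(((x, y),) -> x != y, zip(a, b))
end

println(levenshtein("kitten", "sitting"))
println(hamming("ACGT", "ACCT"))
```

The score-maximizing alignment types follow the same dynamic-programming pattern, with a substitution matrix and gap penalties in place of unit costs.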
  57. 57. Alignments - Simple Interfaces (1)
          julia> affinegap = AffineGapScoreModel(match=5,
                                                 mismatch=-4,
                                                 gap_open=-3,
                                                 gap_extend=-2);

          julia> pairalign(GlobalAlignment(),
                           dna"ATGGTGACT",
                           dna"ACGTGCCCT",
                           affinegap)
          PairwiseAlignment{Int64,Bio.Seq.NucleotideSequence{Bio.Seq.DNANucleotide},B
            score: 12
            seq: ATGGTGAC-T
                 | | || | |
            ref: ACG-TGCCCT
  58. 58. Alignments - Simple Interfaces (2)
          pairalign(<type>, <seq1>, <seq2>, <score/cost model>)

          pairalign(GlobalAlignment(), a, b, model)
          pairalign(SemiGlobalAlignment(), a, b, model)
          pairalign(OverlapAlignment(), a, b, model)
          pairalign(LocalAlignment(), a, b, model)
          pairalign(EditDistance(), a, b, model)
          pairalign(LevenshteinDistance(), a, b)
          pairalign(HammingDistance(), a, b)
      Alignment options:
          pairalign(GlobalAlignment(), a, b, model, banded=true)
          pairalign(GlobalAlignment(), a, b, model, score_only=true)
  59. 59. Alignments - Speed (1)
      Global alignment of titin sequences (human and mouse):
          affinegap = AffineGapScoreModel(BLOSUM62, -10, -1)
          a = first(open("Q8WZ42.fasta", FASTA)).seq
          b = first(open("A2ASS6.fasta", FASTA)).seq
          @time aln = pairalign(
              GlobalAlignment(),
              Vector{AminoAcid}(a),
              Vector{AminoAcid}(b),
              affinegap,
          )
          println(score(aln))

            8.012499 seconds (601.99 k allocations: 1.155 GB, 0.09% gc time)
          165611
      vs. R (Biostrings):
             user  system elapsed
           14.042   1.233  15.475
  60. 60. Alignments - Speed (2)
      vs. R (Biostrings):
             user  system elapsed
           14.042   1.233  15.475

          library(Biostrings, quietly=T)
          a = readAAStringSet("Q8WZ42.fasta")[[1]]
          b = readAAStringSet("A2ASS6.fasta")[[1]]
          t0 = proc.time()
          aln = pairwiseAlignment(a, b, type="global",
                                  substitutionMatrix="BLOSUM62",
                                  gapOpening=10, gapExtension=1)
          t1 = proc.time()
          print(t1 - t0)
          print(score(aln))
  61. 61. Indexable Bit Vectors
      Bit vectors that support bit counting in constant time.
      • rank1(bv, i): count the number of 1 bits within bv[1:i].
      • rank0(bv, i): count the number of 0 bits within bv[1:i].
      A fundamental building block for other data structures.
      • WaveletMatrix, a generalization of the indexable bit vector, depends on this data structure.
      • 'N' nucleotides in a reference sequence can be compressed using this data structure.
          julia> bv = SucVector(bitrand(10_000_000));

          julia> rank1(bv, 9_000_000);  # precompile

          julia> @time rank1(bv, 9_000_000)
            0.000006 seconds (149 allocations: 10.167 KB)
          4502258
  62. 62. Indexable Bit Vectors - Internals
      A bit vector is divided into 256-bit large blocks, and each large block is divided into 64-bit small blocks:
          immutable Block
              # large block
              large::UInt32
              # small blocks
              smalls::NTuple{4,UInt8}
              # bit chunks (64 bits × 4 = 256 bits)
              chunks::NTuple{4,UInt64}
          end
      Each block has a cache that counts the number of 1s.
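
The cached-count idea can be sketched in Julia 1.x with a single block level. `RankIndex` is a hypothetical type, not the `SucVector` of IndexableBitVectors.jl: it caches one cumulative count per 64 bits and scans the remaining bits naively, whereas the real structure uses two block levels and popcount on the chunks.

```julia
struct RankIndex
    bits::BitVector
    cumulative::Vector{Int}   # cumulative[k + 1] = number of 1s in bits[1:64k]
end

function RankIndex(bits::BitVector)
    nblocks = cld(length(bits), 64)
    cumulative = zeros(Int, nblocks + 1)
    for k in 1:nblocks
        lo = 64 * (k - 1) + 1
        hi = min(64 * k, length(bits))
        cumulative[k + 1] = cumulative[k] + count(view(bits, lo:hi))
    end
    return RankIndex(bits, cumulative)
end

# Number of 1 bits in bits[1:i]: cached count of the full blocks plus a
# scan of at most 63 bits in the partially covered block.
function rank1(idx::RankIndex, i::Integer)
    0 <= i <= length(idx.bits) || throw(BoundsError(idx.bits, i))
    k = div(i, 64)
    return idx.cumulative[k + 1] + count(view(idx.bits, (64 * k + 1):i))
end

rank0(idx::RankIndex, i::Integer) = i - rank1(idx, i)

bits = falses(200)
for i in (1, 64, 65, 130)
    bits[i] = true
end
idx = RankIndex(bits)
println(rank1(idx, 64), " ", rank1(idx, 65), " ", rank0(idx, 200))
```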
  63. 63. FM-Indexes
      • Index for full-text search.
      • Fast, compact, and often used in short-read sequence mappers (Bowtie2, BWA, etc.).
      • Product of Julia Summer of Code 2015: https://github.com/BioJulia/FMIndexes.jl
      This package is not specialized for biological sequences.
      • FMIndexes.jl does not depend on Bio.jl.
      • The JIT compiler can optimize code for a specific type at runtime.
          julia> fmindex = FMIndex(dna"ACGTATTGACTGTA");

          julia> count(dna"TA", fmindex)
          2

          julia> count(dna"TATT", fmindex)
          1
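
What `count` computes can be sketched with a naive backward search over the Burrows-Wheeler transform, in Julia 1.x. This is a teaching sketch under stated assumptions, not the FMIndexes.jl implementation: `bwt_of` and `backward_count` are hypothetical names, the suffix array is built by sorting whole suffixes, and the occurrence counts are recomputed by scanning, where a real FM-index answers them with rank queries on a compact structure.

```julia
# Burrows-Wheeler transform via a naively sorted suffix array.
function bwt_of(text::AbstractString)
    s = vcat(collect(text), '\$')        # sentinel sorts before A, C, G, T
    n = length(s)
    sa = sortperm([String(s[i:n]) for i in 1:n])
    return [s[i == 1 ? n : i - 1] for i in sa]
end

# Count occurrences of `pattern` by narrowing a suffix-array range [lo, hi],
# consuming the pattern from its last character to its first.
function backward_count(last_col::Vector{Char}, pattern::AbstractString)
    sorted = sort(last_col)
    # smaller[c]: number of characters in the text strictly smaller than c
    smaller = Dict(c => searchsortedfirst(sorted, c) - 1 for c in unique(last_col))
    occ(c, i) = count(==(c), view(last_col, 1:i))   # rank of c in last_col[1:i]
    lo, hi = 1, length(last_col)
    for c in reverse(pattern)
        haskey(smaller, c) || return 0
        lo = smaller[c] + occ(c, lo - 1) + 1
        hi = smaller[c] + occ(c, hi)
        lo > hi && return 0
    end
    return hi - lo + 1
end

last_col = bwt_of("ACGTATTGACTGTA")
println(backward_count(last_col, "TA"))
println(backward_count(last_col, "TATT"))
```

The loop runs once per pattern character, so the number of steps depends on the pattern length rather than on the text length once `occ` is answered in constant time.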
  64. 64. FM-Indexes - Queries
      Create an FM-Index for chromosome 22:
          julia> fmindex = FMIndex(first(open("chr22.fa", FASTA)).seq);
      count(pattern, index): count the number of occurrences of pattern:
          julia> count(dna"ACGT", fmindex)
          37672

          julia> count(dna"ACGTACGT", fmindex)
          42
  65. 65. FM-Indexes - Queries
      Create an FM-Index for chromosome 22:
          julia> fmindex = FMIndex(first(open("chr22.fa", FASTA)).seq);
      locate(pattern, index): locate the positions of pattern:
          # locate returns an iterator
          julia> locate(dna"ACGTACGT", fmindex) |> collect
          42-element Array{Any,1}:
           20774876
                  ⋮
           22729149

          # locateall returns an array
          julia> locateall(dna"ACGTACGT", fmindex)
          42-element Array{Int64,1}:
           20774876
                  ⋮
           22729149
  66. 66. Other Julia Orgs You Should Know
      Statistics - JuliaStats https://github.com/JuliaStats
      • https://github.com/JuliaStats/StatsBase.jl
      • https://github.com/JuliaStats/DataFrames.jl
      • https://github.com/JuliaStats/Clustering.jl
      • https://github.com/JuliaStats/Distributions.jl
      • https://github.com/JuliaStats/MultivariateStats.jl
      • https://github.com/JuliaStats/NullableArrays.jl
      • https://github.com/JuliaStats/GLM.jl
  67. 67. Other Julia Orgs You Should Know
      Optimization - JuliaOpt https://github.com/JuliaOpt
      • https://github.com/JuliaOpt/JuMP.jl
      • https://github.com/JuliaOpt/Optim.jl
      • https://github.com/JuliaOpt/Convex.jl
      Graphs - JuliaGraphs https://github.com/JuliaGraphs
      • https://github.com/JuliaGraphs/LightGraphs.jl
      Database - JuliaDB https://github.com/JuliaDB
      • https://github.com/JuliaDB/SQLite.jl
      • https://github.com/JuliaDB/PostgreSQL.jl
  68. 68. Julia Updates '15
  72. 72. Julia Updates '15
      • Julia Computing Inc. was founded.
        "Why the creators of the Julia programming language just launched a startup" - http://venturebeat.com/2015/05/18/why-the-creators-of-the-julia-programming-language-just-launched-a-startup/
      • The Moore Foundation granted Julia Computing $600,000.
        "Bringing Julia from beta to 1.0 to support data-intensive, scientific computing" - https://www.moore.org/newsroom/in-the-news/2015/11/10/bringing-julia-from-beta-to-1.0-to-support-data-intensive-scientific-computing
      • Multi-threading support: https://github.com/JuliaLang/julia/pull/13410
      • Intel released ParallelAccelerator.jl: https://github.com/IntelLabs/ParallelAccelerator.jl