This document provides an introduction to the Julia programming language for bioinformatics. It discusses the creators and goals of Julia and key features such as speed, simplicity, and dynamism. It also covers Julia's batteries-included approach, with built-in types, functions, and libraries. The document gives examples of Julia language elements such as literals, functions, and types, and of metaprogramming capabilities such as macros.
3. About Me
Graduate school student at the University of Tokyo.
About two years of experience with Julia programming.
Contributing to Julia and its ecosystem:
https://github.com/docopt/DocOpt.jl
https://github.com/bicycle1885/IntArrays.jl
https://github.com/BioJulia/IndexableBitVectors.jl
https://github.com/BioJulia/WaveletMatrices.jl
https://github.com/BioJulia/FMIndexes.jl
https://github.com/isagalaev/highlight.js (Julia support)
etc.
Core developer of BioJulia - https://github.com/BioJulia/Bio.jl
Julia Summer of Code 2015 Student -
http://julialang.org/blog/2015/10/biojulia-sequence-analysis/
9. Simple
Syntax with least astonishment
no semicolons
no variable declarations
no argument types
Unicode support
1-based index
blocks end with end
No implicit type conversion
Quicksort in 24 lines:
quicksort(xs) = quicksort!(copy(xs))
quicksort!(xs) = quicksort!(xs, 1, endof(xs))
function quicksort!(xs, lo, hi)
    if lo < hi
        p = partition(xs, lo, hi)
        quicksort!(xs, lo, p - 1)
        quicksort!(xs, p + 1, hi)
    end
    return xs
end
function partition(xs, lo, hi)
    pivot = div(lo + hi, 2)
    pvalue = xs[pivot]
    xs[pivot], xs[hi] = xs[hi], xs[pivot]
    j = lo
    @inbounds for i in lo:hi-1
        if xs[i] ≤ pvalue
            xs[i], xs[j] = xs[j], xs[i]
            j += 1
        end
    end
    xs[j], xs[hi] = xs[hi], xs[j]
    return j
end
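A few of the points listed on this slide can be checked in a short session. The following is only a small sketch, assuming Julia 1.x; the variable names (`θ`, `xs`, `mixed_ok`, `total`) are made up for illustration and are not from the slides.

```julia
# Unicode identifiers are allowed; no declarations or semicolons are needed.
θ = 0.5
xs = [10, 20, 30]

# Indexing is 1-based.
first_item = xs[1]

# No implicit conversion between strings and numbers:
# adding a string to an integer raises a MethodError.
mixed_ok = try
    1 + "2"
    true
catch err
    err isa MethodError ? false : rethrow()
end

# Conversions have to be written out explicitly.
total = 1 + parse(Int, "2")
```

Here `mixed_ok` ends up `false` because no `+` method accepts an integer and a string, while the explicit `parse` call succeeds.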
11. Fast
The LLVM-backed JIT compiler emits machine code at runtime.
julia> 4 >> 1  # bitwise right-shift function
2
julia> @code_native 4 >> 1
        .section        __TEXT,__text,regular,pure_instructions
Filename: int.jl
Source line: 115
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $63, %ecx
        cmpq    $63, %rsi
Source line: 115
        cmovbeq %rsi, %rcx
        sarq    %cl, %rdi
        movq    %rdi, %rax
        popq    %rbp
        ret
12. Dynamic
No need to precompile your program.
hello.jl:
println("hello, world")
Output:
$ julia hello.jl
hello, world
In REPL:
julia> include("hello.jl")
hello, world
14. Who Created?
Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and Alan Edelman
Soon the team was building their dream language. MIT, where Bezanson is a graduate student, became an anchor for the project, with much of the work being done within computer scientist and mathematician Alan Edelman’s research group. But development of the language remained completely distributed. “Jeff and I didn’t actually meet until we’d been working on it for over a year, and Viral was in India the entire time,” Karpinski says. “So the whole language was designed over email.”
— "Out in the Open: Man Creates One Programming Language to Rule Them All"
http://www.wired.com/2014/02/julia/
16. Why Created?
The creators wanted a language that satisfies:
the speed of C
with the dynamism of Ruby
macros like Lisp
mathematical notations like Matlab
as usable for general programming as Python
as easy for statistics as R
as natural for string processing as Perl
as powerful for linear algebra as Matlab
as good at gluing programs together as the shell
17. Batteries Included
You can start technical computing without installing lots of libraries.
Numeric types
  {8, 16, 32, 64, 128}-bit {signed, unsigned} integers, 16-, 32-, and 64-bit floating-point numbers, and arbitrary-precision numbers
Numerical linear algebra
  matrix multiplication, matrix decomposition/factorization, solvers for systems of linear equations, and more!
  sparse matrices
Random number generator
  Mersenne Twister method accelerated by SIMD
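As a rough illustration of these built-ins, here is a minimal sketch assuming Julia 1.x, where linear algebra and random-number utilities live in the `LinearAlgebra` and `Random` standard libraries that ship with Julia (so nothing needs to be installed). The matrix, right-hand side, and seed are arbitrary example values.

```julia
using LinearAlgebra   # standard library: factorizations, norms, ...
using Random          # standard library: seeding and RNG types

# Fixed-width integers and floats are built in.
a = Int8(100)
b = UInt64(42)
h = Float16(1.5)

# Arbitrary precision: this factorial would overflow a 64-bit integer.
big_fact = factorial(big(30))

# Solve a small linear system A * x = rhs with the backslash operator.
A = [4.0 1.0; 1.0 3.0]
rhs = [1.0, 2.0]
x = A \ rhs

# LU factorization of the same matrix.
F = lu(A)

# Seeded Mersenne Twister generator (the exact stream depends on the Julia version).
rng = MersenneTwister(1234)
r = rand(rng, 3)
```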
18. Batteries Included
You can start technical computing without installing lots of libraries.
Unicode support
Perl-compatible regular expressions (PCRE)
Parallel computing
Dates and times
Unit tests
Profiler
Package manager
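Some of the items above can be tried directly. This is a small sketch assuming Julia 1.x, where `Dates` and `Test` are standard libraries bundled with Julia; the FASTA-style header string and the dates are invented for the example.

```julia
using Dates   # standard library
using Test    # standard library

# Perl-compatible regular expression: capture the identifier of a FASTA-style header.
m = match(r"^>(\S+)", ">chr1 example record")
seq_id = m.captures[1]

# Date arithmetic.
d = Date(2015, 10, 1) + Day(30)

# Unit tests.
@testset "batteries" begin
    @test seq_id == "chr1"
    @test d == Date(2015, 10, 31)
end
```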
21. Functions
All function definitions below are equivalent:
function func(x, y)
    return x + y
end
function func(x, y)
    x + y
end
func(x, y) = return x + y
func(x, y) = x + y
Force inlining:
@inline func(x, y) = x + y
This simple function will be automatically inlined by the compiler.
23. Functions Return Values
You can return multiple values from a function as a tuple:
function divrem64(n)
    return n >> 6, n & 0b111111
end
You can unpack the returned values with multiple assignment:
julia> divrem64(1025)
(16,1)
julia> d, r = divrem64(1025)
(16,1)
julia> d
16
julia> r
1
24. Functions Document
A document string can be attached to a function definition:
"""
This function computes quotient and remainder
divided by 64 for a non-negative integer.
"""
function divrem64(n)
    return n >> 6, n & 0b111111
end
In the REPL, you can read the attached document with the ? command:
help?> divrem64
search: divrem64 divrem
  This function computes quotient and remainder
  divided by 64 for a non-negative integer.
25. Types
Two kinds of types:
concrete types: instantiatable
abstract types: not instantiatable
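The distinction can be sketched as follows, assuming Julia 1.x syntax (`abstract type ... end` and `struct`). The `Shape`, `Circle`, `Square`, and `area` names are hypothetical and chosen only for illustration.

```julia
# An abstract type cannot be instantiated; it only groups related types.
abstract type Shape end

# Concrete types can be instantiated and may declare an abstract supertype.
struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

area(c::Circle) = π * c.radius^2
area(s::Square) = s.side^2

# A vector typed by the abstract supertype can hold any concrete subtype.
shapes = Shape[Circle(1.0), Square(2.0)]
areas = map(area, shapes)
```

`isabstracttype(Shape)` and `isconcretetype(Circle)` can be used to check which kind a given type is.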
27. Parametric Types
Types can take type parameters:
type Point{T}
    x::T
    y::T
end
Point: abstract type
Point{Int64}: concrete type
  subtype of Point (Point{Int64} <: Point)
  all of the members (i.e. x and y) are Int64s
type NucleotideSequence{T<:Nucleotide} <: Sequence
    data::Vector{UInt64}
    ...
end
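The subtype relation can be checked interactively. The sketch below assumes Julia 1.x, where the slide's `type` keyword is spelled `mutable struct`; the values passed to the constructors are arbitrary.

```julia
# Julia 1.x spelling of the Point type from the slide.
mutable struct Point{T}
    x::T
    y::T
end

p = Point{Int64}(1, 2)
q = Point(1.5, 2.5)            # T is inferred as Float64

# Every parameterized Point is a subtype of the unparameterized Point...
is_sub = Point{Int64} <: Point

# ...but type parameters are invariant: Point{Int64} is not a Point{Real}.
is_covariant = Point{Int64} <: Point{Real}
```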
28. Constructors
Julia automatically generates default constructors.
Point(1, 2) creates an object of Point{Int} type.
Point(1.0, 2.0) creates an object of Point{Float64} type.
Point{Float64}(1, 2) creates an object of Point{Float64} type.
Users can create custom constructors.
type Point{T}
    x::T
    y::T
end
# outer constructor
function Point(x)
    return Point(x, x)
end
p = Point(1)  # > Point{Int64}(1,1)
29. Memory Layout
Compact memory layout like C's structs
C compatible memory layout
You can pass Julia objects to C functions without copying.
This is especially important in bioinformatics
  when defining data structures for efficient algorithms
  when handling lots of small objects
julia> @enum Strand forward reverse both unknown
julia> immutable Exon
           chrom::Int
           start::Int
           stop::Int
           strand::Strand
       end
julia> sizeof(Exon(1, 12345, 12446, forward))
32
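To see the compact layout in practice, here is a minimal sketch assuming Julia 1.x, where `immutable` is spelled `struct`. The enum members are renamed to `fwd` and `rev` (an assumption of this sketch, not the slide's names) so that they cannot clash with `Base.reverse`; the exon coordinates are invented.

```julia
@enum Strand fwd rev both unknown

struct Exon
    chrom::Int
    start::Int
    stop::Int
    strand::Strand
end

exons = [Exon(1, 100 * i, 100 * i + 50, fwd) for i in 1:1000]

# isbitstype: plain data with no pointers inside,
# so a Vector{Exon} stores its elements inline, like an array of C structs.
inline = isbitstype(Exon)
bytes_per_exon = sizeof(exons) ÷ length(exons)
```

Because the elements are stored inline, `bytes_per_exon` equals `sizeof(Exon)` and there is no per-object header or pointer indirection.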
30. Multiple Dispatch
The combination of all argument types determines which method is called.
Single dispatch (e.g. Python)
  The first argument is special and determines a method.
class Serializer:
    def write(self, val):
        if isinstance(val, int):
            ...
        elif isinstance(val, float):
            ...
        ...
Multiple dispatch (e.g. Julia)
  All arguments are equally responsible for determining a method.
function write(dst::Serializer,
               val::Int64)
    # ...
end
function write(dst::Serializer,
               val::Float64)
    # ...
end
# ...
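A runnable variant of the Julia side might look like the sketch below (Julia 1.x assumed). The `Serializer` here is a hypothetical stand-in that only collects text fragments, and the function is named `emit!` rather than `write` so that the sketch does not touch `Base.write`.

```julia
# Hypothetical serializer that just collects text fragments.
struct Serializer
    parts::Vector{String}
end
Serializer() = Serializer(String[])

# One method per combination of argument types; no isinstance-style branching.
emit!(dst::Serializer, val::Integer)        = push!(dst.parts, "int:$val")
emit!(dst::Serializer, val::AbstractFloat)  = push!(dst.parts, "float:$val")
emit!(dst::Serializer, val::AbstractString) = push!(dst.parts, "str:$val")

s = Serializer()
emit!(s, 1)
emit!(s, 2.5)
emit!(s, "ACGT")
```

Supporting a new value type means adding another method, without editing the existing ones.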
33. Metaprogramming
Julia can represent its own program code as a data structure (Expr).
Three metaprogramming components in Julia:
Macros
  generate an expression from expressions.
  Expr ↦ Expr
Generated functions
  generate an expression from types.
  Types ↦ Expr
Non-standard string literals
  generate an expression from a string.
  String ↦ Expr
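What "code as a data structure" means can be seen by quoting an expression and inspecting it. This is a small sketch assuming Julia 1.x; the expressions themselves are arbitrary examples.

```julia
# Quote an expression instead of evaluating it.
ex = :(a + b * 2)

head = ex.head        # the kind of expression (a function call here)
args = ex.args        # the callee followed by its arguments

# Expressions can also be built by hand and evaluated later.
ex2 = Expr(:call, :+, 1, Expr(:call, :*, 2, 3))
value = eval(ex2)
```

The hand-built `ex2` is the same object the parser produces for `1 + 2 * 3`, which is what lets macros and generated functions construct code programmatically.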
34. Metaprogramming Macros
Generate an expression from expressions.
  Expr ↦ Expr
Denoted as @<macroname>.
  Distinguishable from function calls
We've already seen some macros.
macro assert(ex)
    msg = string(ex)
    :($ex ? nothing : throw(AssertionError($msg)))
end
julia> x = -1
-1
julia> @assert x > 1
ERROR: AssertionError: x > 1
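As a further illustration of the Expr ↦ Expr idea, the sketch below defines a hypothetical `@twice` macro (not part of Julia or of the slides) and inspects its expansion with `@macroexpand`, assuming Julia 1.x.

```julia
# Hypothetical macro: splice the given expression into the output two times.
macro twice(ex)
    quote
        $(esc(ex))
        $(esc(ex))
    end
end

counter = Ref(0)
@twice counter[] += 1

# @macroexpand returns the generated expression without running it.
expanded = @macroexpand @twice counter[] += 1
```

`esc` keeps the user's expression referring to the caller's variables instead of being renamed by macro hygiene.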
35. Metaprogramming Useful Macros (1)
@show: print variables, useful for debugging:
julia> x = -1
-1
julia> @show x
x = -1
@inbounds: omit bounds checking:
@inbounds h[i,j] = h[i-1,j-1] + submat[a[i],b[j]]
@which: return which method will be called:
julia> @which max(1, 2)
max{T<:Real}(x::T<:Real, y::T<:Real) at promotion.jl:239
36. Metaprogramming Useful Macros (2)
@time: measure elapsed time to evaluate the expression:
julia> xs = rand(1_000_000);
julia> @time sum(xs)
  0.022633 seconds (27.24 k allocations: 1.155 MB)
499795.2805424741
julia> @time sum(xs)
  0.000574 seconds (5 allocations: 176 bytes)
499795.2805424741
@profile: profile the expression:
julia> sort(xs); @profile sort(xs);
julia> Profile.print()
69 REPL.jl; anonymous; line: 92
 68 REPL.jl; eval_user_input; line: 62
...
37. Generated Functions
Generate specialized program code for argument types.
Type(s) ↦ Expr
Called like an ordinary function:
the syntax at the call site is indistinguishable.
@generated function _sub2ind{N,M}(dims::NTuple{N,Integer},
                                  subs::NTuple{M,Integer})
    meta = Expr(:meta, :inline)
    ex = :(subs[$M] - 1)
    for i = M-1:-1:1
        if i > N
            ex = :(subs[$i] - 1 + $ex)
        else
            ex = :(subs[$i] - 1 + dims[$i]*$ex)
        end
    end
    Expr(:block, meta, :($ex + 1))
end
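A reduced version of the same technique in Julia 1.x syntax (where {N} instead of the curly-brace method parameters used above). The function name linear_index is hypothetical, and the sketch assumes that dims and subs have the same length N of at least 1.

```julia
# The body runs once per tuple length N and returns an expression;
# that expression becomes the method body for that N.
@generated function linear_index(dims::NTuple{N,Int}, subs::NTuple{N,Int}) where {N}
    ex = :(subs[$N] - 1)
    for i in (N - 1):-1:1
        ex = :(subs[$i] - 1 + dims[$i] * $ex)
    end
    return :($ex + 1)
end

# Column-major linear index of element (2, 3) in a 3×4 array.
println(linear_index((3, 4), (2, 3)))
```

The loop over dimensions happens at code-generation time, so the generated method is a single unrolled arithmetic expression.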
38. Nonstandard String Literals
Generate an expression from a string.
String ↦ Expr
Denoted as <literalname>"..."
A regular expression literal (e.g. r"^>[^\n]+\n[ACGTN]+") is an example.
In Bio.jl, dna"ACGT" is converted to a DNASequence object.
macro r_str(s)
    Regex(s)
end
# Regex object
r"^>[^\n]+\n[ACGTN]+"
# DNASequence object
dna"ACGT"
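Defining a new literal only requires a macro whose name ends in _str. The nt"..." literal below is a hypothetical toy, not Bio.jl's dna"..." implementation; it validates the string when the macro is expanded and produces a vector of characters.

```julia
macro nt_str(s)
    # This check runs at macro-expansion time, so an invalid literal
    # is rejected before the surrounding code ever executes.
    all(c -> c in "ACGTN", s) || throw(ArgumentError("invalid nucleotide in \"$s\""))
    return :(collect($s))
end

seq = nt"ACGTN"
println(length(seq))
```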
39. Modules
Modules are namespaces.
Names defined directly under a module are global names.
The import/export system lets modules exchange names.
module Foo
export foo, gvar
# function
foo() = println("hello, foo")
bar() = println("hello, bar")
# global variable
const gvar = 42
end
Foo.foo()
Foo.bar()
Foo.gvar
import Foo: foo
foo()
import Foo: bar
bar()
using Foo
foo()
gvar
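The same module as a script for Julia 1.x. Two things differ from the slide and are assumptions of this sketch: a module defined in the current file is referred to with a leading dot (.Foo), and the functions return strings instead of printing so the results are easy to inspect.

```julia
module Foo
export foo, gvar

foo() = "hello, foo"
bar() = "hello, bar"     # not exported

const gvar = 42
end

using .Foo               # exported names (foo, gvar) become available
import .Foo: bar         # an unexported name has to be imported explicitly

println(foo())
println(bar())
println(gvar)
println(Foo.bar())       # qualified access always works
```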
40. Packages
A package manager is bundled with Julia.
No other package manager; this is the standard.
The package manager can build, install, and create packages.
Almost all packages are hosted on GitHub.
Registered packages
Registered packages are public packages that can be installed by
name.
List: http://pkg.julialang.org/
Repository: https://github.com/JuliaLang/METADATA.jl
41. Packages Management
The package manager is accessible from the REPL.
Pkg.update(): update registered package data and upgrade
packages
The way to install a package depends on whether the package is
registered or not.
Pkg.add(<package>): install a registered package
Pkg.clone(<url>): install a package from the git URL
julia> Pkg.update()
julia> Pkg.add("DocOpt")
julia> Pkg.clone("git@github.com:docopt/DocOpt.jl.git")
42. Packages Create a Package
A package template can be generated with Pkg.generate(<package>).
This generates a disciplined scaffold for developing a new package.
Generated packages will be located in ~/.julia/v0.4/.
Pkg.tag(<package>, <version>) tags the current commit of the package
with the version.
This tag is considered a release of the package.
Developers should follow Semantic Versioning.
major: incompatible API changes
minor: backwards-compatible functionality addition
patch: backwards-compatible bug fixes
julia> Pkg.generate("DocOpt")
julia> Pkg.tag("DocOpt", :patch)  # patch update
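The major/minor/patch rules can be illustrated with Julia's built-in VersionNumber type and its v"..." literal. The bump helper is hypothetical; it only mimics what a :major, :minor, or :patch tag means and is not how the package manager is implemented.

```julia
# Return the next version after `v` for the given kind of change.
function bump(v::VersionNumber, part::Symbol)
    if part === :major
        return VersionNumber(v.major + 1, 0, 0)          # incompatible API change
    elseif part === :minor
        return VersionNumber(v.major, v.minor + 1, 0)    # new compatible functionality
    elseif part === :patch
        return VersionNumber(v.major, v.minor, v.patch + 1)  # compatible bug fix
    else
        throw(ArgumentError("unknown version part: $part"))
    end
end

println(bump(v"0.4.2", :patch))
println(bump(v"0.4.2", :minor))
println(bump(v"0.4.2", :major))
```

Versions compare component by component, so v"0.10.0" > v"0.9.9" holds even though a string comparison would say otherwise.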
44. BioJulia
Collaborative project to build bioinformatics infrastructure for Julia.
Packages:
Bio.jl - https://github.com/BioJulia/Bio.jl
Other packages - https://github.com/BioJulia
45. BioJulia Basic Principles
BioJulia will be fast.
All contributions undergo code review.
We'll design it to suit modern bioinformatics and Julia, not just copy
other Bio-projects.
https://github.com/BioJulia/Bio.jl/wiki/roadmap
48. Sequences
Sequence types are defined in the Bio.Seq module:
DNASequence, RNASequence, AminoAcidSequence, Kmer
julia> using Bio.Seq
julia> dna"ACGTN"  # non-standard string literal
5nt DNA Sequence
 ACGTN
julia> rna"ACGUN"
5nt RNA Sequence
 ACGUN
julia> aa"ARNDCWYV"
8aa Sequence:
 ARNDCWYV
julia> kmer(dna"ACGT")
DNA 4-mer:
 ACGT
49. Sequences Packed Nucleotides
A/C/G/T are packed into an array with 2-bit encoding (+1 bit for N).
type NucleotideSequence{T<:Nucleotide} <: Sequence
    data::Vector{UInt64}  # 2-bit encoded sequence
    ns::BitVector         # 'N' mask
    ...
end
In Kmer, nucleotides are packed into a 64-bit type.
bitstype 64 Kmer{T<:Nucleotide,K}
typealias DNAKmer{K} Kmer{DNANucleotide,K}
typealias RNAKmer{K} Kmer{RNANucleotide,K}
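To make the 2-bit packing concrete, here is a minimal sketch that packs up to 32 nucleotides into one UInt64 and unpacks them again. The code assignment (A=0, C=1, G=2, T=3), the bit order, and the function names pack_kmer and unpack_kmer are assumptions of this sketch; Bio.jl's actual encoding may differ.

```julia
# Assumed 2-bit codes; the real library may use a different mapping.
const CODES = Dict('A' => 0x00, 'C' => 0x01, 'G' => 0x02, 'T' => 0x03)
const BASES = ('A', 'C', 'G', 'T')

# Pack a string of A/C/G/T into a UInt64, first base in the highest used bits.
function pack_kmer(s::AbstractString)
    length(s) <= 32 || throw(ArgumentError("at most 32 nucleotides fit in 64 bits"))
    x = UInt64(0)
    for c in s
        x = (x << 2) | CODES[c]
    end
    return x
end

# Recover the k nucleotides stored in x.
function unpack_kmer(x::UInt64, k::Integer)
    chars = Vector{Char}(undef, k)
    for i in k:-1:1
        chars[i] = BASES[(x & 0x03) + 1]   # lowest two bits hold the last base
        x >>= 2
    end
    return String(chars)
end

packed = pack_kmer("ACGT")
println(packed)
println(unpack_kmer(packed, 4))
```

With this layout a k-mer fits in a register, so comparing or hashing two k-mers is a single integer operation.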
50. Sequences Immutable by Convention
Sequences are immutable by convention.
No copy when creating a subsequence from an existing sequence.
julia> seq = dna"ACGTATG"
7nt DNA Sequence
 ACGTATG
julia> seq[2:4]
3nt DNA Sequence
 CGT
# internal data is shared between
# the original and its subsequences
julia> seq.data === seq[2:4].data
true
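One way such sharing could be arranged is to keep a reference to the storage plus the visible range. The SharedSeq type and the text helper below are hypothetical and only sketch the idea; they are not Bio.jl's implementation.

```julia
struct SharedSeq
    data::Vector{UInt8}      # storage, shared between a sequence and its subsequences
    part::UnitRange{Int}     # the window of `data` this sequence covers
end

SharedSeq(s::AbstractString) = SharedSeq(Vector{UInt8}(s), 1:ncodeunits(s))

Base.length(s::SharedSeq) = length(s.part)

# Taking a subsequence only narrows the window; no bytes are copied.
Base.getindex(s::SharedSeq, r::UnitRange{Int}) = SharedSeq(s.data, s.part[r])

# Materialize the visible window as a String (this step does copy).
text(s::SharedSeq) = String(s.data[s.part])

seq = SharedSeq("ACGTATG")
sub = seq[2:4]
println(text(sub))
println(sub.data === seq.data)
```

Sharing is only safe because nobody mutates the storage, which is why immutability by convention matters here.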
51. Intervals
Genomic interval types are defined in the Bio.Intervals module:
Interval{T}: T is the type of metadata attached to the interval.
type Interval{T} <: AbstractInterval{Int64}
    seqname::StringField
    first::Int64
    last::Int64
    strand::Strand
    metadata::T
end
This is useful when annotating a genomic range:
julia> using Bio.Intervals
julia> Interval("chr2", 5692667, 5701385, '+', "SOX11")
chr2:5692667-5701385 + SOX11
52. Intervals Indexed Collections
A set of intervals can be indexed with IntervalCollection:
immutable CDS; gene::ASCIIString; index::Int; end
ivals = IntervalCollection{CDS}()
push!(ivals, Interval("chr6", 156777930, 156779471, '+',
                      CDS("ARID1B", 1)))
push!(ivals, Interval("chr6", 156829227, 156829421, '+',
                      CDS("ARID1B", 2)))
push!(ivals, Interval("chr6", 156901376, 156901525, '+',
                      CDS("ARID1B", 3)))
intersect iterates over intersecting intervals:
julia> query = Interval("chr6", 156829200, 156829300);
julia> for i in intersect(ivals, query)
           println(i)
       end
chr6:156829227-156829421 + CDS("ARID1B",2)
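The overlap test behind such a query can be sketched without Bio.jl. GenomicInterval, overlaps, and intersecting are hypothetical names, the strand field is left out, and a linear scan stands in for whatever index IntervalCollection uses internally. The coordinates are the ones from the slide.

```julia
struct GenomicInterval{T}
    seqname::String
    first::Int
    last::Int
    metadata::T
end

# Two closed intervals overlap when they are on the same sequence
# and neither one ends before the other starts.
overlaps(a::GenomicInterval, b::GenomicInterval) =
    a.seqname == b.seqname && a.first <= b.last && b.first <= a.last

# Linear scan; an indexed collection would avoid visiting every interval.
intersecting(ivals, query) = [iv for iv in ivals if overlaps(iv, query)]

ivals = [
    GenomicInterval("chr6", 156777930, 156779471, ("ARID1B", 1)),
    GenomicInterval("chr6", 156829227, 156829421, ("ARID1B", 2)),
    GenomicInterval("chr6", 156901376, 156901525, ("ARID1B", 3)),
]
query = GenomicInterval("chr6", 156829200, 156829300, nothing)

for iv in intersecting(ivals, query)
    println(iv.seqname, ":", iv.first, "-", iv.last, " ", iv.metadata)
end
```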
53. Parsers
Parsers are generated by the Ragel state machine compiler.
Finite state machines are described in a regular language.
The Ragel compiler generates pure Julia programs.
Actions can be injected into the state transition.
The next Ragel release (v7) will be shipped with the Julia generator.
http://www.colm.net/open-source/ragel/
59. Alignments Speed (1)
Global alignment of titin sequences (human and mouse):
affinegap = AffineGapScoreModel(BLOSUM62, -10, -1)
a = first(open("Q8WZ42.fasta", FASTA)).seq
b = first(open("A2ASS6.fasta", FASTA)).seq
@time aln = pairalign(
    GlobalAlignment(),
    Vector{AminoAcid}(a),
    Vector{AminoAcid}(b),
    affinegap,
)
println(score(aln))
  8.012499 seconds (601.99 k allocations: 1.155 GB, 0.09% gc time)
165611
vs. R (Biostrings):
   user  system elapsed
 14.042   1.233  15.475
60. Alignments Speed (2)
vs. R (Biostrings):
   user  system elapsed
 14.042   1.233  15.475
library(Biostrings, quietly=T)
a = readAAStringSet("Q8WZ42.fasta")[[1]]
b = readAAStringSet("A2ASS6.fasta")[[1]]
t0 = proc.time()
aln = pairwiseAlignment(a, b, type="global",
                        substitutionMatrix="BLOSUM62",
                        gapOpening=10, gapExtension=1)
t1 = proc.time()
print(t1 - t0)
print(score(aln))
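For readers unfamiliar with what both benchmarks compute, here is a score-only sketch of global alignment with affine gaps (the Gotoh recurrences). It is not the Bio.jl implementation. Assumptions of this sketch: a simple match/mismatch score replaces BLOSUM62, a gap of length k scores gap_open + k * gap_extend, and the function name global_affine_score is hypothetical.

```julia
function global_affine_score(a::AbstractString, b::AbstractString;
                             match_score::Int = 1, mismatch_score::Int = -1,
                             gap_open::Int = -10, gap_extend::Int = -1)
    x, y = collect(a), collect(b)
    m, n = length(x), length(y)
    neg = typemin(Int) ÷ 2              # stands in for -infinity without overflowing
    H = fill(neg, m + 1, n + 1)         # best score of any alignment of the prefixes
    E = fill(neg, m + 1, n + 1)         # best score ending with a gap in `a`
    F = fill(neg, m + 1, n + 1)         # best score ending with a gap in `b`
    H[1, 1] = 0
    for j in 1:n                        # first row: one long gap in `a`
        E[1, j + 1] = gap_open + j * gap_extend
        H[1, j + 1] = E[1, j + 1]
    end
    for i in 1:m                        # first column: one long gap in `b`
        F[i + 1, 1] = gap_open + i * gap_extend
        H[i + 1, 1] = F[i + 1, 1]
    end
    for j in 1:n, i in 1:m
        # Either open a new gap or extend the existing one.
        E[i + 1, j + 1] = max(H[i + 1, j] + gap_open + gap_extend, E[i + 1, j] + gap_extend)
        F[i + 1, j + 1] = max(H[i, j + 1] + gap_open + gap_extend, F[i, j + 1] + gap_extend)
        s = x[i] == y[j] ? match_score : mismatch_score
        H[i + 1, j + 1] = max(H[i, j] + s, E[i + 1, j + 1], F[i + 1, j + 1])
    end
    return H[m + 1, n + 1]
end

println(global_affine_score("ACGT", "ACGT"))
println(global_affine_score("ACGT", "AGT"))
```

The three nested matrices use O(mn) memory, which is why aligning two very long protein sequences such as titin is a meaningful stress test.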
61. Indexable Bit Vectors
Bit vectors that support bit counting in constant time.
rank1(bv, i): Count the number of 1 bits within bv[1:i].
rank0(bv, i): Count the number of 0 bits within bv[1:i].
A fundamental data structure when defining other data structures.
WaveletMatrix, a generalization of the indexable bit vector,
depends on this data structure.
'N' nucleotides in a reference sequence can be compressed
using this data structure.
julia> bv = SucVector(bitrand(10_000_000));
julia> rank1(bv, 9_000_000);  # precompile
julia> @time rank1(bv, 9_000_000)
  0.000006 seconds (149 allocations: 10.167 KB)
4502258
62. Indexable Bit Vectors Internals
A bit vector is divided into 256-bit large blocks and each large block is
divided into 64-bit small blocks:
immutable Block
    # large block
    large::UInt32
    # small blocks
    smalls::NTuple{4,UInt8}
    # bit chunks (64 bits × 4 = 256 bits)
    chunks::NTuple{4,UInt64}
end
Each block has a cache that counts the number of 1s.
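A simplified sketch of how cached counts give constant-time rank queries. It keeps one cumulative count per 64-bit chunk instead of the two-level large/small block layout above, so it uses more memory than the real structure. RankVector and its field names are hypothetical.

```julia
struct RankVector
    chunks::Vector{UInt64}   # the bits, 64 per chunk
    cumul::Vector{Int}       # cumul[k] = number of 1 bits in chunks[1:k-1]
    len::Int                 # number of valid bits
end

function RankVector(bits::AbstractVector{Bool})
    n = length(bits)
    chunks = zeros(UInt64, cld(n, 64))
    for (i, b) in enumerate(bits)
        if b
            chunks[((i - 1) >> 6) + 1] |= UInt64(1) << ((i - 1) & 63)
        end
    end
    cumul = zeros(Int, length(chunks) + 1)
    for k in eachindex(chunks)
        cumul[k + 1] = cumul[k] + count_ones(chunks[k])
    end
    return RankVector(chunks, cumul, n)
end

# Number of 1 bits among the first i bits: one table lookup plus one popcount.
function rank1(rv::RankVector, i::Integer)
    0 <= i <= rv.len || throw(BoundsError(rv, i))
    i == 0 && return 0
    k = ((i - 1) >> 6) + 1                     # chunk that holds bit i
    r = ((i - 1) & 63) + 1                     # how many bits of that chunk to count
    mask = r == 64 ? typemax(UInt64) : (UInt64(1) << r) - UInt64(1)
    return rv.cumul[k] + count_ones(rv.chunks[k] & mask)
end

rank0(rv::RankVector, i::Integer) = i - rank1(rv, i)

bits = [isodd(i) || i % 5 == 0 for i in 1:200]
rv = RankVector(bits)
println(rank1(rv, 100))
println(rank0(rv, 100))
```

The two-level layout on the slide serves the same purpose while storing smaller counters (a UInt32 per 256 bits and a UInt8 per 64 bits).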
63. FMIndexes
Index for full-text search.
Fast, compact, and often used in short-read sequence mappers
(Bowtie2, BWA, etc.).
Product of Julia Summer of Code 2015
https://github.com/BioJulia/FMIndexes.jl
This package is not specialized for biological sequences.
FMIndexes.jl does not depend on Bio.jl.
JIT compiler can optimize code for a specific type at runtime.
julia> fmindex = FMIndex(dna"ACGTATTGACTGTA");
julia> count(dna"TA", fmindex)
2
julia> count(dna"TATT", fmindex)
1
64. FMIndexed Queries
Create an FM-Index for chromosome 22:
julia> fmindex = FMIndex(first(open("chr22.fa", FASTA)).seq);
count(pattern, index): count the number of occurrences of pattern:
julia> count(dna"ACGT", fmindex)
37672
julia> count(dna"ACGTACGT", fmindex)
42
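The counting query can be illustrated with a deliberately naive FM-index built by sorting all suffixes. This is a teaching sketch, not FMIndexes.jl: NaiveFMIndex, occ, and count_occurrences are hypothetical names, construction is far too slow for a chromosome, and occ scans the BWT linearly where a real index would use a rank structure.

```julia
struct NaiveFMIndex
    bwt::Vector{UInt8}           # Burrows-Wheeler transform of the text
    less::Dict{UInt8,Int}        # less[c] = number of symbols in the text smaller than c
end

function NaiveFMIndex(text::AbstractString)
    t = vcat(Vector{UInt8}(text), 0x00)        # 0x00 is a sentinel smaller than any symbol
    n = length(t)
    sa = sort!(collect(1:n); by = i -> t[i:end])   # suffix array by direct suffix sorting
    bwt = [t[i == 1 ? n : i - 1] for i in sa]      # symbol preceding each sorted suffix
    counts = Dict{UInt8,Int}()
    for c in t
        counts[c] = get(counts, c, 0) + 1
    end
    less = Dict{UInt8,Int}()
    total = 0
    for c in sort(collect(keys(counts)))
        less[c] = total
        total += counts[c]
    end
    return NaiveFMIndex(bwt, less)
end

# Occurrences of c in bwt[1:i]; a real index answers this with rank queries.
occ(index::NaiveFMIndex, c::UInt8, i::Int) = count(==(c), view(index.bwt, 1:i))

# Backward search: process the pattern from its last symbol to its first,
# narrowing the range of suffixes that start with the processed part.
function count_occurrences(pattern::AbstractString, index::NaiveFMIndex)
    lo, hi = 1, length(index.bwt)
    for c in reverse(Vector{UInt8}(pattern))
        haskey(index.less, c) || return 0
        lo = index.less[c] + occ(index, c, lo - 1) + 1
        hi = index.less[c] + occ(index, c, hi)
        lo > hi && return 0
    end
    return hi - lo + 1
end

index = NaiveFMIndex("ACGTATTGACTGTA")
println(count_occurrences("TA", index))
println(count_occurrences("TATT", index))
```

The number of search steps depends only on the pattern length, not on the text length, which is what makes the approach attractive for short-read mapping.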
69. Julia Updates '15
Julia Computing Inc. was founded.
"Why the creators of the Julia programming language just
launched a startup" -
http://venturebeat.com/2015/05/18/why-the-creators-of-the-julia-programming-language-just-launched-a-startup/
The Moore Foundation granted Julia Computing $600,000.
"Bringing Julia from beta to 1.0 to support data-intensive, scientific
computing" -
https://www.moore.org/newsroom/in-the-news/2015/11/10/bringing-julia-from-beta-to-1.0-to-support-data-intensive-scientific-computing
Multi-threading Support
https://github.com/JuliaLang/julia/pull/13410
Intel released ParallelAccelerator.jl
https://github.com/IntelLabs/ParallelAccelerator.jl