Marcs (bio)perl course
These are the lecture slides for the BITS training session "Introduction to programming in Bioperl".

See for more material:

Published in: Technology

  • Remember to wear it !
  • Not often needed. Why might you need the braces? String interpolation:
    $name = 'Johnny';
    print "$name1"; # => nothing printed
    print "${name}1"; # => 'Johnny1'
  • If there are more variables in the list than elements in the array, the extra variables are assigned the undefined value. If there are fewer variables than array elements, the extra elements are ignored. Distributivity: my ()
  • Comma is operator: flattens (‘concatenates’) lists/arrays
  • No parens needed: comma operators produce list
  • main should have been called ‘our’ ;-) There is no need to use the family name when you are with your family. If you call John for dinner, John will know it’s him and you know who will come. But if your family has visitors from another family and they have a John as well ... Family name + given name = fully qualified variable name
  • Transcript

    • 1. Marc Logghe
    • 2. Perl
      “A script is what you give an actor, but a program is what you give an audience.”
    • 3. Goals
      Perl Positioning System: find your way in the Perl World
      Write Once, Use Many Times
      Object Oriented Perl
      Thou shalt not be afraid of the Bioperl Beast
    • 4.
    • 5. Agenda Day1
      Perl refresher
      Arrays and lists
      Subroutines and functions
      Creating and running a Perl script
      References and advanced data structures
      Packages and modules
      Objects, (multiple) inheritance, polymorphism
    • 6. Agenda Day 2
      What is bioperl ?
      Taming the Bioperl Beast
      Finding modules
      Finding methods
      Sequence processing
      One image says more than 1000 words
    • 7. Variables
      Data of any type may be stored within three basic types of variables:
      Scalar (strings, numbers, references)
      Array (aka list but not quite the same)
      Hash (aka associative array)
      Variable names are always preceded by a “dereferencing symbol” or prefix. If needed: {}
      $ - Scalar variables
      @ - List variables
      % - Associative array aka hash variables
    • 8. Variables
      You do NOT have to
      Declare the variable before using it
      Define the variable’s data type
      Allocate memory for new data values
    • 9. Scalar variables
      Scalar variable stores a string, a number, a character, a reference, undef
      $name, ${name}, ${'name'}
      More magic: $_
    • 10. Array variables
      Array variable stores a list of scalars
      @name, @{name}, @{'name'}
      Map: index => scalar value
      zero-indexed (distance from start)
    • 11. Array variables
      List assignment:
      @count = (1, 2, 3, 4, 5);
      @count = ('apple', 'bat', 'cat');
      @count2 = @count;
      Individual assignment: $count[2] = 42
      Individual access: print $count[2]
      Special variable $#<array name>
      Scalar context
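As a rough illustration of this slide, the sketch below shows individual assignment, the `$#` last-index variable, and an array in scalar context. The data and the variable names `$last_index`, `$size` and `@copy` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical data, not taken from the slides.
my @count = (1, 2, 3, 4, 5);
$count[2] = 42;                 # individual assignment

my $last_index = $#count;       # index of the last element
my $size       = @count;        # array in scalar context: number of elements
my @copy       = @count;        # list assignment copies the elements

$copy[0] = 'changed';           # does not touch @count
print "last index $last_index, size $size, third element $count[2]\n";
# prints: last index 4, size 5, third element 42
```

Because the index counts the distance from the start, `$#count` is always one less than the number of elements.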
    • 12. Array variables
      Access multiple values via array slice:
      print @array[3,2,4,1,0,-1];
      Assign multiple values via array slice:
      @array[3,2,4,1,0,-1] = @new_values;
    • 13. Lists
      List = temporary sequence of comma separated values usually in () or result of qw operator
      Array = container for a list
      Array initialization
      my @array = qw/blood sweat tears/;
      Extract values from array
      ($var1, $var2[42], $var3, @var4) = @args;
      my ($var5, $var6) = @args;
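A small sketch of what happens when the number of variables and the number of array elements differ. The variable names are invented for the example; the behaviour shown is standard list assignment.

```perl
use strict;
use warnings;

my @args = qw/blood sweat tears/;

# More variables than elements: the extras become undef.
my ($first, $second, $third, $fourth) = @args;

# Fewer variables than elements: the remaining elements are ignored.
my ($only) = @args;

# A trailing array soaks up whatever is left.
my ($head, @rest) = @args;

print defined $fourth ? "fourth is set\n" : "fourth is undef\n";
print "head=$head rest=@rest\n";
# prints: fourth is undef
#         head=blood rest=sweat tears
```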
    • 14. Lists
      List flattening
      Remember: each element of list must be a scalar, not another list
      my @vehicles = ('truck', @cars, ('tank','jeep'));
      => NOT hierarchical list of 3 elements
      => FLATTENED list of scalar values
      Individual access and slicing cf. arrays
      my @vehicles = ('truck', @cars, ('tank','jeep'))[2,-1];
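To make the flattening visible, the sketch below counts the elements of the combined list and takes a slice of it. The contents of `@cars` are an assumption; the slides do not say what it holds.

```perl
use strict;
use warnings;

my @cars     = ('sedan', 'coupe');   # hypothetical contents
my @vehicles = ('truck', @cars, ('tank', 'jeep'));

my $n      = @vehicles;              # 5, not 3: the nested lists were flattened
my @picked = ('truck', @cars, ('tank', 'jeep'))[2, -1];   # list slice

print "$n vehicles: @vehicles\n";
print "picked: @picked\n";
# prints: 5 vehicles: truck sedan coupe tank jeep
#         picked: coupe jeep
```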
    • 15. Hash variables
      Hash variables are denoted by the % dereferencing symbol.
      A hash variable is a list of key-value pairs
      Both keys and values must be scalar
      my %fruit_color = ("apple", "red", "banana", "yellow");
      Notice the '=>' aka 'quotifying comma'
      my %fruit_color = (
      apple => "red",
      banana => "yellow",
      );
    • 16. Hash variables
      Individual access: $hash{key}
      Access multiple values via slice:
      @slice = @hash{'key2','key23','key4'};
      Assign multiple values via slice:
      @hash{'key2','key23','key4'} = @new_values;
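A minimal sketch of individual access and hash slices, reusing the `%fruit_color` hash from the previous slide. The extra keys (`cherry`, `kiwi`, `lime`) are made up for the example.

```perl
use strict;
use warnings;

my %fruit_color = (apple => 'red', banana => 'yellow');

$fruit_color{cherry} = 'dark red';                 # individual assignment
my @two = @fruit_color{'apple', 'banana'};         # slice: note the @ prefix
@fruit_color{'kiwi', 'lime'} = ('brown', 'green'); # assign through a slice

my $key_count = keys %fruit_color;                 # keys in scalar context: a count
print "apple is $fruit_color{apple}\n";
print "slice: @two, $key_count keys\n";
# prints: apple is red
#         slice: red yellow, 5 keys
```

A slice returns a list, which is why it takes the `@` prefix even though the container is a hash.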
    • 17. Non data types
      There are several predefined filehandles, including STDIN, STDOUT, STDERR and DATA (default opened).
      No prefix
      Code value aka subroutine
      Dereferencing symbol “&”
    • 18.
    • 19. Subroutines
      We can reuse a segment of Perl code by placing it within a subroutine.
      The subroutine is defined using the sub keyword and a name (= variable name !!).
      The subroutine body is defined by placing code statements within the {} code block symbols.
      sub MySubroutine{
      #Perl code goes here.
      my @args = @_;
      }
    • 20. Subroutines
      To call a subroutine, prepend the name with the & symbol:
      &MySubroutine; # no argument list: the caller's current @_ is passed along
      MySubroutine(); # with or w/o arguments
    • 21. Subroutines
      Arguments in underscore array variable (@_)
      List flattening !!
      my @results = MySubroutine(@arg1, 'arg2', ('arg3', 'arg4'));
      sub MySubroutine{
      #Perl code goes here.
      my ($thingy, @args) = @_;
      }
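The sketch below shows how the arguments arrive flattened in `@_`. The subroutine `first_and_count` and its data are hypothetical, chosen only to make the flattening countable.

```perl
use strict;
use warnings;

# Hypothetical subroutine: returns its first argument and the number of remaining ones.
sub first_and_count {
    my ($thingy, @args) = @_;
    return ($thingy, scalar @args);
}

my @arg1 = ('a', 'b');
my ($first, $rest_count) = first_and_count(@arg1, 'arg2', ('arg3', 'arg4'));
print "first=$first, remaining=$rest_count\n";
# prints: first=a, remaining=4
```

The subroutine cannot tell where `@arg1` ended and `'arg2'` began; passing array references instead would preserve that boundary.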
    • 22. Subroutines
      Return value
      Scalar value
      List value
      Return value
      Explicit with return function
      Implicit: value of the last statement
      sub MySubroutine{
      #Perl code goes here.
      my ($thingy, @args) = @_;
      }
    • 23. Subroutines
      Calling contexts
      wantarray function
      Void => undef
      Scalar => false (but defined)
      List => true
      my $num = getFiles($dir);
      my @files = getFiles($dir);
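A sketch of a context-aware subroutine in the spirit of `getFiles`. To stay self-contained it works on a fixed list rather than a directory, so `get_items` and its file names are assumptions, not the course's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for getFiles: works on a fixed list instead of a directory.
sub get_items {
    my @items = qw(a.txt b.txt c.txt);
    return unless defined wantarray;      # void context: nothing to return
    return wantarray ? @items             # list context: the items themselves
                     : scalar @items;     # scalar context: how many there are
}

my $num   = get_items();
my @files = get_items();
print "$num items: @files\n";
# prints: 3 items: a.txt b.txt c.txt
```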
    • 24. Functions and operators
      Built-in routines
      Arguments at right hand side
      Sensible name (defined, open, print, ...)
    • 25. Functions
      Perl provides a rich set of built-in functions to help you perform common tasks.
      Several categories of useful built-in function include
      Arithmetic functions (sqrt, sin, … )
      List functions (push, chomp, … )
      String functions (length, substr, … )
      Existence functions (defined, undef)
    • 26. Array functions
      Array as queue: push/shift (FIFO)
      Array as stack: push/pop (LIFO)
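The two access patterns side by side, as a small sketch with made-up elements:

```perl
use strict;
use warnings;

my @queue;
push @queue, 'first', 'second', 'third';
my $served = shift @queue;     # FIFO: removes from the front

my @stack;
push @stack, 'first', 'second', 'third';
my $popped = pop @stack;       # LIFO: removes from the end

print "queue served $served, stack popped $popped\n";
# prints: queue served first, stack popped third
```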
    • 27. List functions
      chomp: remove newline from every element in the list
      map: kind of loop without escape, every element ($_) is ‘processed’
      grep: kind of filter
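A short sketch of the three list functions on some invented sequence lines. The 3-character cut-off is arbitrary.

```perl
use strict;
use warnings;

my @lines = ("ATGC\n", "GG\n", "TTAACC\n");    # hypothetical input lines
chomp @lines;                                   # strips the trailing newline from each element

my @lengths = map  { length($_) } @lines;       # one result per element
my @long    = grep { length($_) > 3 } @lines;   # keeps elements for which the block is true

print "lengths: @lengths\n";
print "long: @long\n";
# prints: lengths: 4 2 6
#         long: ATGC TTAACC
```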
    • 28. Hash functions
      keys: returns the hash keys in random order
      values: returns values of the hash in random order but same order as keys function call
      each: returns (key, value) pairs
      delete: remove a particular key (and associated value) from a hash
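A sketch of `keys`, `each` and `delete` on a small hash with invented contents. Because the order of keys is not predictable, the example sorts before printing.

```perl
use strict;
use warnings;

my %fruit_color = (apple => 'red', banana => 'yellow', lime => 'green');

delete $fruit_color{banana};              # removes the key and its value

my @names = sort keys %fruit_color;       # sorted, because key order is not predictable
my @pairs;
while (my ($fruit, $color) = each %fruit_color) {
    push @pairs, "$fruit=$color";
}
@pairs = sort @pairs;

print "keys: @names\n";
print "pairs: @pairs\n";
# prints: keys: apple lime
#         pairs: apple=red lime=green
```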
    • 29. Operators
      Complex and subtle (=,<>, <=>, ?:, ->,=>,...)
      Symbolic name (+,<,>,&,!, ...)
    • 30. Operators
      Calling context
      E.g. assignment operator '='
      ($item1, $item2) = @array;
      $item1 = @array;
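The difference between the two assignments can be sketched like this; the array contents and the names `$count` and `$first` are made up for the example.

```perl
use strict;
use warnings;

my @array = ('x', 'y', 'z');

my ($item1, $item2) = @array;   # list context: first two elements
my $count           = @array;   # scalar context: number of elements
my ($first)         = @array;   # parentheses force list context: first element

print "$item1 $item2 / $count / $first\n";
# prints: x y / 3 / x
```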
    • 31.
    • 32. perldoc
      Access to Perl’s documentation system
      Command line
      Documentation overview: perldoc perl
      Function info:
      perldoc perlfunc
      perldoc -f <function name>
      Operator info: perldoc perlop
      Searching the FAQs: perldoc -q <FAQ keyword>
    • 33. perldoc
      Looking up module info
      Documentation: perldoc <module>
      Installation path: perldoc -l <module>
      Source: perldoc -m <module>
      All installed modules: perldoc -q installed
    • 34. perldoc
      Move around
    • 35. perldoc
    • 36. Creating a script
      Text editor (vi, textpad, notepad++, ...)
      IDE (Komodo, Eclipse, EMACS, Geany, ...)
      Shebang (not for modules)
    • 37. Executing a script
      Command line
      .pl extension
      Shebang line
      chmod +x script
    • 38. Executing a script
      Geany IDE
    • 39.
    • 40. References (and referents)
      A reference is a special scalar value which “refers to” or “points to” any value.
      A variable name is one kind of reference that you are already familiar with. It’s a given name.
      Reference is a kind of private, internal, computer generated name
      A referent is the value that the reference is pointing to
    • 41. Creating References
      Method 1: references to variables are created by using the backslash (\) operator.
      $name = 'bioperl';
      $reference = \$name;
      $array_reference = \@array_name;
      $hash_reference = \%hash_name;
      $subroutine_ref = \&sub_name;
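A runnable sketch of the backslash operator on each kind of variable. The values and the body of `sub_name` are assumptions added so the example can be executed.

```perl
use strict;
use warnings;

my $name       = 'bioperl';
my @array_name = (1, 2, 3);
my %hash_name  = (one => 1);
sub sub_name { return "called with @_" }   # hypothetical body

my $reference       = \$name;
my $array_reference = \@array_name;
my $hash_reference  = \%hash_name;
my $subroutine_ref  = \&sub_name;

$$reference = 'BioPerl';                   # changes $name through the reference
push @{$array_reference}, 4;               # changes @array_name through the reference

my $result = $subroutine_ref->('x');
print "$name, ", scalar(@array_name), " elements, $result\n";
# prints: BioPerl, 4 elements, called with x
```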
    • 42. Creating References
      Method 2:
      [ ITEMS ] makes a new, anonymous array and returns a reference to that array.
      { ITEMS } makes a new, anonymous hash, and returns a reference to that hash
      my $array_ref = [ 1, 'foo', undef, 13 ];
      my $hash_ref = {one => 1, two => 2};
    • 43. Dereferencing a Reference
      Use the appropriate dereferencing symbol
      Scalar: $
      Array: @
      Hash: %
      Subroutine: &
    • 44. Dereferencing a Reference
      Remember $name, ${'name'} ?
      Means: give me the scalar value that the variable 'name' points to.
      A reference $reference is a name, so $$reference, ${$reference}
      Means: give me the scalar value that the reference $reference points to
    • 45. Dereferencing a Reference
      The arrow operator: ->
      Arrays and hashes
      my $array_ref = [ 1, 'foo', undef, 13 ];
      my $hash_ref = {one => 1, two => 2};
      ${$array_ref}[1] = ${$hash_ref}{'two'};
      # can be written as:
      $array_ref->[1] = $hash_ref->{two};
      Subroutines
      &{$sub_ref}($arg1, $arg2);
      # can be written as:
      $sub_ref->($arg1, $arg2);
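The equivalence of the block form and the arrow form can be checked with a small sketch. The anonymous subroutine behind `$sub_ref` is a made-up example.

```perl
use strict;
use warnings;

my $array_ref = [ 1, 'foo', undef, 13 ];
my $hash_ref  = { one => 1, two => 2 };
my $sub_ref   = sub { my ($x, $y) = @_; return $x + $y };   # hypothetical subroutine

my $explicit = ${$array_ref}[1];        # block form
my $arrow    = $array_ref->[1];         # same element, arrow form
my $value    = $hash_ref->{two};
my $sum      = $sub_ref->(2, 3);        # same as &{$sub_ref}(2, 3)

print "$explicit $arrow $value $sum\n";
# prints: foo foo 2 5
```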
    • 46. Identifying a referent
      ref function
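As a brief sketch, `ref` returns the type of the referent as a string and an empty string for anything that is not a reference. The sample values are invented.

```perl
use strict;
use warnings;

my @kinds = map { ref($_) } ([1, 2], { a => 1 }, sub { 1 }, \'text', 'plain string');
my $report = join ',', map { $_ eq '' ? '(not a reference)' : $_ } @kinds;
print "$report\n";
# prints: ARRAY,HASH,CODE,SCALAR,(not a reference)
```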
    • 47. References
      Why do we need references ???
      Create complex data structures
      !! Arrays and hashes can only store scalar values
      Pass arrays, hashes, subroutines, ... as arguments to subroutines and functions
      !! List flattening
    • 48. Complex data structures
      Reference is a scalar value
      Arrays and hashes are sets of scalar values
      my $array_ref = [ 1, 2, 3 ];
      my $hash_ref = {one => 1, two => 2};
      my %data = ( arrayref => $array_ref,
      hash_ref => $hash_ref);
      In one go:
      my %data = ( arrayref => [ 1, 2, 3 ],
      hash_ref => {one => 1, two => 2});
    • 49. Complex data structures
      Individual access
      my %data = ( arrayref => [ 1, 2, 3 ],
      hash_ref => {one => 1,
      two => ['a','b']});
      How to access this value ?
      my $wanted_value = $data{hash_ref}->{two}->[1];
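The nested structure from this slide, as a runnable sketch. `$same_value` and `$last_number` are extra names added to show the shorter subscript form and a negative index.

```perl
use strict;
use warnings;

my %data = (
    arrayref => [ 1, 2, 3 ],
    hash_ref => { one => 1, two => [ 'a', 'b' ] },
);

my $wanted_value = $data{hash_ref}->{two}->[1];   # 'b'
my $same_value   = $data{hash_ref}{two}[1];       # arrows between subscripts are optional
my $last_number  = $data{arrayref}[-1];           # 3

print "$wanted_value $same_value $last_number\n";
# prints: b b 3
```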
    • 50. Complex data structures
      my @row1 = (1..3);
      my @row2 = (2,4,6);
      my @row3 = (3,6,9);
      my @rows = (\@row1,\@row2,\@row3);
      my $table = \@rows;
    • 51. Complex data structures
      my $table = [
      [1, 2, 3],
      [2, 4, 6],
      [3, 6, 9]
      ];
    • 52. Complex data structures
      Individual access
      my $wanted_value = $table->[1]->[2];
      # shorter form:
      $wanted_value = $table->[1][2]
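A sketch of working with the table of tables: one cell is read directly, then each row is dereferenced in a loop. The row sums are an addition for illustration only.

```perl
use strict;
use warnings;

my $table = [
    [1, 2, 3],
    [2, 4, 6],
    [3, 6, 9],
];

my $wanted_value = $table->[1][2];          # row 1, column 2
my @row_sums;
for my $row (@{$table}) {                   # each $row is itself an array reference
    my $sum = 0;
    $sum += $_ for @{$row};
    push @row_sums, $sum;
}
print "value $wanted_value, row sums @row_sums\n";
# prints: value 6, row sums 6 12 18
```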
    • 53. Packages and modules
      2 types of variables:
      Global aka package variables
      Lexical variables
    • 54. Packages and modules
      Global / package variables
      Visible everywhere in every program
      You get them if you don’t say otherwise
      !! Autovivification
      Name has 2 parts: family name + given name
      Default family name is ‘main’. $John is actually $main::John
      $Cleese::John has nothing to do with $Wayne::John
      Family name = package name
      $var1 = 42;
      print "$var1, ", ++$var2;
      # results in:
      42, 1
    • 55.
    • 56. Packages and modules
      Lexical / private variables
      Explicitly declared with my:
      my $var1 = 42;
      Only visible within the boundaries of a code block or file.
      They cease to exist as soon as the program leaves the code block or the program ends
      They do not have a family name aka they do not belong to a package
      (except for subroutines ...)
      use strict;
      my $var1 = 42;
    • 57. Packages
      Family where the (global!) variables (incl. subroutines) live (remember $John)
      In general, a namespace is a container that provides context for the identifiers (variable names) it holds, and allows the disambiguation of homonym identifiers residing in different namespaces.
    • 58. Packages
      Family has a:
      name, defined via package declaration
      House, block or blocks of code that follow the package declaration
      package Bio::SeqIO::genbank;
      # welcome to the Bio::SeqIO::genbank family
      sub write_seq{}
      package Bio::SeqIO::fasta;
      # welcome to the Bio::SeqIO::fasta family
      sub write_seq{}
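The family analogy can be tried out with two small packages. `Cleese` and `Wayne` come from the earlier slide; the `greet` subroutine and the values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical packages modelled on the family analogy; not part of Bioperl.
package Cleese;
our $John = 'actor';
sub greet { return 'Hello from ' . __PACKAGE__ }

package Wayne;
our $John = 'cowboy';
sub greet { return 'Hello from ' . __PACKAGE__ }

package main;
print "$Cleese::John / $Wayne::John\n";
print join(' / ', Cleese::greet(), Wayne::greet()), "\n";
# prints: actor / cowboy
#         Hello from Cleese / Hello from Wayne
```

Both packages define `$John` and `greet` without clashing, because the fully qualified names differ.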
    • 59. Packages
      Why do we need packages ???
      To organize code
      To improve maintainability
      To avoid name space collisions
    • 60. Modules
      What ?
      A text file (with a .pm suffix) containing Perl source code, that can contain any number of namespaces. It must evaluate to a true value.
      At compile time: use <module>
      At run time: require <expr>
      <expr> and <module>: the compiler translates each double-colon '::' into a path separator and appends '.pm'.
      E.g. Data::Dumper yields Data/
      use Data::Dumper;
      require Data::Dumper;
      require ‘’;
      require $class;
    • 61. Modules
      A module can contain multiple packages, but convention dictates that each module contains a package of the same name.
      easy to quickly locate the code in any given package (perldoc -m <module>)
      not obligatory !!
      A module name is unique
      1 to 1 mapping to file system !!
      Should start with capital letter
    • 62. Module files
      Module files are stored in a subdirectory hierarchy that parallels the module name hierarchy.
      All module files must have an extension of .pm.
    • 63. Modules
      Module path is relative. So, where is Perl searching for that module ?
      Possible modules roots
      []$ perldoc -V

    • 64. Modules
      Alternative module roots (perldoc -q library)
      In script
      use lib '/my/alternative/module/path';
      Command line
      []$ perl -I/my/alternative/module/path
      export PERL5LIB=$PERL5LIB:/my/alternative/module/path
    • 65. Modules
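      # module file (loaded below via use Test::Speak)
      package My::Package::Says::Hello;
      sub speak {
      print __PACKAGE__, " says: 'Hello'\n";
      }
      package My::Package::Says::Blah;
      sub speak {
      print __PACKAGE__, " says: 'Blah'\n";
      }
      1; # a module file must end with a true value
      # script
      use strict;
      use Test::Speak;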
    • 66. Modules
      Why do we need modules???
      To organize packages into files/folders
      Code reuse (ne copy & paste !)
      Module repository: CPAN
      Special module that influences the code (compilation)
      Lexically scoped
    • 67. Modules
      Module information
      In standard distribution: perldoc perlmodlib
      Manually installed: perldoc perllocal
      All modules: perldoc -q installed
      Documentation: perldoc <module name>
      Location: perldoc -l <module name>
      Source: perldoc -m <module name>
    • 68. Packages and Modules - Summary
      A package is a separate namespace within Perl code.
      A module can have more than one package defined within it.
      The default package is main.
      We can get to variables (and subroutines) within packages by using the fully qualified name
      To write a package, just write package <package name> where you want the package to start.
      Package declarations last until the end of the enclosing block, file or until the next package statement
      The require and use keywords can be used to import the contents of other files for use in a program.
      Files which are included must end with a true value.
      Perl looks for modules in a list of directories stored in @INC
      Module names map to the file system
    • 69.
    • 70. Exercises
      Bioperl Training Exercise 1: perldoc
      Bioperl Training Exercise 2: thou shalt not forget
      Bioperl Training Exercise 3: arrays
      Bioperl Training Exercise 4: hashes
      Bioperl Training Exercise 5: packages and modules 1
      Bioperl Training Exercise 6: packages and modules 2
      Bioperl Training Exercise 7: complex data structures
    • 71.
    • 72. Object Oriented Programming in Perl
      Why do we need objects and OOP ?
      It’s fun
      Code reuse
    • 73. Object Oriented Programming in Perl
      What is an object ?
      An object is a (complex) data structure representing a new, user defined type with a collection of behaviors (functions aka methods)
      Collection of attributes
      Developer’s perspective: 3 little make rules
      To create a class, build a package
      To create a method, write a subroutine
      To create an object, bless a referent
    • 74. Rule 1: To create a class, build a package
      Defining a class
      A class is simply a package with subroutines that function as methods. Class name = type = label = namespace
      package Cat;
    • 75. Rule 2: To create a method, write a subroutine
      First argument of methods is always class name or object itself (or rather: reference)
      Subroutine call the OO way (method invocation arrow operator)
      package Cat;
      sub meow {
      my $self = shift;
      print __PACKAGE__, " says: meow !\n";
      }
    • 76. Rule 3: To create an object, bless a referent
      ‘Special’ method: constructor
      Any name will do, in most cases new
      Object can be anything, in most cases hash
      Reference to object is stored in variable
      Arguments: reference (+ class). Does not change !!
      Underlying referent is blessed (= typed, labelled)
      Returns reference
      package Cat;
      sub new {
      my ($class, @args) = @_;
      my $self = { _name => $args[0] };
      bless $self, $class;
      }
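Putting the three rules together gives a complete, if tiny, class. This is a sketch built on the `Cat` package from the slides; the `name` accessor and the cat's name are assumptions added for the example.

```perl
use strict;
use warnings;

package Cat;                        # rule 1: a class is a package

sub new {                           # rule 3: the constructor blesses a referent
    my ($class, @args) = @_;
    my $self = { _name => $args[0] };
    return bless $self, $class;     # bless labels the hash with the class name
}

sub name { my $self = shift; return $self->{_name} }   # hypothetical accessor

sub meow {                          # rule 2: a method is a subroutine
    my $self = shift;
    return $self->name . ' says: meow !';
}

package main;

my $cat  = Cat->new('Tom');         # 'Tom' is a hypothetical name
my $line = ref($cat) . ': ' . $cat->meow;
print "$line\n";
# prints: Cat: Tom says: meow !
```

The arrow call `Cat->new('Tom')` passes the class name as the first argument, and `$cat->meow` passes the object reference, which is why both methods start by reading it from `@_`.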
    • 77. Objects
      Perl objects are data structures ( a collection of attributes).
      To create an object we have to take 3 rules into account:
      Classes are just packages
      Methods are just subroutines
      Blessing a referent creates an object
    • 78. Objects
      Objects are passed around as references
      Calling an object method can be done using the method invocation arrow:
      Constructor functions in Perl are conventionally called new() and can be called by writing:
      $object_ref = ClassName->new()
    • 79. Inheritance
      Way to extend functionality of a class by deriving a (more specific) sub-class from it
      In Perl:
      Way of specifying where to look for methods
      store the name of 1 or more classes in the package variable @ISA
      Multiple inheritance !!
      package NorthAmericanCat;
      use Cat;
      @ISA = qw(Cat);
      package NorthAmericanCat;
      use Cat;
      use Animal;
      @ISA = qw(Cat Animal);
    • 80. Inheritance
      UNIVERSAL, parent of all classes
      Predefined methods
      isa(‘<class name>’): check if the object inherits from a particular class
      can(‘<method name>’): check if <method name> is a callable method
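A sketch of single inheritance through `@ISA`, together with the `isa` and `can` methods inherited from UNIVERSAL. Both classes live in one file here, so no `use Cat;` is needed; the method bodies and `purr` are made up for the example.

```perl
use strict;
use warnings;

package Cat;
sub new  { my ($class, %args) = @_; return bless { %args }, $class }
sub meow { return 'meow' }

package NorthAmericanCat;
our @ISA = ('Cat');                 # method lookup falls back to Cat
sub purr { return 'purr' }          # hypothetical extra method

package main;

my $cat = NorthAmericanCat->new(name => 'Tom');   # new() is inherited from Cat
my @report = (
    $cat->meow,                                    # inherited method
    $cat->purr,                                    # own method
    $cat->isa('Cat')  ? 'isa Cat'  : 'not a Cat',
    $cat->can('bark') ? 'can bark' : 'cannot bark',
);
print join(', ', @report), "\n";
# prints: meow, purr, isa Cat, cannot bark
```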
    • 81. Inheritance
      SUPER: superclass of the current package
      start looking in @ISA for a class that can() do_something
      explicitly call a method of a parental class
      often used by Bioperl to initialize object attributes
    • 82. Polymorphism
      methods defined in the derived class will override methods defined in the parent classes
      same method has different behaviours
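Overriding and `SUPER::` can be sketched with two hypothetical classes. `Animal` and the return strings are invented; only the mechanism is taken from the slides.

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class) = @_; return bless {}, $class }
sub speak { return 'some sound' }

package Cat;
our @ISA = ('Animal');
sub speak {                         # overrides Animal::speak
    my $self = shift;
    # SUPER:: resumes the lookup in the @ISA of the current package (Cat).
    return 'meow (instead of ' . $self->SUPER::speak() . ')';
}

package main;

my @sounds = map { $_->speak } (Animal->new, Cat->new);
print join(' | ', @sounds), "\n";
# prints: some sound | meow (instead of some sound)
```

The same call, `$_->speak`, behaves differently depending on the class the object was blessed into.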
    • 83.
    • 84. Exercises
      Bioperl Training Exercise 8: OOP
      Bioperl Training Exercise 9: inheritance, polymorphism
      Bioperl Training Exercise 10: aggregation, delegation