Hashes and @ARGV
Paolo Marcatili - Programmazione 08-09

Agenda
Today we will see:
Summary of what we’ve done
Hashes
Keys

Task Today

Parsing
Parse a GO file
Extract gene names and functions

Hashes

Hashes
Hashes are like arrays: they store collections of scalars. Unlike arrays, they are indexed by name (just like in real life!).
Each hash entry has two components:
Key (example: name)
Value (example: phone number)
Hashes are denoted with %. Example: %phoneDirectory
Elements are accessed using {} (like [] in arrays).

Hashes continued ...
Adding a new key-value pair:
    $phoneDirectory{"Shirly"} = 7267975;
Note the $: a single hash element is a scalar!
Each key can have only one value:
    $phoneDirectory{"Shirly"} = 7265797;   # overwrites previous assignment
Multiple keys can have the same value.
Accessing the value of a key:
    $phoneNumber = $phoneDirectory{"Shirly"};

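The operations on this slide can be tried together in one short script. This is only a minimal sketch: the second entry ("Paolo" and his number) is made up for illustration and does not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical phone directory; the "Paolo" entry is illustrative only.
my %phoneDirectory;

# Add two key-value pairs.
$phoneDirectory{"Shirly"} = 7267975;
$phoneDirectory{"Paolo"}  = 5550100;

# Assigning to an existing key overwrites the previous value.
$phoneDirectory{"Shirly"} = 7265797;

# Access one value: the $ sigil is used because one element is a scalar.
my $phoneNumber = $phoneDirectory{"Shirly"};
print "Shirly: $phoneNumber\n";

# In scalar context, keys gives the number of entries in the hash.
my $entries = scalar(keys %phoneDirectory);
print "Entries: $entries\n";
```

Because "Shirly" was assigned twice, the hash still holds only two entries, and the second number is the one that is kept.
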
Hashes and Foreach
Foreach works with hashes as well!
    foreach $person (keys(%phoneDirectory)) {
        print "$person: $phoneDirectory{$person}\n";
    }
Never depend on the order in which you put key/value pairs into the hash: keys returns them in no particular order. Perl has its own magic to make hashes amazingly fast!!

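As a small sketch of hash iteration, the script below walks the same hash twice: once with keys, as on the slide, and once with each, a built-in that the slides do not cover and that returns one (key, value) pair per call. The directory contents are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical directory contents, for illustration only.
my %phoneDirectory = (Shirly => 7265797, Paolo => 5550100, Anna => 5550123);

# keys returns every key, in no particular order.
foreach my $person (keys %phoneDirectory) {
    print "$person: $phoneDirectory{$person}\n";
}

# each returns one (key, value) pair per call until the hash is exhausted.
my %copy;
while (my ($person, $number) = each %phoneDirectory) {
    $copy{$person} = $number;
}
print "Copied ", scalar(keys %copy), " entries\n";
```

The order of the printed lines may differ from one run to the next, which is exactly why the order should never be relied upon.
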
Hashes and Sorting
The sort function works with hashes as well.
Sorting on the keys:
    foreach $person (sort keys %phoneDirectory) {
        print "$person : $phoneDirectory{$person}\n";
    }
This prints the phoneDirectory hash in alphabetical order of the person's name, i.e. the key.

Hashes and Sorting cont...
Sorting by value:
    foreach $person (sort { $phoneDirectory{$a} <=> $phoneDirectory{$b} } keys %phoneDirectory) {
        print "$person : $phoneDirectory{$person}\n";
    }
This prints each person and their phone number in numerical order of the phone numbers, i.e. the value.

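The two sorting styles can be compared side by side. The sketch below uses a made-up directory and also adds a descending sort (swapping $a and $b), which the slides do not show.

```perl
use strict;
use warnings;

# Hypothetical directory contents, for illustration only.
my %phoneDirectory = (Shirly => 7265797, Paolo => 5550100, Anna => 5550123);

# Alphabetical order of the keys.
my @byName = sort keys %phoneDirectory;

# Numerical order of the values (smallest number first).
my @byNumber = sort { $phoneDirectory{$a} <=> $phoneDirectory{$b} } keys %phoneDirectory;

# Swapping $a and $b reverses the order (largest number first).
my @byNumberDesc = sort { $phoneDirectory{$b} <=> $phoneDirectory{$a} } keys %phoneDirectory;

print "By name:        @byName\n";
print "By number:      @byNumber\n";
print "By number desc: @byNumberDesc\n";
```

Note that sort always receives the list of keys; the block only decides how two keys are compared.
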
Exercise
Choose your own text or use
    wget "http://www.quirinale.it/costituzione/costituzione.htm"
Identify the 10 most frequent words.
Identify the 10 most frequent words longer than 5 letters.

Counting Words
    my %seen;
    # F is a filehandle already opened on the text file
    while (my $l = <F>) {
        my @w = split(/\s+/, $l);   # split is a new function: it breaks the line into words
        foreach my $word (@w) {
            $word =~ s/[sx]$//;     # crude plural removal
            $seen{$word}++;
        }
    }
    # print the words in ascending order of count (most frequent last)
    foreach my $word (sort { $seen{$a} <=> $seen{$b} } keys %seen) {
        print "Word $word N: $seen{$word}\n";
    }

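One possible way to approach the exercise is sketched below. The helper name top_words, its parameters, and the sample sentence are all hypothetical. The sketch also lowercases the text and strips everything except ASCII letters, a simplification that would mangle accented Italian words.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $limit [word, count] pairs, most frequent
# first, keeping only words longer than $minLength letters.
sub top_words {
    my ($text, $limit, $minLength) = @_;
    my %seen;
    foreach my $word (split /\s+/, lc $text) {
        $word =~ s/[^a-z]//g;                 # ASSUMPTION: ASCII letters only
        next if length($word) <= $minLength;  # also skips empty strings
        $seen{$word}++;
    }
    # Descending by count; ties are broken alphabetically for stable output.
    my @ranked = sort { $seen{$b} <=> $seen{$a} || $a cmp $b } keys %seen;
    splice(@ranked, $limit) if @ranked > $limit;
    return map { [$_, $seen{$_}] } @ranked;
}

# Made-up sample text standing in for the downloaded file.
my $sample = "the cat and the dog and the bird saw another bird";
foreach my $pair (top_words($sample, 3, 0)) {
    print "Word $pair->[0] N: $pair->[1]\n";
}
```

For the second part of the exercise the same helper would be called as top_words($text, 10, 5).
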
@ARGV

Command Line Arguments
Command line arguments in Perl are extremely easy. @ARGV is the array that holds all arguments passed in from the command line.
Example:
    % ./prog.pl arg1 arg2 arg3
@ARGV would contain ('arg1', 'arg2', 'arg3').
$#ARGV is the index of the last argument, i.e. the number of arguments minus one (2 in this example). Remember: $#array is the last index of the array, not its size; use scalar(@array) for the size.

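The difference between the number of arguments and the last index is easy to check with a short sketch. The helper describe_args is hypothetical; it takes a plain list so that it can be tried with or without real command line arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: report the size and the last index of an argument list.
sub describe_args {
    my @args  = @_;
    my $count = scalar(@args);   # number of elements
    my $last  = $#args;          # index of the last element (-1 if empty)
    return "count=$count last_index=$last";
}

# With the real command line arguments:
print describe_args(@ARGV), "\n";

# With the three arguments from the example above:
print describe_args('arg1', 'arg2', 'arg3'), "\n";
```

For three arguments the count is 3 while the last index is 2.
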
Quick Program with @ARGV
A simple program called log.pl that takes a number and prints its log base 2:
    #!/usr/local/bin/perl -w
    $log = log($ARGV[0]) / log(2);
    print "The log base 2 of $ARGV[0] is $log.\n";
Run the program as follows:
    % log.pl 8
This prints:
    The log base 2 of 8 is 3.

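The slide's program assumes that a valid number is always supplied. A slightly more defensive variant might look like the sketch below; the helper name log2 and the fallback value of 8 are assumptions for illustration, and looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: log base 2 of a positive number, dying on bad input.
sub log2 {
    my ($n) = @_;
    die "log2 needs a positive number\n"
        unless defined $n && looks_like_number($n) && $n > 0;
    return log($n) / log(2);
}

# ASSUMPTION: fall back to 8 when no argument is given, so the sketch
# also runs without a command line.
my $number = @ARGV ? $ARGV[0] : 8;
printf "The log base 2 of %s is %g.\n", $number, log2($number);
```

Checking the argument first avoids the "Can't take log of 0" error that Perl raises when the argument is missing or not numeric.
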
$_
Perl's default scalar variable, used when a variable is not explicitly specified.
It can be used in:
For loops
File handling
Regular expressions

$_ and For Loops
Example using $_ in a for loop:
    @array = ("Perl", "C", "Java");
    for (@array) {
        print $_ . " is a language I know.\n";
    }
Output:
    Perl is a language I know.
    C is a language I know.
    Java is a language I know.

$_ and File Handlers
Example using $_ when reading a file:
    while (<>) {
        chomp $_;                 # remove the newline character
        @array = split ' ', $_;   # split the line on white space
                                  # and store the fields in an array
    }
Note: each line read from the file is automatically stored in the default scalar variable $_.

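The three uses of $_ (for loops, file handling, regular expressions) can be seen together in the sketch below. To keep it self-contained, the file is simulated with an in-memory filehandle, and its contents are made up for illustration.

```perl
use strict;
use warnings;

# For loop: each element is aliased to $_ in turn.
my @languages = ("Perl", "C", "Java");
my @sentences;
for (@languages) {
    push @sentences, "$_ is a language I know.";
}
print "$_\n" for @sentences;

# File handling: an in-memory filehandle stands in for a real file.
# The contents are hypothetical.
my $text = "gene1 kinase\ngene2 transporter\n#comment line\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my @firstFields;
while (<$fh>) {              # each line is stored in $_
    chomp;                   # chomp works on $_ by default
    next if /^#/;            # a regex match tests $_ by default
    my @fields = split;      # split with no arguments splits $_ on white space
    push @firstFields, $fields[0];
}
close($fh);
print "First fields: @firstFields\n";
```

Writing chomp, the match, and split without an explicit variable is equivalent to writing chomp $_, $_ =~ /^#/, and split ' ', $_.
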
Opendir, readdir

Opendir & readdir
Just like open, but for directories.
    # load all entries of the directory given as first argument into the @files array
    opendir(DIR, "$ARGV[0]") or die "Cannot open $ARGV[0]: $!";
    @files = readdir(DIR);
    closedir(DIR);
    # build an unordered HTML list from the @files array:
    print "<ul>";
    foreach $file (@files) {
        next if ($file eq "." or $file eq "..");
        print "<li><a href=\"$file\">$file</a></li>";
    }
    print "</ul>";

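The same idea can be wrapped in a small reusable function. This is only a sketch: the helper name list_entries and the fallback to the current directory are assumptions, and file names are not HTML-escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted entries of a directory,
# without the special entries "." and "..".
sub list_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    @entries = sort @entries;
    return @entries;
}

# ASSUMPTION: use the current directory when no argument is given.
my $dir = @ARGV ? $ARGV[0] : '.';
print "<ul>\n";
foreach my $file (list_entries($dir)) {
    print "<li><a href=\"$file\">$file</a></li>\n";
}
print "</ul>\n";
```

A lexical directory handle (my $dh) is used here instead of the bareword DIR, so the handle is not shared with the rest of the program.
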
Task

Malaria protein annotations
We'll write a script that reports all genes with a function similar to that of our query gene.
http://www.biocomputing.it/master_hashes

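A rough sketch of how such a script could use a hash is given below. Everything about the data is an assumption: the gene names, the functions, and the one "gene<TAB>function" pair per line layout are invented for illustration and are not the format of the course files. "Similar" is simplified here to "exactly the same function string".

```perl
use strict;
use warnings;

# HYPOTHETICAL annotations: one "gene<TAB>function" pair per line.
my @lines = (
    "geneA\tkinase activity",
    "geneB\ttransporter activity",
    "geneC\tkinase activity",
    "geneD\tkinase activity",
);

# Build a hash that maps each gene name to its function.
my %functionOf;
foreach my $line (@lines) {
    my ($gene, $function) = split /\t/, $line;
    $functionOf{$gene} = $function;
}

# Hypothetical helper: all other genes with the same function as the query.
sub similar_genes {
    my ($query) = @_;
    return () unless exists $functionOf{$query};
    my @similar;
    foreach my $gene (sort keys %functionOf) {
        next if $gene eq $query;
        push @similar, $gene if $functionOf{$gene} eq $functionOf{$query};
    }
    return @similar;
}

# ASSUMPTION: the query gene is the first argument, defaulting to geneA.
my $query = @ARGV ? $ARGV[0] : 'geneA';
print "Genes sharing a function with $query: ",
      join(', ', similar_genes($query)), "\n";
```

With real data, the @lines array would be replaced by lines read from the annotation file, and the split pattern adapted to its actual layout.
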
