Functions and modules in Python
Day 4 of an introductory Python course for biologists. Theme: functions and modules.

Presentation Transcript

  • Functions and modules
    Karin Lagesen, karin.lagesen@bio.uio.no
  • Homework: TranslateProtein.py
    ● Input files are in /projects/temporary/cees-python-course/Karin
      ● translationtable.txt - tab separated
      ● dna31.fsa
    ● Script should:
      ● Open the translationtable.txt file and read it into a dictionary
      ● Open the dna31.fsa file and read the contents
      ● Translate the DNA into protein using the dictionary
      ● Print the translation in fasta format to the file TranslateProtein.fsa. Each protein line should be 60 characters long.
  • Modularization
    ● Programs can get big
    ● Risk of doing the same thing many times
    ● Functions and modules encourage
      ● re-usability
      ● readability
      ● easier maintenance
  • Functions
    ● Most common way to modularize a program
    ● Take values as parameters, execute code on them, return results
    ● Functions are also built in to Python:
      ● open(filename, mode)
      ● sum([list of numbers])
    ● These do something with their parameters and return the results
  • Functions – how to define
        def FunctionName(param1, param2, ...):
            """ Optional Function desc (Docstring) """
            FUNCTION CODE ...
            return DATA
    ● keyword: def – says this is a function
    ● functions need names
    ● parameters are optional, but common
    ● docstring useful, but not mandatory
    ● FUNCTION CODE does something
    ● keyword: return – sends the results back to the caller
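
As a small illustration of the template above, here is a function with a docstring. The name gc_content and the example sequence are made up for this sketch and are not part of the course material; the code is written to run under Python 3, whereas the slides use Python 2's print statement.

```python
def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string."""
    if len(dna) == 0:
        return 0.0
    gc = dna.count("G") + dna.count("C")
    return float(gc) / len(dna)


# The docstring is stored on the function object itself.
print(gc_content.__doc__)

# Two of the four bases are G or C, so this prints 0.5.
print(gc_content("ATGC"))
```

The docstring is also what help(gc_content) displays in the interactive interpreter.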
  • Function example
        >>> def hello(name):
        ...     results = "Hello World to " + name + "!"
        ...     return results
        ...
        >>> hello()
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: hello() takes exactly 1 argument (0 given)
        >>> hello("Lex")
        'Hello World to Lex!'
        >>>
    ● Task: make a script from this – take the name from the command line
    ● Print results to screen
  • Function example script
        import sys

        def hello(name):
            results = "Hello World to " + name + "!"
            return results


        name = sys.argv[1]
        functionresult = hello(name)
        print functionresult

        [karinlag@freebee]% python hello.py
        Traceback (most recent call last):
          File "hello.py", line 8, in ?
            name = sys.argv[1]
        IndexError: list index out of range
        [karinlag@freebee]% python hello.py Lex
        Hello World to Lex!
        [karinlag@freebee]%
  • Returning values
    ● Returning is not mandatory; if there is no return, None is returned by default
    ● Can return more than one value - the results are returned as a tuple
        >>> def test(x, y):
        ...     a = x*y
        ...     return x, a
        ...
        >>> test(1,2)
        (1, 2)
        >>>
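
Both points about return values can be checked directly. This is a minimal sketch; the function names no_return and product_and_input are invented for the illustration, and it is written for Python 3.

```python
def no_return(x):
    doubled = x * 2  # computed, but never returned


def product_and_input(x, y):
    a = x * y
    return x, a  # the two values are packed into one tuple


result = no_return(3)
print(result is None)  # True: a function without return gives None

# A returned tuple can be unpacked into separate variables.
first, second = product_and_input(1, 2)
print(first)
print(second)
```

Unpacking into two names, as in the last lines, is the same pattern the exercise script uses for read_dna_string.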
  • Function scope
    ● Variables defined inside a function can only be seen there!
    ● To access the value of a variable defined inside a function: return the variable
  • Scope example
        >>> def test(x):
        ...     z = 10
        ...     print "the value of z is " + str(z)
        ...     return x*2
        ...
        >>> z = 50
        >>> test(3)
        the value of z is 10
        6
        >>> z
        50
        >>> x
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        NameError: name 'x' is not defined
        >>>
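
The same scope behaviour can be reproduced in a script rather than at the prompt. This is a sketch for Python 3; the function name double_it is made up for the illustration.

```python
z = 50  # module-level z


def double_it(x):
    z = 10  # a separate, local z; the module-level z is untouched
    return x * 2


answer = double_it(3)
print(answer)  # 6
print(z)       # 50, not 10

try:
    x  # x only existed inside double_it
except NameError:
    print("x is not visible outside the function")
```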
  • Parameters
    ● Functions can take parameters – not mandatory
    ● Parameters follow the order in which they are given
        >>> def test(x, y):
        ...     print x*2
        ...     print y + str(x)
        ...
        >>> test(2, "y")
        4
        y2
        >>> test("y", 2)
        yy
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "<stdin>", line 3, in test
        TypeError: unsupported operand type(s) for +: 'int' and 'str'
        >>>
  • Named parameters
    ● Can use named parameters
        >>> def test(x, y):
        ...     print x*2
        ...     print y + str(x)
        ...
        >>> test(2, "y")
        4
        y2
        >>> test("y", 2)
        yy
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "<stdin>", line 3, in test
        TypeError: unsupported operand type(s) for +: 'int' and 'str'
        >>> test(y="y", x=2)
        4
        y2
        >>>
  • Default parameters
    ● Parameters can be given a default value
    ● With a default, the parameter does not have to be specified; the default will be used
    ● Can still name the parameter in the call
        >>> def hello(name = "Everybody"):
        ...     results = "Hello World to " + name + "!"
        ...     return results
        ...
        >>> hello("Anna")
        'Hello World to Anna!'
        >>> hello()
        'Hello World to Everybody!'
        >>> hello(name = "Annette")
        'Hello World to Annette!'
        >>>
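
Positional, named and default parameters can be combined in one call. The sketch below assumes a made-up function describe with invented gene names and lengths, purely for illustration; it is written for Python 3.

```python
def describe(name, organism="E. coli", length=0):
    return name + " from " + organism + ", " + str(length) + " bp"


# Only the required parameter is given; both defaults are used.
a = describe("geneA")

# A default in the middle is skipped by naming the later parameter.
b = describe("geneA", length=300)

# Named parameters may be given in any order.
c = describe(organism="S. cerevisiae", name="geneB")

print(a)
print(b)
print(c)
```

Naming the parameter is what makes it possible to override length while leaving organism at its default.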
  • Exercise: TranslateProteinFunctions.py
    ● Use script from homework
    ● Create the following functions:
      ● get_translation_table(filename) – return dict with codons and protein codes
      ● read_dna_string(filename) – return tuple with (descr, DNA_string)
      ● translate_protein(dictionary, DNA_string) – return the protein version of the DNA string
      ● pretty_print(descr, protein_string, outname) – write result to outname in fasta format
  • TranslateProteinFunctions.py
        import sys

        YOUR CODE GOES HERE!!!!

        translationtable = sys.argv[1]
        fastafile = sys.argv[2]
        outfile = sys.argv[3]

        translation_dict = get_translation_table(translationtable)
        description, DNA_string = read_dna_string(fastafile)
        protein_string = translate_protein(translation_dict, DNA_string)
        pretty_print(description, protein_string, outfile)
  • get_translation_table
        def get_translation_table(translationtable):
            # open the file whose name was passed in as the parameter
            fh = open(translationtable, "r")
            trans_dict = {}
            for line in fh:
                codon = line.split()[0]
                aa = line.split()[1]
                trans_dict[codon] = aa
            fh.close()
            return trans_dict
  • read_dna_string
        def read_dna_string(fastafile):
            fh = open(fastafile, "r")
            line = fh.readline()
            # drop the leading ">" and the trailing newline
            header_line = line[1:-1]
            seq = ""
            for line in fh:
                seq += line[:-1]
            fh.close()
            return (header_line, seq)
  • translate_protein
        def translate_protein(translation_dict, DNA_string):
            aa_seq = ""
            # len - 2 so that the last complete codon is included
            # and a trailing partial codon is ignored
            for i in range(0, len(DNA_string)-2, 3):
                codon = DNA_string[i:i+3]
                one_letter = translation_dict[codon]
                aa_seq += one_letter
            return aa_seq
  • pretty_print
        def pretty_print(description, protein_string, outfile):
            fh = open(outfile, "w")
            fh.write(">" + description + "\n")
            # write the protein in lines of 60 characters
            for i in range(0, len(protein_string), 60):
                fh.write(protein_string[i:i+60] + "\n")
            fh.close()
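
The codon loop and the 60-character wrapping can be tried without any input files. This is only a sketch under stated assumptions: the three-entry table, the DNA string and the helper names translate and wrap are invented here and are not the course's files or functions. It is written for Python 3.

```python
# Toy translation table; the real one is read from translationtable.txt.
toy_table = {"ATG": "M", "GCT": "A", "TAA": "*"}


def translate(table, dna):
    protein = ""
    # stop at len(dna) - 2 so only complete codons are looked up
    for i in range(0, len(dna) - 2, 3):
        protein += table[dna[i:i + 3]]
    return protein


def wrap(text, width=60):
    # slice the string into pieces of at most `width` characters
    return [text[i:i + width] for i in range(0, len(text), width)]


protein = translate(toy_table, "ATGGCTTAA")
print(protein)

lines = wrap("A" * 130)
print([len(line) for line in lines])
```

A 130-character string wraps into lines of 60, 60 and 10 characters, which is the layout the fasta output asks for.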
  • Modules
    ● A module is a file with functions, constants and other code in it
    ● Module name = filename without .py
    ● Can be used inside another program
    ● Needs to be import-ed into program
    ● Lots of builtin modules: sys, os, os.path....
    ● Can also create your own
  • Using module
    ● One of two import statements:
      1: import modulename
      2: from module import function/constant
    ● If method 1:
      ● modulename.function(arguments)
    ● If method 2:
      ● function(arguments) – module name not needed
      ● beware of function name collision
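
The two import styles, and the name collision the slide warns about, can be seen with standard library modules. This is a sketch for Python 3; the path /tmp/data/dna31.fsa is a hypothetical example and does not need to exist.

```python
import math                   # method 1: use as math.function(...)
from os.path import basename  # method 2: use as function(...)

print(math.sqrt(16))
print(basename("/tmp/data/dna31.fsa"))

# Name collision: Python has a built-in pow, and math has one too.
before = pow(2, 3)     # built-in pow, gives the integer 8
from math import pow   # this import replaces the name pow
after = pow(2, 3)      # math.pow, gives the float 8.0

print(type(before).__name__)
print(type(after).__name__)
```

With method 1 the collision cannot happen, because math.pow and pow stay separate names.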
  • Operating system modules – os and os.path
    ● Modules dealing with files and operating system interaction
    ● Commonly used methods:
      ● os.getcwd() - get working directory
      ● os.chdir(path) – change working directory
      ● os.listdir([dir = "."]) - get a list of all files in this directory
      ● os.mkdir(path) – create directory
      ● os.path.join(dirname, dirname/filename...) - join path components into one path
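
The functions listed above can be exercised together in a scratch directory. This is a sketch for Python 3; the directory and file names (fasta, example.fsa) are invented, and tempfile is used only so the example does not touch real data.

```python
import os
import tempfile

start_dir = os.getcwd()                    # remember where we started
work_dir = tempfile.mkdtemp()              # scratch directory for this sketch

sub_dir = os.path.join(work_dir, "fasta")  # build the path portably
os.mkdir(sub_dir)

# Put one small fasta file into the new directory.
fh = open(os.path.join(sub_dir, "example.fsa"), "w")
fh.write(">example\nATGC\n")
fh.close()

os.chdir(work_dir)                         # move into the scratch directory
print(os.listdir("fasta"))                 # names only, without the directory
os.chdir(start_dir)                        # and move back again
```

Note that os.listdir returns bare file names, so os.path.join is needed again to turn each name into a path that can be opened.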
  • Your own modules
    ● Three steps:
      1. Create file with functions in it. Module name is same as filename without .py
      2. In other script, do import modulename
      3. In other script, use function like this: modulename.functionname(args)
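
The three steps can be compressed into one runnable sketch. Normally the module file is written in an editor and sits next to the script; here it is written from code into a temporary directory only to keep the example self-contained. The module name my_seqtools_demo and its function reverse are hypothetical. Written for Python 3.

```python
import os
import sys
import tempfile

# Step 1: create a file with a function in it.
module_dir = tempfile.mkdtemp()
fh = open(os.path.join(module_dir, "my_seqtools_demo.py"), "w")
fh.write("def reverse(seq):\n    return seq[::-1]\n")
fh.close()

# Python searches the directories in sys.path for modules, so the
# temporary directory has to be added; a module in the same directory
# as the running script is found without this line.
sys.path.insert(0, module_dir)

# Step 2: import the module by its filename without .py.
import my_seqtools_demo

# Step 3: call the function through the module name.
print(my_seqtools_demo.reverse("ATGC"))
```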
  • Separating module use and main use
    ● Files containing python code can be:
      ● script file
      ● module file
    ● Module functions can be used in scripts
    ● But: modules can also be scripts
    ● Question is – how do you know if the code is being executed in the module script or an external script?
  • Module use / main use
    ● When a script is being run, within that script a variable called __name__ will be set to the string "__main__"
    ● Can test on this string to see if this script is being run
    ● Benefit: can define functions in script that can be used in module mode later
  • Module mode / main mode
        import sys

        <code as before>

        # When this script is being used, this will always run, no matter what!
        translationtable = sys.argv[1]
        fastafile = sys.argv[2]
        outfile = sys.argv[3]

        translation_dict = get_translation_table(translationtable)
        description, DNA_string = read_dna_string(fastafile)
        protein_string = translate_protein(translation_dict, DNA_string)
        pretty_print(description, protein_string, outfile)
  • Module use / main use
        # this is a script
        import sys
        import TranslateProteinFunctions

        description, DNA_string = TranslateProteinFunctions.read_dna_string(sys.argv[1])
        print description

        [karinlag@freebee]% python modtest.py dna31.fsa
        Traceback (most recent call last):
          File "modtest.py", line 2, in ?
            import TranslateProteinFunctions
          File "TranslateProteinFunctions.py", line 44, in ?
            fastafile = sys.argv[2]
        IndexError: list index out of range
        [karinlag@freebee]Karin%
  • TranslateProteinFunctions.py with main
        import sys

        <code as before>

        if __name__ == "__main__":
            translationtable = sys.argv[1]
            fastafile = sys.argv[2]
            outfile = sys.argv[3]

            translation_dict = get_translation_table(translationtable)
            description, DNA_string = read_dna_string(fastafile)
            protein_string = translate_protein(translation_dict, DNA_string)
            pretty_print(description, protein_string, outfile)
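
The same pattern in its smallest form is sketched below, assuming the code is saved under a hypothetical filename such as greeter.py. The usage message is an addition for this sketch; hello is the function from the earlier slides. Written for Python 3.

```python
import sys


def hello(name):
    return "Hello World to " + name + "!"


if __name__ == "__main__":
    # This part runs only when the file is executed directly,
    # not when another script does "import greeter".
    if len(sys.argv) > 1:
        print(hello(sys.argv[1]))
    else:
        print("usage: python greeter.py NAME")
```

Because the command-line handling sits inside the test on __name__, another script can import the file and call hello without sys.argv ever being read.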
  • ConcatFasta.py
    ● Create a script that has the following:
      ● function get_fastafiles(dirname) – gets all the files in the directory, checks if they are fasta files (end in .fsa), returns list of fasta files
        – hint: you need os.path to create full relative file names
      ● function concat_fastafiles(filelist, outfile) – takes a list of fasta files, opens and reads each of them, writes them to outfile
      ● if __name__ == "__main__":
        – do what needs to be done to run script
    ● Remember imports!