Functions and modules in Python


Day 4 of an introductory Python course for biologists. Theme: functions and modules.

Published in: Technology

  1. Functions and modules (Karin Lagesen)
  2. Homework
     ● Input files are in /projects/temporary/cees-python-course/Karin
       ● translationtable.txt - tab separated
       ● dna31.fsa
     ● Script should:
       ● Open the translationtable.txt file and read it into a dictionary
       ● Open the dna31.fsa file and read the contents
       ● Translate the DNA into protein using the dictionary
       ● Print the translation in fasta format to the file TranslateProtein.fsa. Each protein line should be 60 characters long.
  3. Modularization
     ● Programs can get big
     ● Risk of doing the same thing many times
     ● Functions and modules encourage
       ● re-usability
       ● readability
       ● maintainability
  4. Functions
     ● Most common way to modularize a program
     ● Take values as parameters, execute code on them, return results
     ● Functions are also built into Python:
       ● open(filename, mode)
       ● sum([list of numbers])
     ● These do something with their parameters and return the results
  5. Functions – how to define

         def FunctionName(param1, param2, ...):
             """ Optional function description (docstring) """
             FUNCTION CODE ...
             return DATA

     ● keyword: def – says this is a function
     ● functions need names
     ● parameters are optional, but common
     ● docstring useful, but not mandatory
     ● FUNCTION CODE does something
     ● keyword to return results: return
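The template above can be filled in with a small concrete function. This is only a sketch: the name gc_content and what it computes are chosen here for illustration and do not come from the slides.

```python
def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string."""
    dna = dna.upper()
    if len(dna) == 0:
        # avoid dividing by zero for an empty sequence
        return 0.0
    gc_count = dna.count("G") + dna.count("C")
    # float() keeps the division exact on both Python 2 and Python 3
    return gc_count / float(len(dna))


fraction = gc_content("ATGC")
print(fraction)
```

The def line names the function and its single parameter, the docstring describes it, and return hands the computed value back to the caller.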
  6. Function example

         >>> def hello(name):
         ...     results = "Hello World to " + name + "!"
         ...     return results
         ...
         >>> hello()
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         TypeError: hello() takes exactly 1 argument (0 given)
         >>> hello("Lex")
         'Hello World to Lex!'
         >>>

     ● Task: make a script from this – take the name from the command line
     ● Print results to screen
  7. Function example script

         import sys


         def hello(name):
             results = "Hello World to " + name + "!"
             return results

         name = sys.argv[1]
         functionresult = hello(name)
         print functionresult

         [karinlag@freebee]% python hello.py
         Traceback (most recent call last):
           File "", line 8, in ?
             name = sys.argv[1]
         IndexError: list index out of range
         [karinlag@freebee]% python hello.py Lex
         Hello World to Lex!
         [karinlag@freebee]%
  8. Returning values
     ● Returning is not mandatory; if there is no return, None is returned by default
     ● Can return more than one value - the results are returned as a tuple

         >>> def test(x, y):
         ...     a = x*y
         ...     return x, a
         ...
         >>> test(1,2)
         (1, 2)
         >>>
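Both points on this slide can be checked directly. In the sketch below the function names no_return and product_and_sum are made up for illustration; the last line shows that a returned tuple can be unpacked into separate names.

```python
def no_return(x):
    """Does some work but has no return statement."""
    doubled = x * 2


def product_and_sum(x, y):
    """Return two values; Python packs them into one tuple."""
    return x * y, x + y


nothing = no_return(3)                   # nothing is None
both = product_and_sum(2, 5)             # both is the tuple (10, 7)
product, total = product_and_sum(2, 5)   # unpack the tuple into two names
```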
  9. Function scope
     ● Variables defined inside a function can only be seen there!
     ● To access the value of a variable defined inside a function: return the variable
  10. Scope example

         >>> def test(x):
         ...     z = 10
         ...     print "the value of z is " + str(z)
         ...     return x*2
         ...
         >>> z = 50
         >>> test(3)
         the value of z is 10
         6
         >>> z
         50
         >>> x
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
         NameError: name 'x' is not defined
         >>>
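The same scope behaviour can be reproduced in a script rather than at the prompt. This is a minimal sketch; the names double and local_only are hypothetical and used only to make the point.

```python
z = 50  # defined outside the function


def double(number):
    z = 10                    # a new, local z; the outer z is not changed
    local_only = number * 2   # exists only while the function runs
    return local_only


result = double(3)

# The local variable cannot be seen from outside the function.
try:
    local_only
    seen_outside = True
except NameError:
    seen_outside = False
```

After running this, result holds the returned value, z still has its original value, and seen_outside is False: returning is the only way the value of local_only gets out.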
  11. Parameters
     ● Functions can take parameters – not mandatory
     ● Parameters follow the order in which they are given

         >>> def test(x, y):
         ...     print x*2
         ...     print y + str(x)
         ...
         >>> test(2, "y")
         4
         y2
         >>> test("y", 2)
         yy
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
           File "<stdin>", line 3, in test
         TypeError: unsupported operand type(s) for +: 'int' and 'str'
         >>>
  12. Named parameters
     ● Can use named parameters

         >>> def test(x, y):
         ...     print x*2
         ...     print y + str(x)
         ...
         >>> test(2, "y")
         4
         y2
         >>> test("y", 2)
         yy
         Traceback (most recent call last):
           File "<stdin>", line 1, in <module>
           File "<stdin>", line 3, in test
         TypeError: unsupported operand type(s) for +: 'int' and 'str'
         >>> test(y="y", x=2)
         4
         y2
         >>>
  13. Default parameters
     ● Parameters can be given a default value
     ● With a default, the parameter does not have to be specified; the default will be used
     ● Can still name the parameter in the call

         >>> def hello(name = "Everybody"):
         ...     results = "Hello World to " + name + "!"
         ...     return results
         ...
         >>> hello("Anna")
         'Hello World to Anna!'
         >>> hello()
         'Hello World to Everybody!'
         >>> hello(name = "Annette")
         'Hello World to Annette!'
         >>>
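Positional, named and default parameters can be mixed in one call. The sketch below assumes a made-up function describe with hypothetical parameters seq_id, organism and length; none of these appear in the slides.

```python
def describe(seq_id, organism="unknown", length=0):
    """Build a one-line description of a sequence."""
    return seq_id + " from " + organism + ", " + str(length) + " bp"


a = describe("dna31")                    # both defaults are used
b = describe("dna31", "E. coli", 31)     # all given by position
c = describe("dna31", length=31)         # skip organism, name length
d = describe(length=31, seq_id="dna31", organism="E. coli")  # any order when named
```

Naming a parameter is what makes it possible to skip one that has a default, as in the call that sets length but leaves organism alone.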
  14. ● Use the script from the homework
      ● Create the following functions:
        ● get_translation_table(filename) – return dict with codons and protein codes
        ● read_dna_string(filename) – return tuple with (descr, DNA_string)
        ● translate_protein(dictionary, DNA_string) – return the protein version of the DNA string
        ● pretty_print(descr, protein_string, outname) – write result to outname in fasta format
  15. TranslateProteinFunctions.py

         import sys

         YOUR CODE GOES HERE!!!!

         translationtable = sys.argv[1]
         fastafile = sys.argv[2]
         outfile = sys.argv[3]
         translation_dict = get_translation_table(translationtable)
         description, DNA_string = read_dna_string(fastafile)
         protein_string = translate_protein(translation_dict, DNA_string)
         pretty_print(description, protein_string, outfile)
  16. get_translation_table

         def get_translation_table(translationtable):
             # open the file whose name was passed in as the parameter
             fh = open(translationtable, "r")
             trans_dict = {}
             for line in fh:
                 # each line holds a codon and its amino acid
                 codon = line.split()[0]
                 aa = line.split()[1]
                 trans_dict[codon] = aa
             fh.close()
             return trans_dict
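  17. read_dna_string

         def read_dna_string(fastafile):
             fh = open(fastafile, "r")
             # first line is the header: drop the leading ">" and the newline
             line = fh.readline()
             header_line = line[1:-1]
             seq = ""
             for line in fh:
                 # drop the newline at the end of each sequence line
                 seq += line[:-1]
             fh.close()
             return (header_line, seq)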
  18. translate_protein

         def translate_protein(translation_dict, DNA_string):
             aa_seq = ""
             # stop at len - 2 so every complete codon is read,
             # including the last one, and a partial codon is skipped
             for i in range(0, len(DNA_string)-2, 3):
                 codon = DNA_string[i:i+3]
                 one_letter = translation_dict[codon]
                 aa_seq += one_letter
             return aa_seq
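Stepping through a sequence three bases at a time is easy to check on a toy input. The sketch below uses a hypothetical three-codon dictionary (the real table comes from translationtable.txt) and a loop bound of len(dna) - 2, which reads every complete codon, including the last one, and ignores a trailing partial codon.

```python
# Toy table for illustration only; the three entries follow the
# standard genetic code, with "*" marking the stop codon.
toy_table = {"ATG": "M", "GGC": "G", "TAA": "*"}


def translate(table, dna):
    """Translate dna codon by codon using the given table."""
    protein = ""
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        protein += table[codon]
    return protein


complete = translate(toy_table, "ATGGGCTAA")   # three full codons
partial = translate(toy_table, "ATGGGCTA")     # last codon is incomplete
```

With nine bases the loop visits positions 0, 3 and 6; with eight bases it visits only 0 and 3, so the incomplete final codon is never looked up.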
  19. pretty_print

         def pretty_print(description, protein_string, outfile):
             fh = open(outfile, "w")
             fh.write(">" + description + "\n")
             # write the protein in lines of 60 characters
             for i in range(0, len(protein_string), 60):
                 fh.write(protein_string[i:i+60] + "\n")
             fh.close()
  20. Modules
     ● A module is a file with functions, constants and other code in it
     ● Module name = filename without .py
     ● Can be used inside another program
     ● Needs to be imported into the program
     ● Lots of built-in modules: sys, os, os.path, ...
     ● Can also create your own
  21. Using a module
     ● One of two import statements:
       1: import modulename
       2: from modulename import function/constant
     ● If method 1:
       ● modulename.function(arguments)
     ● If method 2:
       ● function(arguments) – module name not needed
       ● beware of function name collisions
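The two import styles, and the name collision the slide warns about, can be seen side by side using the built-in math module. This is an illustrative sketch; the locally defined sqrt is deliberately silly to make the collision visible.

```python
import math              # method 1
from math import sqrt    # method 2

root1 = math.sqrt(16)    # method 1: module name needed
root2 = sqrt(16)         # method 2: function name only


def sqrt(x):
    # Collision: this definition now hides the sqrt imported above.
    return "my own sqrt"


root3 = sqrt(16)         # calls the local function, not math's
root4 = math.sqrt(16)    # method 1 is unaffected by the collision
```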
  22. Operating system modules – os and os.path
     ● Modules dealing with files and operating system interaction
     ● Commonly used functions:
       ● os.getcwd() - get working directory
       ● os.chdir(path) – change working directory
       ● os.listdir(path) - get a list of everything in the directory path (use "." for the current directory)
       ● os.mkdir(path) – create directory
       ● os.path.join(dirname, dirname/filename...) – join the parts into one path
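The functions listed above can be tried together in a throwaway directory. This sketch assumes the standard tempfile and shutil modules, which are not mentioned in the slides, to create and remove the scratch directory; the names results and protein.fsa are made up.

```python
import os
import os.path
import shutil
import tempfile

start_dir = os.getcwd()                 # remember where we started
work_dir = tempfile.mkdtemp()           # scratch directory (assumption, see above)

os.chdir(work_dir)                      # move into the scratch directory
os.mkdir("results")                     # create a subdirectory
out_path = os.path.join("results", "protein.fsa")   # build the path portably

fh = open(out_path, "w")
fh.write(">test\nMG\n")
fh.close()

entries = os.listdir("results")         # names only, without the directory part

os.chdir(start_dir)                     # go back before cleaning up
shutil.rmtree(work_dir)
```

Note that os.listdir returns bare names, which is why os.path.join is needed to turn them back into paths that can be opened.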
  23. Your own modules
     ● Three steps:
       1. Create a file with functions in it. The module name is the filename without .py
       2. In the other script, do: import modulename
       3. In the other script, use a function like this: modulename.functionname(args)
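The three steps can be run end to end in a single file. In this sketch the module name my_seqtools and its reverse function are hypothetical. The module file is written to a temporary directory that is then added to sys.path, purely so that the example is self-contained; normally the module file would simply sit next to the script that imports it.

```python
import os
import sys
import tempfile

# Step 1: create a file with a function in it.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "my_seqtools.py")
fh = open(module_path, "w")
fh.write("def reverse(dna):\n")
fh.write("    return dna[::-1]\n")
fh.close()

# Make Python look in that directory (not needed when the module
# is in the same directory as the script).
sys.path.insert(0, module_dir)

# Step 2: import the module by its filename without .py.
import my_seqtools

# Step 3: call the function through the module name.
reversed_dna = my_seqtools.reverse("ATGC")
```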
  24. Separating module use and main use
     ● Files containing Python code can be:
       ● script files
       ● module files
     ● Module functions can be used in scripts
     ● But: modules can also be scripts
     ● The question is: how do you know whether the code is being run as the script itself or has been imported by another script?
  25. Module use / main use
     ● When a script is being run, within that script a variable called __name__ will be set to the string "__main__"
     ● Can test on this string to see whether this file is the script being run
     ● Benefit: can define functions in a script that can be used in module mode later
  26. Module mode / main mode

         import sys

         <code as before>

         # When this file is used, whether run as a script or imported,
         # the code below will always run, no matter what!
         translationtable = sys.argv[1]
         fastafile = sys.argv[2]
         outfile = sys.argv[3]
         translation_dict = get_translation_table(translationtable)
         description, DNA_string = read_dna_string(fastafile)
         protein_string = translate_protein(translation_dict, DNA_string)
         pretty_print(description, protein_string, outfile)
  27. Module use / main use

         # this is a script
         import sys
         import TranslateProteinFunctions
         description, DNA_string = TranslateProteinFunctions.read_dna_string(sys.argv[1])
         print description

         [karinlag@freebee]% python dna31.fsa
         Traceback (most recent call last):
           File "", line 2, in ?
             import TranslateProteinFunctions
           File "", line 44, in ?
             fastafile = sys.argv[2]
         IndexError: list index out of range
         [karinlag@freebee]Karin%
  28. with main

         import sys

         <code as before>

         if __name__ == "__main__":
             translationtable = sys.argv[1]
             fastafile = sys.argv[2]
             outfile = sys.argv[3]
             translation_dict = get_translation_table(translationtable)
             description, DNA_string = read_dna_string(fastafile)
             protein_string = translate_protein(translation_dict, DNA_string)
             pretty_print(description, protein_string, outfile)
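The same pattern fits in a file small enough to try out quickly. This is a sketch with a hypothetical function reverse_complement; saved under any filename, it prints a result when run directly and prints nothing when imported from another script.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    result = ""
    for base in reversed(dna):
        result += pairs[base]
    return result


if __name__ == "__main__":
    # Only reached when this file is the script being run,
    # not when it is imported as a module.
    print(reverse_complement("ATGC"))
```

Because the function definition sits outside the if block, an importing script can still call reverse_complement without triggering the print.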
  29. ● Create a script that has the following:
        ● function get_fastafiles(dirname) – gets all the files in the directory, checks if they are fasta files (end in .fsa), returns list of fasta files
          – hint: you need os.path to create full relative file names
        ● function concat_fastafiles(filelist, outfile) – takes a list of fasta files, opens and reads each of them, writes them to outfile
        ● if __name__ == "__main__":
          – do what needs to be done to run the script
      ● Remember imports!