Pyt 1112

an intro to Python

    Pyt 1112 Presentation Transcript

    • MSc BIOINFO 2011-2012: Introduction to Python
      MSc Bioinformatics for Health Sciences
      César L. Ávila (1) and Jordi Villà i Freixa (2)
      (1) clavila@gmail.com, Universidad Nacional de Tucumán
      (2) jordi.villa@upf.edu, cbbl.imim.es, Universitat Pompeu Fabra
      © Michael A. Johnston 2007; Jordi Villà-Freixa 2007-2012
      Barcelona, Winter 2012
    • 1 Intro
      2 Functional programming
      3 Classes
      4 Exceptions
      5 BioPython
      6 Graphics
      7 CGI scripting
      8 Packaging
      9 Extensions
      10 Glossary
      11 Annexes
    • Programming
      • Programs operate on various "data types": integers, strings, doubles
      • Concept of variable and assignment:
          Age = 3
      • Expressions create and process data:
          x > 3
          y = x * 2
      • Control of flow: conditional testing (if, else) and iterations (for, while loops)
      • Procedural programming: using functions to divide your program into logical chunks
      Basic programming concepts! Only the syntax changes.
    • Python
      • Dynamic, interpreted, object-oriented programming language
      • Software quality: designed to be readable, coherent and maintainable
      • Developer productivity: very compact code (20-33% of the size of the corresponding Java/C code): less code → less to debug → less to maintain → less to learn
      • Some help: http://greenteapress.com/thinkpython/thinkpython.html
    • The subject
      The course is about making it quicker for you and others to write, maintain and extend programs. To do so:
      • Reduce the time spent in programming and debugging: OOP, testing
      • Make it easy to extend your program: code reuse (OOP)
      • Reduce the time for others to understand your program: documentation, program readability
    • The Python interpreter
          Python 2.6.6 (r266:84292, Jan 3 2011, 14:28:29)
          [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
          Type "help", "copyright", "credits" or "license"
          >>> print 1 + 1
          2
      Alternatively, you can store code in a file and use the interpreter to execute the contents of the file. Such a file is called a script. For example, you could use a text editor to create a file named dinsdale.py with the following contents:
          print 1 + 1
      By convention, Python scripts have names that end with .py.
    • IDLE or other IDEs
      • IDLE is the Python Integrated Development Environment: http://docs.python.org/library/idle.html
        First step in making it easier to write Python code:
        • Syntax highlighting
        • Code completion
        • Inline documentation
        • Many other useful features
      • Eclipse is... "an open extensible IDE for anything and nothing in particular". Extension to Python through PyDEV
      • iPython aims... "to create a comprehensive environment for interactive and exploratory computing"
      • but check also other possibilities: http://wiki.python.org/moin/PythonEditors
    • Structure of a program
      • Programs are composed of modules
      • Modules contain statements:
        • Function definitions
        • Control statements (if, while, etc.)
        • Variable assignments
      • Statements contain expressions:
          x < 3
          a = x * x + 2
      • Expressions create and process objects
    • Python keywords
          and       del       from      not       while
          as        elif      global    or        with
          assert    else      if        pass      yield
          break     except    import    print
          class     exec      in        raise
          continue  finally   is        return
          def       for       lambda    try
    • Numbers
      • Types: integer, floating point, long integers, bool (True, False)
      • Basic expression operators & precedence: http://www.ibiblio.org/g2swap/byteofpython/read/operator-precedence.html
      • Conversion: mixed types are converted up, e.g., integers → floating point
          40 + 3.14
    • Dynamic typing
      • Variable types are decided at runtime
      • Variables are created when you assign values to them
      • Variables refer to an object, e.g., a number
      • The object has a type; the variable does not
      • When a variable appears in an expression, it is immediately replaced by the object it refers to
      Example
          a = 3
      • Create an object of type integer that represents the number 3
      • Create the variable a if it does not exist yet
      • Link the variable a to the new object 3
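The point that the object has a type while the variable does not can be checked with a short sketch. The names type_before and type_after are illustrative and do not come from the slides; the single-argument print() calls are used so that the snippet should run on both Python 2.6+ and Python 3.

```python
# A minimal sketch: the type belongs to the object, not to the name.
a = 3                       # bind the name a to an int object
type_before = type(a)

a = "three"                 # rebind a to a str object; no declaration needed
type_after = type(a)

print(type_before.__name__)
print(type_after.__name__)
```

The same name refers to objects of two different types in turn, which is what "decided at runtime" means here.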
    • Dynamic typing [figure]
    • Garbage collection
      When no variables are left that reference an object, it is destroyed (automatic memory management)
    • Modules
      • Every file containing Python code whose name ends in .py is a module
      • A module usually contains a number of items, e.g. variables and functions, which you can access. These items are called attributes
      • You load a module using the import statement
      • Just like a number, a module is an object
      • You can reload a module after changing it using the reload() function
      • You access a module's attributes using the . operator: myModule.myAttribute
      • Modules are the highest-level way of organising your program
      • Large programs have multiple module files, each of which contains related code
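As a small illustration of importing a module and reaching its attributes with the . operator, here is a sketch that uses the standard-library math module. The choice of math and the names root and has_pi are assumptions made for the example, not part of the slides.

```python
import math                      # load a standard-library module

# A module is itself an object; its type is called 'module'.
module_type = type(math).__name__

# Attributes are reached with the . operator.
root = math.sqrt(16.0)

# dir() lists the attribute names the module provides.
has_pi = "pi" in dir(math)

print(module_type)
print(root)
print(has_pi)
```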
    • Documentation
      • Documentation is one of the core parts of good programming
      • Python contains an inbuilt documentation mechanism using "doc strings"
      • For modules the doc string is the first string in the module file.
      • Doc strings are by convention enclosed in triple quotes, which are required when the text spans several lines.
      • A doc string is accessible through an attribute called __doc__
          >>> import os
          >>> os.access.__doc__
          'access(path, mode) -> 1 if granted, 0 otherwise Test for access to a file.'
          >>>
    • More docstring examples
          def phase_of_the_moon():
              """This function returns a slightly randomized
              integer that shuffles data around in a way
              convenient for the XYZ class."""
              # Working code here.
              return value

          def something():
              """This is a first paragraph.

              This is a second paragraph. The intervening
              blank line means this must be a new paragraph."""
              # ...
    • Module attributes
      • __doc__ is one of four special module attributes
      • The others are:
        • __name__ - The module name
        • __file__ - The module's file name (complete path)
        • __builtins__ - Ignore for now.
      • All special names in Python begin and end with __
      • You can see all the attributes of a module using the dir() function, which returns a list data type (more on lists later)
      dir returns a list of the attributes and methods of any object: modules, functions, strings, lists, dictionaries...
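The special attributes can be inspected directly. The sketch below uses the standard-library json module purely as an example (any importable module would do); the variable names are illustrative.

```python
import json   # example module only; not mentioned in the slides

name = json.__name__                 # the module name as a string
attributes = dir(json)               # list of attribute names

# Special names begin and end with a double underscore.
special = [a for a in attributes
           if a.startswith("__") and a.endswith("__")]

print(name)
print("__doc__" in special)
print("dumps" in attributes)         # an ordinary (non-special) attribute
```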
    • The import search path
          >>> import sys
          >>> sys.path
          ['', '/usr/local/lib/python2.2', '/usr/local/lib/python2.2
          '/usr/local/lib/python2.2/lib-dynload', '/usr/local/lib/py
          '/usr/local/lib/python2.2/site-packages/PIL',
          '/usr/local/lib/python2.2/site-packages/piddle']
          >>> sys
          <module 'sys' (built-in)>
          >>> sys.path.append('/my/new/path')

          import sys, os
          print 'sys.argv[0] =', sys.argv[0]
          pathname = os.path.dirname(sys.argv[0])
          print 'path =', pathname
          print 'full path =', os.path.abspath(pathname)
    • Function basics
      http://docs.python.org/library/functions.html
      • We have already seen two functions: reload() and dir()
      • Functions are defined using the def statement
      • The return statement sends a function's result back to the caller.
      • All code that is in the function must be indented
      • The function ends when the indentation level returns to that of the def statement that created it.
      • The function's arguments are given in parentheses after the name
      • Note you do not declare types in the argument list!
      • You can use any object as the arguments to a function: e.g. numbers, modules and even other functions!
    • An example function
          def mult(a, b):
              if b == 0:
                  return 0
              rest = mult(a, b - 1)
              value = a + rest
              return value

          print "3 * 2 = ", mult(3, 2)
    • Recursivity
      Example
      Write a function for the factorial of a number
      Example
      Write a function for counting down from a given integer
    • Factorial
          def factorial(n):
              if n <= 1:
                  return 1
              return n * factorial(n - 1)

          print "2! =", factorial(2)
          print "3! =", factorial(3)
          print "4! =", factorial(4)
          print "5! =", factorial(5)
    • Countdown
          def count_down(n):
              print n
              if n > 0:
                  return count_down(n - 1)

          count_down(5)
    • More on functions
      • The function is not created until def is executed
      • Like numbers and modules, functions are objects
      • When def executes it creates a function object and associates a name with it.
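Because def simply creates a function object and binds a name to it, that object can be given a second name or passed to another function. The following sketch shows both; square, alias and apply_twice are hypothetical names chosen for the illustration.

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x

# The def statement above created a function object and bound it to "square".
alias = square                   # a second name for the same function object

def apply_twice(func, value):
    # Functions can be passed as arguments like any other object.
    return func(func(value))

result = apply_twice(alias, 3)   # square(square(3))
print(result)
```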
    • Argument values
          def ask_ok(prompt, retries=4, complaint='Yes or no, please
              while True:
                  ok = raw_input(prompt)
                  if ok in ('y', 'ye', 'yes'):
                      return True
                  if ok in ('n', 'no', 'nop', 'nope'):
                      return False
                  retries = retries - 1
                  if retries < 0:
                      raise IOError('refusenik user')
                  print complaint

          ask_ok('Do you really want to quit?')
          ask_ok('OK to overwrite the file?', 2)
          ask_ok('OK to overwrite the file?', 2, 'Come on, only yes
    • Lambda forms
      Lambda forms can be used wherever function objects are required. They are syntactically restricted to a single expression.
          >>> def make_incrementor(n):
          ...     return lambda x: x + n
          ...
          >>> f = make_incrementor(42)
          >>> f(0)
          42
          >>> f(1)
          43
      http://www.secnetix.de/olli/Python/lambda_functions.hawk
    • Function documentation
      • Like modules, functions can also have doc strings.
      • The doc string is the first string after the function definition line.
      • By convention it is enclosed in triple quotes ''' ''' (required if it spans several lines).
      • It is accessible through the attribute __doc__.
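A function doc string can be read back at run time through __doc__. The function gc_content below is a hypothetical example invented for this sketch, not something defined in the course material.

```python
def gc_content(sequence):
    """Return the fraction of G and C characters in sequence."""
    hits = sequence.count("G") + sequence.count("C")
    # float() keeps the division exact on Python 2 as well as Python 3.
    return float(hits) / len(sequence)

doc = gc_content.__doc__         # the doc string is an ordinary attribute
fraction = gc_content("GGCA")    # 3 of the 4 characters are G or C

print(doc)
print(fraction)
```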
    • Objects and attributes
      • In Python everything is an object.
        • Numbers
        • Functions
        • Modules
      • In Python all objects have attributes
        • The dir() function lists the attributes of any object
      • Remember objects also have types
        • Functions are of type function
        • Integers have type int, etc.
        • Use the type() function to get an object's type.
    • Example
      Create a module called firstExercise.py. Define the following functions and variables in the module:
      • A function called objectDocumention which takes one argument and returns the doc string of the argument.
      • A function called objectName which takes one argument and returns its __name__ attribute.
      • A function called multiply(a, b) which returns a × b. Try passing objects other than numbers.
      • A function called integerMultiply(a, b) which converts its arguments to integers before multiplying them. Hint: use the function int() to convert objects to integers. Try with mixed numbers and strings.
      Load the module from the interactive shell and test it.
    • Example
      Write a program (Python script) named madlib.py, which asks the user to enter a series of nouns, verbs, adjectives, adverbs, plural nouns, past tense verbs, etc., and then generates a paragraph which is syntactically correct but semantically ridiculous
    • Coercion
      Converting an object from one type to another is called coercion
          >>> x = 2
          >>> y = 3.
          >>> coerce(x, y)
          (2.0, 3.0)
      • However, not all objects can be coerced.
      • When performing numeric operations the object with the smaller type is converted to the larger type.
      • When using and or or the left-hand operand is converted to a bool.
      • The standard coercion functions for the types we have seen so far are int(), float(), str(), bool()
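The conversion functions named on this slide can be tried directly. This is a sketch with illustrative variable names; it avoids coerce(), which exists in Python 2 but was removed in Python 3, so the snippet should run on either version.

```python
# Explicit conversions with the functions listed on the slide.
as_int = int("42")          # str   -> int
as_float = float(3)         # int   -> float
as_text = str(3.5)          # float -> str
mixed = 40 + 3.14           # the int 40 is converted up to a float first

# Not every object can be converted: this string is not a number.
try:
    int("forty-two")
    failed = False
except ValueError:
    failed = True

print(as_int)
print(as_float)
print(as_text)
print(failed)
```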
    • Bool conversions
      • Any non-zero number or non-empty object converts to True
      • A zero number or an empty object is False.
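The two rules can be checked by passing a handful of values through bool(). The sample values are chosen for illustration only.

```python
# Zero numbers and empty objects are false; everything else here is true.
samples = [0, 0.0, "", [], {}, 1, -2.5, "False", [0]]
truth = [bool(s) for s in samples]

for value, flag in zip(samples, truth):
    print(repr(value) + " -> " + str(flag))
```

Note that the string "False" and the list [0] are both non-empty, so they convert to True.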
    • Operator overloading
      • Operators that perform different actions depending on the types of the operands are said to be overloaded
      • *
        • Multiplies when the operands are both numbers
        • Replicates when one is a number and the other a string
      • +
        • Adds when the operands are both numbers
        • Concatenates when the operands are both strings.
      • Many operators in Python are overloaded.
      • Notice that when the operands do not support the operator Python raises an error. There is no point in checking yourself.
      • Also, when the operator's meaning is ambiguous an error is raised: using + with a string and a number could mean addition or concatenation.
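A short sketch of the overloaded * and + operators, including the ambiguous string-plus-number case that raises an error. Variable names are illustrative.

```python
product = 3 * 4             # both numbers: multiplication
repeated = 3 * "ab"         # number and string: replication
total = 1 + 2               # both numbers: addition
joined = "ab" + "cd"        # both strings: concatenation

# A string plus a number is ambiguous, so Python raises TypeError.
try:
    "ab" + 1
    raised = False
except TypeError:
    raised = True

print(product)
print(repeated)
print(total)
print(joined)
print(raised)
```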
    • Some other terminology
      • Assigning an object to a name, e.g. a = 3, firstFunction = secondFunction, is often called binding.
      • Changing what a name refers to is called rebinding.
        • a = 3 binds the name a to the object 3
        • a = "aString" rebinds the name a to the object "aString"
    • Strings
      • A string is an ordered collection of characters.
      • They are immutable, i.e. they cannot be changed.
      • You can create strings using
        • Double quotes - ""
        • Single quotes - ''
        • Triple quotes ''' ''' - i.e. doc strings.
      • Double and single quotes are the same
      • Triple quotes create block strings which can span multiple lines.
          hello = "This is a rather long string containing\n\
          several lines of text just as you would do in C.\n\
              Note that whitespace at the beginning of the line is\
           significant."
          print hello
    • Basic String Operations
      • We've already seen * (replicate) and + (concatenate)
      • Since strings are ordered collections of characters we can access their components by indexing
      • The first character in the string has position 0.
      • The position of the last character is equal to the number of characters in the string - 1.
      • [] is the index operator
          aString = "Genial"
          aString[1]
      • You can also index from the end using negative numbers
          aString[-1]
        This is the position (number of characters in the string) - 1: "Genial" has length 6, so "Genial"[-1] is position 6 - 1 = 5 ("l")
    • Slicing
      • Slicing takes specified parts of a string and creates a new string.
      • [start:end] takes from position start up to but not including position end
          aString[1:3]
      • If start is blank, i.e. [:end], it means from the first position
      • If end is blank, i.e. [start:], it means go to the last position
      • Extended slicing [start:end:step]
        • [1:10:2] - Get the characters from position 1 up to but not including position 10, taking steps of 2.
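The indexing and slicing rules can be tried on the "Genial" string used on the previous slide. The result names are illustrative.

```python
aString = "Genial"               # positions: G=0 e=1 n=2 i=3 a=4 l=5

last = aString[-1]               # same as aString[len(aString) - 1]
middle = aString[1:3]            # positions 1 and 2; position 3 is excluded
head = aString[:3]               # blank start: from the first position
tail = aString[3:]               # blank end: through the last position
stepped = aString[1:10:2]        # positions 1, 3, 5; an end past the string is allowed

print(last)
print(middle)
print(head)
print(tail)
print(stepped)
```

Each slice is a new string; aString itself is unchanged, which is consistent with strings being immutable.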
    • String examples
          >>> word = 'Help' + 'A'
          >>> word
          'HelpA'
          >>> '<' + word*5 + '>'
          '<HelpAHelpAHelpAHelpAHelpA>'

          >>> 'str' 'ing'                # <- This is ok
          'string'
          >>> 'str'.strip() + 'ing'      # <- This is ok
          'string'
          >>> 'str'.strip() 'ing'        # <- This is invalid
            File "<stdin>", line 1, in ?
              'str'.strip() 'ing'
                                ^
          SyntaxError: invalid syntax
    • String examples
          >>> word[0] = 'x'
          Traceback (most recent call last):
            File "<stdin>", line 1, in ?
          TypeError: object does not support item assignment
          >>> word[-2:]    # The last two characters
          'pA'
          >>> word[-100:]
          'HelpA'
          >>> word[-10]    # error
          Traceback (most recent call last):
            File "<stdin>", line 1, in ?
          IndexError: string index out of range

           +---+---+---+---+---+
           | H | e | l | p | A |
           +---+---+---+---+---+
           0   1   2   3   4   5
          -5  -4  -3  -2  -1
    • String examples
          string1 = "A, B, C, D, E, F"
          print "String is:", string1
          print "Split string by spaces:", string1.split()
          print "Split string by commas:", string1.split(",")
          print "Split string by commas, max 2:", string1.split(",",
          print
      Removing leading and/or trailing characters in a string:
          string1 = "\t \n This is a test string. \t\t \n"
          print 'Original string: "%s"\n' % string1
          print 'Using strip: "%s"\n' % string1.strip()
          print 'Using left strip: "%s"\n' % string1.lstrip()
          print 'Using right strip: "%s"\n' % string1.rstrip()
    • Lists
      • Lists contain ordered collections of any type of object: numbers, strings, other lists.
      • List properties:
        • Mutable
          • Can change the object at any position
          • Can add and remove items from a list (more later)
        • Heterogeneous
          • Can contain a mixture of data
      • Creating a list
          myList = []
          myList = [3, 4, "Jordi"]
          myList = ["aString", [3, 4, "Jordi"]]
    • List Operations
      • A list, like a string, is a sequence. All the operators that work on strings work on lists (more overloading)
        • * (replication)
        • + (concatenation)
        • [] (indexing)
        • [:] (slicing)
      • In addition a list is mutable: you can assign to list positions
        • Index assignment: myList[3] = "Hello"
        • Slice assignment: myList[0:3] = [0, 1] (Two steps: Deletion - the slice on the left is deleted; Insertion - the list on the right is inserted in its place.)
    • List Operations
      • Trying to access a position that does not exist in a sequence is an error
      • The function len() returns the number of items in a sequence.
      • There are two more sequence operators
        • x in sequence evaluates as True if the object x is in the sequence or False if it is not, e.g. 3 in [1,2,3], "J" in "Jordi"
        • x not in sequence, the opposite of in.
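The list operations from the last two slides can be exercised together in one short sketch. The starting values of myList are made up for the example.

```python
myList = [10, 20, 30, 40, 50]

myList[3] = "Hello"              # index assignment -> [10, 20, 30, 'Hello', 50]
myList[0:3] = [0, 1]             # slice assignment: three items out, two in

length = len(myList)             # number of items now in the list
found = "Hello" in myList        # membership test
absent = 30 not in myList        # 30 was removed by the slice assignment

# Accessing a position that does not exist raises IndexError.
try:
    myList[10]
    out_of_range = False
except IndexError:
    out_of_range = True

print(myList)
print(length)
```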
    • Examples with lists
          >>> q = [2, 3]
          >>> p = [1, q, 4]
          >>> len(p)
          3
          >>> p[1]
          [2, 3]
          >>> p[1][0]
          2
          >>> p[1].append('xtra')
          >>> p
          [1, [2, 3, 'xtra'], 4]
          >>> q
          [2, 3, 'xtra']
    • Shallow vs Deep list copy
      Shallow copy: b = a[:] creates a new list object whose items refer to the same elements as a
          a = ['one', 'two', 'three']
          b = a[:]
          b[1] = 2
          print id(a), a   # Output: 1077248300 ['one', 'two', 'three']
          print id(b), b   # Output: 1077248908 ['one', 2, 'three']
      Assignment: b = a copies only the object reference, so a and b are two names for the same list. This is not a copy at all; a true deep copy, which also copies nested objects, is made with copy.deepcopy(a)
          a = ['one', 'two', 'three']
          b = a
          b[1] = 2
          print id(a), a   # Output: 1077248300 ['one', 2, 'three']
          print id(b), b   # Output: 1077248300 ['one', 2, 'three']
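The difference between plain assignment, a shallow slice copy and a deep copy only shows up clearly with a nested list. The sketch below uses the standard-library copy module; the list contents and variable names are illustrative.

```python
import copy

a = ['one', ['two', 'three']]

alias = a                    # no copy: a second name for the same list
shallow = a[:]               # new outer list, but the inner list is shared
deep = copy.deepcopy(a)      # new outer list and new inner list

a[0] = 1                     # rebinding a position in the outer list
a[1].append('four')          # mutating the shared inner list

print(alias)      # follows every change made through a
print(shallow)    # keeps 'one', but sees 'four' in the shared inner list
print(deep)       # unaffected by both changes
```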
    • The del statement
          >>> a = [-1, 1, 66.25, 333, 333, 1234.5]
          >>> del a[0]
          >>> a
          [1, 66.25, 333, 333, 1234.5]
          >>> del a[2:4]
          >>> a
          [1, 66.25, 1234.5]
          >>> del a[:]
          >>> a
          []
    • if statement
          if test1:
              <statements1>
          elif test2:
              <statements2>
          else:
              <statements3>
      • All code that belongs to the if statement must be indented (there are no braces etc.)
      • test1 and test2 can be any Python expressions that evaluate to a boolean, i.e. True or False
    • Example if statement
          >>> x = int(raw_input("Please enter an integer: "))
          Please enter an integer: 42
          >>> if x < 0:
          ...     x = 0
          ...     print 'Negative changed to zero'
          ... elif x == 0:
          ...     print 'Zero'
          ... elif x == 1:
          ...     print 'Single'
          ... else:
          ...     print 'More'
          ...
          More
    • While loops
          while test:
              <statements>
      Repeatedly executes <statements> as long as test is true
      Example:
          >>> # Fibonacci series:
          ... # the sum of two elements defines the next
          ... a, b = 0, 1
          >>> while b < 10:
          ...     print b
          ...     a, b = b, a+b
          ...
          1
          1
          2
          3
          5
          8
    • for loop
          for <target> in <object>:
              <statements>
      • When Python runs this loop it assigns the elements in <object>, one by one, to the variable <target>
      • Remember <target> is only a reference to an item in the sequence. Rebinding <target> does not change the item in the sequence.
      • To change the elements of a list, loop over its positions with the range() function and assign to each index.
      Example
      Try changing the characters of "Peter" to "Roman" by different methods (use while, for, ...)
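The following sketch contrasts rebinding the loop target, which leaves the list alone, with assigning by index via range(). The list values and the name unchanged are illustrative.

```python
values = [1, 2, 3]

# Rebinding the loop target only changes what the name item refers to.
for item in values:
    item = item * 2
unchanged = list(values)         # snapshot: still [1, 2, 3]

# Assigning through an index changes the list itself.
for i in range(len(values)):
    values[i] = values[i] * 2

print(unchanged)
print(values)
```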
    • for loop examples
          >>> # Measure some strings:
          ... a = ['cat', 'window', 'defenestrate']
          >>> for x in a:
          ...     print x, len(x)
          ...
          cat 3
          window 6
          defenestrate 12

          >>> for x in a[:]:  # make a slice copy of the entire list
          ...     if len(x) > 6: a.insert(0, x)
          ...
          >>> a
          ['defenestrate', 'cat', 'window', 'defenestrate']
      a.insert(len(a), x) is equivalent to a.append(x)
    • for loop examples
          >>> a = ['Mary', 'had', 'a', 'little', 'lamb']
          >>> for i in range(len(a)):
          ...     print i, a[i]
          ...
          0 Mary
          1 had
          2 a
          3 little
          4 lamb
    • Loop statements
      • break: jumps out of the innermost loop. Use when you want a loop to end immediately due to some condition being reached
      • continue: jumps to the top of the innermost loop. Use when you don't want to execute any more code for this iteration
      • pass: for empty loops
      • else block: executed if a loop was not exited due to a break statement
    • Some examples
          >>> for n in range(2, 10):
          ...     for x in range(2, n):
          ...         if n % x == 0:
          ...             print n, 'equals', x, '*', n/x
          ...             break
          ...     else:
          ...         # loop fell through without finding a factor
          ...         print n, 'is a prime number'
          ...
          2 is a prime number
          3 is a prime number
          4 equals 2 * 2
          5 is a prime number
          6 equals 2 * 3
          7 is a prime number
          8 equals 2 * 4
          9 equals 3 * 3

          >>> while True:
          ...     pass  # Busy-wait for keyboard interrupt (Ctrl+C)
    • List comprehensions
          >>> li = [1, 9, 8, 4]
          >>> [elem*2 for elem in li]
          [2, 18, 16, 8]
          >>> li
          [1, 9, 8, 4]
          >>> li = [elem*2 for elem in li]
          >>> li
          [2, 18, 16, 8]
      Look at it from right to left: li is the list you're mapping
          >>> params = {"server":"mpilgrim", "database":"master", "u
          >>> ["%s=%s" % (k, v) for k, v in params.items()]
          ['server=mpilgrim', 'uid=sa', 'database=master', 'pwd=secr
          >>> ";".join(["%s=%s" % (k, v) for k, v in params.items()]
          'server=mpilgrim;uid=sa;database=master;pwd=secret'
    • Examples of list comprehensions
          >>> vec1 = [2, 4, 6]
          >>> vec2 = [4, 3, -9]
          >>> [x*y for x in vec1 for y in vec2]
          [8, 6, -18, 16, 12, -36, 24, 18, -54]
          >>> [x+y for x in vec1 for y in vec2]
          [6, 5, -7, 8, 7, -5, 10, 9, -3]
          >>> [vec1[i]*vec2[i] for i in range(len(vec1))]
          [8, 12, -54]
          >>> [str(round(355/113.0, i)) for i in range(1, 6)]
          ['3.1', '3.14', '3.142', '3.1416', '3.14159']
    • Files
      • The file object in Python represents a file that you can read from and write to
      • Unlike the objects seen so far, file objects do not support operators such as +, *, [] etc.
      • Creation
          myFile = open(location)
      • Some methods
        • read()
        • readline()
        • readlines()
        • write()
        • writelines()
        • close()
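A minimal round trip through the methods listed above might look like the sketch below. It writes to a temporary directory created with the standard-library tempfile module so that no existing file is touched; the file name and contents are made up for the example.

```python
import os
import tempfile

# Work in a fresh temporary directory so nothing is overwritten.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "test.txt")

handle = open(path, "w")                     # open for writing
handle.write("first line\n")
handle.writelines(["second line\n", "third line\n"])
handle.close()

handle = open(path)                          # default mode is reading
first = handle.readline()                    # one line, newline included
rest = handle.readlines()                    # the remaining lines as a list
handle.close()

# Clean up the temporary file and directory.
os.remove(path)
os.rmdir(folder)

print(first)
print(rest)
```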
    • File manipulation examples
      http://docs.python.org/library/stdtypes.html?highlight=tell#file.tell
      http://docs.python.org/tutorial/inputoutput.html
          fileHandle = open('test.txt', 'w')
          fileHandle.write('Testing files in Python.\neasily')
          fileHandle.close()
          fileHandle = open('test.txt', 'a')
          fileHandle.write('\n\n\nBottom line.')
          fileHandle.close()
          fileHandle = open('test.txt')
          print fileHandle.read()
          fileHandle.close()
    • File manipulation examples
          fileHandle = open('test.txt')
          print fileHandle.readline()
          print fileHandle.tell()    # position within the file
          print fileHandle.readline()
          fileHandle = open('test.txt')
          print fileHandle.read(1)
          fileHandle.seek(4)
          print fileHandle.read(1)
          fileHandle = open('testBinary.txt', 'wb')
          fileHandle.write('There is no spoon.')
          fileHandle.close()
          fileHandle = open('testBinary.txt', 'rb')
          print fileHandle.read()
          fileHandle.close()
    • More sophisticated file manipulation
      http://docs.python.org/library/glob.html
          import os, glob, shutil
          file_ext = raw_input("Extension for the files:\n")
          file_count = raw_input("Files count in each new dir:\n")
          file_count = int(file_count)
          dir_base_name = raw_input("name base for dirs:\n")
          filenames = glob.glob(('*.' + file_ext))
          filenames.sort()
          dir_number = 0
          while filenames:
              dir_number += 1
              new_dir = dir_base_name + str(dir_number)
              os.mkdir(new_dir)
              for n in range(min(file_count, len(filenames))):
                  src_file = filenames.pop(0)
                  shutil.copy(src_file, new_dir)
                  os.unlink(src_file)
    • Methods
      • We have seen that everything in Python is an object and that all objects have attributes. Attributes can have different types, e.g. string, int, function.
      • Another type of attribute an object can have is called a method.
      • An object's methods are special functions that operate on the object itself.
        • Invoked with object.method(): the method does something with object.
      • Some objects, like modules, have no methods; others, e.g. functions and numbers, have methods that are very rarely used.
      • Lists and strings have many very commonly used methods.
    • Example: String methods
      • Here are some string methods:
        • capitalize
        • count
        • find
        • index
        • split
      • Some methods take arguments, others don't.
      • Check http://docs.python.org/lib/string-methods.html for a full description of the string methods.
      • Check http://docs.python.org/lib/typesseq-mutable.html for a description of list methods.
    • Object attributes
      • We have seen that objects can have many attributes and that all attributes are objects. (Remember dir().)
      • Generally an object's attributes are divided into two types:
        • Callable: they can perform some action and return a result (functions, methods).
        • Not callable: everything else (strings, lists, numbers etc.).
      • You can check if an object is callable using the callable() function.
      • Another useful function is getattr().
        • getattr() returns an attribute of an object if you know its name as a string.
      >>> li = ["Larry", "Curly"]
      >>> getattr(li, "pop")
      <built-in method pop of list object at 010DF884>
      >>> value = obj.attribute
      >>> value = getattr(obj, "attribute")
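To make the callable / not callable distinction concrete, here is a small sketch in Python 3 syntax that combines dir(), getattr() and callable(). The helper name split_attributes is hypothetical and not part of the slides.

```python
def split_attributes(obj):
    """Return (callable_names, other_names) for the public attributes of obj."""
    callable_names, other_names = [], []
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and special attributes
        attribute = getattr(obj, name)  # look the attribute up by its name
        if callable(attribute):
            callable_names.append(name)
        else:
            other_names.append(name)
    return callable_names, other_names


# A complex number has both kinds of public attribute.
methods, data = split_attributes(3 + 4j)
print(methods)
print(data)

li = ["Larry", "Curly"]
pop = getattr(li, "pop")  # the same bound method as li.pop
print(pop())              # removes and returns the last item of li
```

Because getattr() takes the attribute name as a string, the name can be chosen at run time, for example from user input or a configuration file.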
    • Augmented assignment
      • Based on C
      • Shorthand for writing common expressions
        • Traditional: X = X + Y
        • Augmented: X += Y
        • X *= Y, X -= Y, X /= Y etc.
      • Less typing
      • Automatically chooses optimal method
        • L = L + [3,4]
        • L.extend([3,4])
        • L += [3,4] (automatically chooses extend)
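The difference between L = L + [3,4] and L += [3,4] becomes visible when a second name refers to the same list. A short sketch in Python 3 syntax, with variable names chosen only for illustration:

```python
original = [1, 2]
alias = original            # both names refer to the same list object

original += [3, 4]          # in place: behaves like original.extend([3, 4])
print(alias)                # the alias sees the two new items

original = original + [5]   # builds a new list and rebinds the name
print(alias)                # the alias still refers to the old list
print(original)
```

For lists, the in-place form avoids copying the whole list, which is why it is described as the optimal method.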
    • String formatting
      • % is the format operator.
      • You place a string to the left of the operator, with conversion targets embedded in it.
      • A conversion target is a % followed by a letter. The letter indicates the conversion to be performed.
      • On the right of the format operator you place, in parentheses, one object for each conversion target in the string.
      • Python inserts each object into the string, the first at the first conversion target etc., performing the necessary conversion first.
      "Name %s. Age %d" % ("Joe", 52)
    • Extended formatting
      • Since all basic objects in Python have a string description, usually %s is all that's needed.
      • However, with numbers more control is often required:
        • %d, %e, %E, %f
      • Extended formatting syntax:
        • %[flags][width][.precision]code
      • Flags:
        • -  left justify
        • +  add plus for positive numbers
        • 0  pad with zeros
      • Width is the minimum width of the converted text (shorter results are padded).
      • .precision is the number of places after the decimal point.
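The %[flags][width][.precision]code syntax is easiest to understand by trying a few combinations. The sketch below (Python 3 syntax; the sample values are arbitrary) formats the same numbers with different flags, widths and precisions. The | character only marks where the padded field ends.

```python
# Each pair is (format string, value to convert).
examples = [
    ("%d", 42),           # plain integer
    ("%5d|", 42),         # width 5: padded on the left with spaces
    ("%-5d|", 42),        # '-' flag: left justified
    ("%+d", 42),          # '+' flag: explicit sign
    ("%05d", 42),         # '0' flag: padded with zeros
    ("%.2f", 3.14159),    # two digits after the decimal point
    ("%8.3f|", 3.14159),  # width 8, three decimals
    ("%e", 12345.678),    # exponent notation
]

results = [fmt % value for fmt, value in examples]
for (fmt, value), text in zip(examples, results):
    print("%-8s -> %r" % (fmt, text))
```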
    • String formatting vs. concatenating
      >>> uid = "sa"
      >>> pwd = "secret"
      >>> print pwd + " is not a good password for " + uid
      secret is not a good password for sa
      >>> print "%s is not a good password for %s" % (pwd, uid)
      secret is not a good password for sa
      >>> userCount = 6
      >>> print "Users connected: %d" % (userCount, )
      Users connected: 6
      >>> print "Users connected: " + userCount
      Traceback (innermost last):
        File "<interactive input>", line 1, in ?
      TypeError: cannot concatenate 'str' and 'int' objects
      See also http://docs.python.org/tutorial/inputoutput.html
    • Tuples
      • A tuple is an immutable list with almost no methods (only count() and index())
      • Ordered collection of arbitrary objects
      • Creation
        • () e.g. (3, "Name")
        • , e.g. 3, "Name" (not advisable)
        • A tuple with a single element is a special case: (40,) requires a trailing comma
      • Can be operated on by all the immutable sequence operators
        • *, +, [], [:], in
      • Accessed by position starting from 0
      • Use len() to get the length of a tuple
      • Note that only the tuple is immutable. Mutable objects in a tuple are still mutable.
      • Tuples provide integrity (when one needs to be sure that something cannot be changed)
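The last two points (the tuple itself is immutable, but mutable objects inside it are not frozen) and the single-element special case can be checked directly. A minimal sketch in Python 3 syntax, with made-up values:

```python
record = (3, "Name", [1, 2])

try:
    record[0] = 4              # tuples do not support item assignment
except TypeError as error:
    print("TypeError:", error)

record[2].append(3)            # the list inside the tuple is still mutable
print(record)

single = (40,)                 # the trailing comma makes a one-element tuple
not_a_tuple = (40)             # just the integer 40 in parentheses
print(type(single).__name__, type(not_a_tuple).__name__)
```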
    • Using tuples to assign values
      >>> v = ('a', 'b', 'e')
      >>> (x, y, z) = v
      >>> x
      'a'
      >>> y
      'b'
      >>> z
      'e'
      v is a tuple of three elements, and (x, y, z) is a tuple of three variables.
    • Sequence conversion
      • Like int(), float() etc., there are functions for converting objects to lists and tuples:
        • list()
        • tuple()
      • These functions can only coerce objects that are also sequences, i.e. strings, lists, tuples
        • list(3) will not work
        • list("3") will work
    • Sequence functions
      • filter()
        • Filters the elements of a sequence based on a function and produces a new sequence
      • map()
        • Applies a function to every element of a sequence and returns a list of the results.
        • Can be used with multiple lists
      • reduce()
        • Applies a function to the items of a sequence from left to right to reduce the list to a single value.
        • Calls the function using the first two values of the sequence, then on the result and the third item, etc.
      • zip()
        • Takes any number of lists as arguments
        • Returns a list of tuples where the first contains the first element of each sequence, the second the second element of each, etc.
    • >>> foo = [2, 18, 9, 22, 17, 24, 8, 12, 27]
      >>>
      >>> print filter(lambda x: x % 3 == 0, foo)
      [18, 9, 24, 12, 27]
      >>>
      >>> print map(lambda x: x * 2 + 10, foo)
      [14, 46, 28, 54, 44, 58, 26, 34, 64]
      >>>
      >>> print reduce(lambda x, y: x + y, foo)
      139
    • Example of the use of filter
      >>> def odd(n):
      ...     return n%2
      ...
      >>> li = [1, 2, 3, 5, 9, 10, 256, -3]
      >>> filter(odd, li)
      [1, 3, 5, 9, -3]
      >>> filteredList = []
      >>> for n in li:
      ...     if odd(n):
      ...         filteredList.append(n)
      ...
      >>> filteredList
      [1, 3, 5, 9, -3]
      • odd returns 1 if n is odd and 0 if n is even.
      • filter takes two arguments, a function (odd) and a list (li). It loops through the list and calls odd for each element.
      • You could accomplish the same thing with a for loop, but at the cost of less compact code.
    • Example of the use of zip
      >>> mat = [
      ...     [1, 2, 3],
      ...     [4, 5, 6],
      ...     [7, 8, 9],
      ... ]
      >>> zip(*mat)
      [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

      names = ["Jesus", "Marc", "Michal", "Graham"]
      places = ["Spain", "USA", "Poland", "UK"]
      combo = zip(names, places)
      who = dict(combo)
    • Examples of the use of map
      >>> print map(lambda w: len(w),
      ...           'It is raining cats and dogs'.split())
      [2, 2, 7, 4, 3, 4]

      >>> map(f, sequence)
      >>> [f(x) for x in sequence]

      >>> map(f, sequence1, sequence2)
      >>> [f(x1, x2) for x1, x2 in zip(sequence1, sequence2)]
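The examples on these slides are written for Python 2, where filter(), map() and zip() return lists and reduce() is a builtin. In Python 3 the first three return lazy iterators and reduce() lives in the functools module, so a sketch of the same calculations in Python 3 syntax looks like this (the foo list is the one used on the earlier slide):

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

foo = [2, 18, 9, 22, 17, 24, 8, 12, 27]

# Wrap the iterators in list() to get the concrete lists shown on the slides.
multiples_of_three = list(filter(lambda x: x % 3 == 0, foo))
scaled = list(map(lambda x: x * 2 + 10, foo))
total = reduce(lambda x, y: x + y, foo)
pairs = list(zip("abc", [1, 2, 3]))

# The equivalent list comprehensions are often considered more readable.
same_multiples = [x for x in foo if x % 3 == 0]
same_scaled = [x * 2 + 10 for x in foo]

print(multiples_of_three)
print(scaled)
print(total)
print(pairs)
```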
    • Exercises
      • Example: Write code that computes the prime numbers up to 50 (hint: use the filter function)
      • Example: Write code that prints a value table (x, f(x)) for f(x) = sin(x) (hint: use the map function)
      • Example: Write code that calculates the geometric mean of a given list of values (hint: use the reduce function)
    • Dictionaries
      • Dictionaries are mappings
      • Unordered collection of objects (from Python 3.7 on, dictionaries preserve insertion order)
      • Access items via a key (case sensitive)
      • Equivalent to hashes in Perl
      • Very fast retrieval
      • Mutable
      • Creation
        • {} is an empty dictionary
        • {'age': 40, 'name': "unknown"}
    • Example dictionaries
      >>> d = {"server": "mpilgrim", "database": "master"}
      >>> d
      {'server': 'mpilgrim', 'database': 'master'}
      >>> d["server"]
      'mpilgrim'
      >>> d["database"]
      'master'
      >>> d["database"] = "pubs"
      >>> d
      {'server': 'mpilgrim', 'database': 'pubs'}
      >>> d["uid"] = "sa"
      >>> d
      {'server': 'mpilgrim', 'uid': 'sa', 'database': 'pubs'}
      >>> del d['uid']
      >>> d["mpilgrim"]
      Traceback (innermost last):
        File "<interactive input>", line 1, in ?
      KeyError: mpilgrim
    • Dictionary operations
      • Accessing
        • Dict[key]
        • len() returns the number of stored entries
      • Assignment
        • Dict[key] = object
      • Removal
        • del Dict[key]
        • The del statement can also be used with lists, attributes etc.
      • Construction
        • dict(zip(keys, values))
    • Dictionary methods
      • has_key()
      • keys()
      • values()
      • copy() . . .
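The methods listed above are the Python 2 ones. As a hedged sketch of how the same operations look in Python 3, where has_key() was removed in favour of the in operator and keys() returns a view instead of a list (the dictionary is the one from the earlier example slide):

```python
d = {"server": "mpilgrim", "database": "master"}

has_server = "server" in d        # replaces d.has_key("server")
uid = d.get("uid", "unknown")     # get() returns a default instead of raising KeyError
sorted_keys = sorted(d.keys())    # keys() is a view, so build a sorted list from it
backup = d.copy()                 # shallow copy, independent of later assignments

d["database"] = "pubs"            # changes d but not the copy
print(has_server, uid, sorted_keys)
print(backup["database"], d["database"])
```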
    • Note on function arguments
      >>> range(3, 6)        # normal call with separate arguments
      [3, 4, 5]
      >>> args = [3, 6]
      >>> range(*args)       # call with arguments unpacked
      [3, 4, 5]

      def cheeseshop(kind, *arguments, **keywords):
          print "-- Do you have any", kind, "?"
          print "-- I'm sorry, we're all out of", kind
          for arg in arguments: print arg
          print "-" * 40
          keys = keywords.keys()
          keys.sort()
          for kw in keys: print kw, ":", keywords[kw]

      cheeseshop("Limburger", "It's very runny, sir.",
                 "It's really very, VERY runny, sir.",
                 shopkeeper='Michael Palin',
                 client="John Cleese",
                 sketch="Cheese Shop Sketch")
    • Naming convention
      • Docstrings: http://www.python.org/dev/peps/pep-0257/, and general text: http://www.python.org/dev/peps/pep-0008/.
      • Function names should describe what the function does.
        • The more general the better, though there is a balance.
        • The name should be enough to give an idea of what it does.
        • General does not mean short! Use full words.
      • Argument names should be as general as possible.
        • object, aString, aFunction, comparisonFunction.
      • A variable name should describe what it is.
        • Use full words.
      • You should not use reserved words (see page 9).
      • Names beginning and ending with two underscores (__) are system-defined names and have a special meaning for the interpreter.
    • Finding substrings
      >>> dna = """ttcacctagtctaggacccactaatgcagatcctgtg
      tgtctagctaagatgtattatatctatattcactgggcttattgggccaa
      tgaaaatatgcaagaaaggaaaaaaaagatgtagacaaggaattctattt"""
      >>> E = 'gat'
      >>> dna.find(E)
      48
      >>> dna.index(E)
      48
      • Try looking for a non-existing substring with both methods
      • Example: Write a function that returns the list of codons for a DNA sequence and a given frame
    • A first look at regular expressions
      http://python.about.com/od/regularexpressions/a/regexprimer.htm
      http://docs.python.org/howto/regex.html#regex-howto
      >>> import re
      >>> m = re.search('(?<=abc)def', 'abcdef')
      >>> m.group(0)
      'def'
      >>> m = re.search('(?<=-)\w+', 'spam-egg')
      >>> m.group(0)
      'egg'
      >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
      >>> m.group(0)       # The entire match
      'Isaac Newton'
      >>> m.group(1)       # The first parenthesized subgroup.
      'Isaac'
      >>> m.group(2)       # The second parenthesized subgroup.
      'Newton'
      >>> m.group(1, 2)    # Multiple arguments give us a tuple.
      ('Isaac', 'Newton')
    • Writing regex
      • compile(): Compile a regular expression pattern into a regular expression object, which can be used for matching using its match() and search() methods
      • search(): Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance.
      • match(): If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance
      • split(): Split string by the occurrences of pattern
      http://docs.python.org/dev/howto/regex.html
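The four functions above differ mainly in where they look for the pattern. A small sketch in Python 3 syntax, using an arbitrary DNA string and the start codon ATG as the pattern (both chosen here only for illustration):

```python
import re

codon = re.compile(r"ATG")        # compile once, reuse the object many times
sequence = "CCATGGTTATGA"

found = codon.search(sequence)    # first match anywhere in the string
at_start = codon.match(sequence)  # only matches at the beginning of the string
pieces = codon.split(sequence)    # the text between the occurrences

print(found.start(), found.group())
print(at_start)                   # None, because the string starts with "CC"
print(pieces)
```

Both search() and match() return None when nothing matches, which is why their result is usually tested with if before calling group().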
    • Regular expressions
      A regular expression is a pattern that is searched for in a string. Unix shell wildcards, as in "rm *.*", are similar to regular expressions, but the syntax of regular expressions is more elaborate. Several Unix programs (grep, sed, awk, ed, vi, emacs) use regular expressions and many modern programming languages (such as Java) also support them. In Python, a regular expression is first compiled:
      keyword = re.compile(r"the")
      keyword.search(line)
      not keyword.search(line)
      keyword = re.compile(variable)
      keyword = re.compile(r"the", re.I)    # for case-insensitive search
    • re.finditer()
      import re
      import urllib2
      html = urllib2.urlopen('http://cbbl.imim.es').read()
      pattern = r'\b(the\s+\w+)\s+'
      regex = re.compile(pattern, re.IGNORECASE)
      for match in regex.finditer(html):
          print "%s: %s" % (match.start(), match.group(1))
    • Example: Given a string of A, C, T, and G, and X, find a string where X matches any single character, e.g., CATGG is contained in ACTGGGXXAXGGTTT.
      • Example: Write a regular expression to extract the coding sequence from a DNA string. It starts with the ATG codon and ends with a stop codon (TAA, TAG, or TGA).
    • Regular expressions
      >>> import re
      >>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
      ['foot', 'fell', 'fastest']
      >>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
      'cat in the hat'
      >>> 'tea for too'.replace('too', 'two')
      'tea for two'
    • Regular expressions
      #!/usr/bin/env python
      import re

      # open a file
      file = open("alice.txt", "r")
      text = file.readlines()
      file.close()

      # compiling the regular expression:
      keyword = re.compile(r"the")

      # searching the file content line by line:
      for line in text:
          if keyword.search(line):
              print line,
    • Regular expressions
      #!/usr/bin/env python
      import re

      # open a file
      file = open("alice.txt", "r")
      text = file.readlines()
      file.close()

      # searching the file content line by line:
      keyword = re.compile(r"the")

      for line in text:
          result = keyword.search(line)
          if result:
              print result.group(), ":", line,

      http://docs.python.org/library/re.html
      http://www.amk.ca/python/howto/regex/
    • Write scripts that:
      • Example: Retrieve all lines from a given file that do not contain "the ". Retrieve all lines that contain "the " with lower or upper case letters (hint: use the ignore case option)
      • Example: Retrieve lines from a long sequence (e.g., CFTR) that contain a given codon, and then a given first and third letter for each triad
      http://www.upriss.org.uk/python/session7.html#chars
    • Example: Write a script that asks users for their name, address and phone number. Test each input for accuracy; for example, there should be no letters in a phone number, a phone number should have a certain length, an address should have a certain format, etc. Ask the user to repeat the input in case your script identifies it as incorrect.
    • Classes: Some defs
      • Namespace: mapping from names to objects. There is absolutely no relation between names in different namespaces (different local names in a function invocation, for example; that is why we prefix them with the name of the function).
      • Scope: textual region of a Python program where a namespace is directly accessible.
      • Attributes: anything you can call in the form object.attribute (data and methods).
      • Instance: objects created by instantiation of classes.
      http://docs.python.org/tutorial/classes.html
      http://pytut.infogami.com/node11-baseline.html
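To illustrate that the same name can live in several unrelated namespaces, the sketch below (Python 3 syntax) uses the name pi three times: in the math module, at module level, and as a local variable. The function name area is an assumption chosen for this illustration.

```python
import math

pi = 3                      # a name in this module's namespace


def area(radius):
    pi = 3.14               # a different, local name that exists only inside area()
    return pi * radius ** 2


print(area(1))              # uses the local pi
print(pi)                   # the module-level name is untouched
print(math.pi)              # reached through the module's namespace with a prefix
```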
    • Global vs local variables
      #!/usr/local/bin/python
      # http://www.wellho.net/resources/ex.php4?item=y105/locva
      # Variable scope

      first = 1

      def one():
          "Double a global variable, return it + 3."
          global first
          first *= 2
          result = first+3
          return result

      print one.__doc__
      print one()
      print one()
      print one()
      print "first now has the value", first
      print "result has the value", result    # NameError: result only exists inside one()
    • A first example of a class
      #!/usr/bin/python
      """house.py -- A house program.
      """
      class House(object):
          """Some stuff"""

      my_house = House()       # class instantiation
      my_house.number = 40     # data attribute
      my_house.rooms = 8
      my_house.garden = 1

      print "My house is number", my_house.number
      print "It has", my_house.rooms, "rooms"
      if my_house.garden:
          garden_text = "has"
      else:
          garden_text = "does not have"
      print "It", garden_text, "a garden"
    • A second example of a class
      #!/usr/bin/python
      """house2.py -- Another house.
      """

      class House(object):
          def __init__(self, number, rooms, garden):
              self.number = number
              self.rooms = rooms
              self.garden = garden

      my_house = House(20, 1, 0)

      print "My house is number", my_house.number
      print "It has", my_house.rooms, "rooms"
      if my_house.garden:
          garden_text = "has"
      else:
          garden_text = "does not have"
      print "It", garden_text, "a garden"
    • Adding methods
      #!/usr/bin/python
      """square.py -- Make some noise about a square.
      """

      class Square:
          def __init__(self, length, width):
              self.length = length
              self.width = width

          def area(self):
              return self.length * self.width

      my_square = Square(5, 2)
      print my_square.area()

      http://www.ibiblio.org/g2swap/byteofpython/read/oops.html
    • Some terminology
      • A class creates a new type where objects are instances of the class.
      • The 'functions' that are part of an object are called methods.
      • The fields and methods are called 'attributes'.
      • You can examine all the methods and attributes that are associated with an object using the dir command:
        print dir(some_obj)
      • Fields are of two types: they can belong to each instance/object of the class or they can belong to the class itself. They are called instance variables and class variables respectively.
      http://www.voidspace.org.uk/python/articles/OOP.shtml
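The distinction between instance variables and class variables can be seen in a few lines. This is a minimal sketch in Python 3 syntax; the class name Sequence and its attributes are hypothetical and were chosen only for this illustration.

```python
class Sequence(object):
    count = 0                        # class variable: shared by all instances

    def __init__(self, residues):
        self.residues = residues     # instance variable: one per object
        type(self).count += 1        # update the shared counter on the class

    def length(self):                # method: a function attached to the class
        return len(self.residues)


first = Sequence("ATGC")
second = Sequence("GGATCCA")

print(first.length(), second.length())  # each object has its own residues
print(Sequence.count, first.count)      # the counter is visible from both
print("length" in dir(first))           # dir() lists methods and fields alike
```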
    • Arrays and classes
      #!/usr/bin/python
      """person.py -- A person example.
      """
      class Person(object):
          def __init__(self, age, house_number):
              self.age = age
              self.house_number = house_number

      alex = []
      for i in range(5):
          obj = Person(i, i)
          alex.append(obj)

      print "Alex[3] age is", alex[3].age
      print

      for alexsub in alex:
          print "Age is", alexsub.age
          print "House number is", alexsub.house_number
    • Examples
      • Example: Write a simple program that reads from a CSV file containing a list of names, addresses, and ages and returns the name, address and age for a particular person upon request.
      • Example: Extend the above program to include e-mail addresses and phone numbers in the student's data. (Hint: http://www.upriss.org.uk/python/session13.html)
    • Syntax errors
      http://docs.python.org/tutorial/errors.html
      >>> while True print 'Hello world'
        File "<stdin>", line 1, in ?
          while True print 'Hello world'
                         ^
      SyntaxError: invalid syntax
    • Exceptions (errors detected during execution)
      >>> 10 * (1/0)
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      ZeroDivisionError: integer division or modulo by zero
      >>> 4 + spam*3
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      NameError: name 'spam' is not defined
      >>> '2' + 2
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      TypeError: cannot concatenate 'str' and 'int' objects
    • Handling exceptions
      #!/usr/bin/env python
      #
      # Program to read and print a file

      import sys

      try:
          file = open("alice.txt", "r")
      except IOError:
          print "Could not open file"
          sys.exit()

      text = file.readlines()
      file.close()

      for line in text:
          print line,
      print
    • Exceptions
      ... except (RuntimeError, TypeError, NameError):
      ...     pass

      >>> def this_fails():
      ...     x = 1/0
      ...
      >>> try:
      ...     this_fails()
      ... except ZeroDivisionError as detail:
      ...     print 'Handling run-time error:', detail
      ...
      Handling run-time error: integer division or modulo by zero
    • A useful case
      import getopt, sys

      def main():
          try:
              opts, args = getopt.getopt(sys.argv[1:], "ho:", ["help", "output="])
          except getopt.GetoptError:
              # print help information and exit:
              usage()
              sys.exit(2)
          output = None
          for o, a in opts:
              if o in ("-h", "--help"):
                  usage()
                  sys.exit()
              if o in ("-o", "--output"):
                  output = a
          # ...

      if __name__ == "__main__":
          main()
    • Exceptions
      >>> def divide(x, y):
      ...     try:
      ...         result = x / y
      ...     except ZeroDivisionError:
      ...         print "division by zero!"
      ...     else:
      ...         print "result is", result
      ...     finally:
      ...         print "executing finally clause"
      ...
      >>> divide(2, 1)
      result is 2
      executing finally clause
      >>> divide(2, 0)
      division by zero!
      executing finally clause
      >>> divide("2", "1")
      executing finally clause
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
    • User defined exceptions
      >>> class MyError(Exception):
      ...     def __init__(self, value):
      ...         self.value = value
      ...     def __str__(self):
      ...         return repr(self.value)
      ...
      >>> try:
      ...     raise MyError(2*2)
      ... except MyError as e:
      ...     print 'My exception occurred, value:', e.value
      ...
      My exception occurred, value: 4
      >>> raise MyError('oops!')
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      __main__.MyError: 'oops!'
    • User defined exceptions
      class Error(Exception):
          """Base class for exceptions in this module."""
          pass

      class InputError(Error):
          """Exception raised for errors in the input.

          Attributes:
              expr -- input expression in which the error occurred
              msg  -- explanation of the error
          """

          def __init__(self, expr, msg):
              self.expr = expr
              self.msg = msg
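The point of a shared base class is that callers can catch every error of a module with a single except clause. The following sketch (Python 3 syntax) repeats the two classes from the slide in shortened form and adds a hypothetical helper, parse_age, which is not part of the slides, to show how InputError might be raised and handled.

```python
class Error(Exception):
    """Base class for exceptions in this module."""


class InputError(Error):
    """Raised for errors in the input (expr: offending text, msg: explanation)."""

    def __init__(self, expr, msg):
        self.expr = expr
        self.msg = msg


def parse_age(text):
    """Hypothetical helper: convert text to an age or raise InputError."""
    if not text.isdigit():
        raise InputError(text, "age must be a whole number")
    return int(text)


messages = []
for raw in ["42", "forty"]:
    try:
        messages.append("age = %d" % parse_age(raw))
    except Error as error:   # catching the base class also catches InputError
        messages.append("bad input %r: %s" % (error.expr, error.msg))

print("\n".join(messages))
```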
    • BioPython
      Set of modules and packages for biology (sequence analysis, database access, parsers...).
      http://biopython.org/DIST/docs/tutorial/Tutorial.html
      http://biopython.org/DIST/docs/api/
    • Examples
      >>> from Bio.Seq import Seq
      >>> my_seq = Seq("AGTACACTGGT")
      >>> my_seq
      Seq('AGTACACTGGT', Alphabet())
      >>> print my_seq
      AGTACACTGGT
      >>> my_seq.alphabet
      Alphabet()
      >>> my_seq.complement()
      Seq('TCATGTGACCA', Alphabet())
      >>> my_seq.reverse_complement()
      Seq('ACCAGTGTACT', Alphabet())
    • A couple of simple exercises
      • Example: Search for CFTR nucleotide sequences in the NCBI server. Save the sequences as FASTA and GenBank. Using the SeqIO parser, extract the sequences from the files and print them on screen.
        http://biopython.org/DIST/docs/api/Bio.SeqIO-module.html#parse
      • Example: Download an alignment for the CFTR protein entries from PFAM (use the seed for ABC transporters). Using the AlignIO parser, extract the sequences from FASTA or Stockholm formatted files downloaded from PFAM.
        http://biopython.org/DIST/docs/api/Bio.AlignIO-module.html#parse
    • from Bio.Align.Generic import Alignment
      from Bio.Alphabet import IUPAC, Gapped
      alphabet = Gapped(IUPAC.unambiguous_dna)

      align1 = Alignment(alphabet)
      align1.add_sequence("Alpha", "ACTGCTAGCTAG")
      align1.add_sequence("Beta", "ACT-CTAGCTAG")
      align1.add_sequence("Gamma", "ACTGCTAGDTAG")

      align2 = Alignment(alphabet)
      align2.add_sequence("Delta", "GTCAGC-AG")
      align2.add_sequence("Epsilon", "GACAGCTAG")
      align2.add_sequence("Zeta", "GTCAGCTAG")

      my_alignments = [align1, align2]

      In newer Biopython versions, prefer MultipleSeqAlignment.
    • Converting between sequence alignment formats
      from Bio import AlignIO
      count = AlignIO.convert("PF05371_seed.sth", "stockholm",
                              "PF05371_seed.aln", "clustal")
      print "Converted %i alignments" % count

      from Bio import AlignIO
      alignments = AlignIO.parse(open("PF05371_seed.sth"), "stockholm")
      handle = open("PF05371_seed.aln", "w")
      count = AlignIO.write(alignments, handle, "clustal")
      handle.close()
      print "Converted %i alignments" % count

      from Bio import AlignIO
      alignment = AlignIO.read(open("PF05371_seed.sth"), "stockholm")
      print alignment.format("clustal")
    • Performing alignments
      BioPython provides wrappers for running command line tools. For example:
      >>> import os
      >>> import sys
      >>> import subprocess
      >>> from Bio.Align.Applications import ClustalwCommandline
      >>> help(ClustalwCommandline)
      >>> c_exe = r"/Applications/clustalw2"
      >>> assert os.path.isfile(c_exe), "ClustalW missing"
      >>> cl = ClustalwCommandline(c_exe, infile="cftr.fasta")
      >>> return_code = subprocess.call(str(cl),
      ...                               stdout=open(os.devnull, "w"),
      ...                               stderr=open(os.devnull, "w"),
      ...                               shell=(sys.platform != "win32"))
      http://docs.python.org/library/subprocess.html
      http://jimmyg.org/blog/2009/working-with-python-subprocess.html
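The pattern on this slide (build a command, hand it to subprocess, inspect the return code) can be tried without ClustalW installed. The following is a minimal sketch in Python 3 syntax, assuming the Python interpreter itself stands in for the external program; `run_quietly` is a hypothetical helper name, not part of BioPython.

```python
import subprocess
import sys


def run_quietly(args):
    """Run a command given as an argument list, discard its output, return the exit code."""
    # Passing a list avoids shell quoting issues; DEVNULL discards the output.
    return subprocess.call(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


# Stand-in for an external tool: the interpreter itself (an assumption for illustration).
ok = run_quietly([sys.executable, "-c", "print('aligned')"])
failed = run_quietly([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok, failed)
```

A return code of zero conventionally means success; any other value signals a failure that the calling script should handle.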
    • Working with streams and subprocesses
      import sys
      while 1:
          try:
              input = sys.stdin.readline()
              if input:
                  sys.stdout.write('Echo to stdout: %s' % input)
                  sys.stderr.write('Echo to stderr: %s' % input)
          except KeyboardInterrupt:
              sys.exit()

      >>> import subprocess
      >>> subprocess.Popen('echo $PWD', shell=True)
      /home/james/Desktop

      >>> subprocess.Popen("""
      ... cat << EOF > new.txt
      ... Hello World!
      ... EOF
      ... """, shell=True)
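The echo loop and the Popen calls on this slide can be combined: a parent process can feed a child's standard input and collect what the child writes to stdout and stderr. This is a sketch in Python 3 syntax, assuming a small inline script plays the role of the child program; `ECHO_SCRIPT` and `echo_through_child` are hypothetical names.

```python
import subprocess
import sys

# Child program: echoes each input line to stdout and to stderr.
ECHO_SCRIPT = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('Echo to stdout: %s' % line)\n"
    "    sys.stderr.write('Echo to stderr: %s' % line)\n"
)


def echo_through_child(text):
    """Send text to the child's stdin; return (stdout, stderr) as strings."""
    child = subprocess.Popen(
        [sys.executable, "-c", ECHO_SCRIPT],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange str instead of bytes
    )
    # communicate() writes the input, closes stdin, and waits for the child.
    return child.communicate(text)


out, err = echo_through_child("hello\n")
print(out, end="")
```

Using communicate() rather than reading and writing the pipes by hand avoids deadlocks when the child produces a lot of output.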
    • Dealing with PDB files
      http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc133
      See also [Fufezan and Specht, 2009]
    • PDB parsing example
      >>> from Bio.PDB.PDBParser import PDBParser
      >>> parser = PDBParser()
      >>> structure = parser.get_structure("test", "1WQ1.pdb")
      >>> structure.get_list()
      [<Model id=0>]
      >>> model = structure[0]
      >>> model.get_list()
      [<Chain id=R>, <Chain id=G>]
      >>> chain = model["R"]
      >>> chain.get_list()
      [<Residue MET het= resseq=1 icode= >, <Residue THR het=
      resseq=2 icode= >, <Residue GLU het= resseq=3 icode= >, <
      resseq=4 icode= >, <Residue LYS het= resseq=5 icode= >
    • Retrieving a PDB file
      >>> from Bio.PDB import PDBList
      >>> pdbl = PDBList()
      >>> pdbl.retrieve_pdb_file('5P21')
      retrieving ftp://ftp.wwpdb.org/pub/pdb/data/structures/div
      '/Users/jordivilla/merda/p2/pdb5p21.ent'
      http://www.biopython.org/DIST/docs/cookbook/biopdb_faq.pdf
      or:
      import urllib
      def fetch_pdb(id):
          url = 'http://www.rcsb.org/pdb/files/%s.pdb' % id
          return urllib.urlopen(url).read()
    • Plotting with Python
      Matplotlib is the reference tool for plotting 2D data in Python. IPython has a "pylab" mode designed for interacting with matplotlib.
      http://wiki.python.org/moin/NumericAndScientific/Plotting
      http://bmi.bmt.tue.nl/~philbers/8C080/matplotlibtutorial.html
      >>> from pylab import randn, hist
      >>> x = randn(10000)
      >>> hist(x, 100)
      The pylab mode offers interaction similar to Matlab.
      http://matplotlib.sourceforge.net/
      Check also http://gnuplot-py.sourceforge.net/
    • pyplot
      http://www.scipy.org/PyLab
      import matplotlib.pyplot as plt
      plt.plot([1, 2, 3])
      plt.ylabel('some numbers')
      plt.show()

      import matplotlib.pyplot as plt
      plt.plot([1, 2, 3, 4], [1, 4, 9, 16], 'ro')
      plt.axis([0, 6, 0, 20])
    • RPy
      http://rpy.sourceforge.net/
      http://www.daimi.au.dk/~besen/TBiB2007/lecture-notes/rpy.html
      http://rpy.sourceforge.net/rpy2/doc-2.1/html/index.html
      >>> from rpy import *
      >>>
      >>> degrees = 4
      >>> grid = r.seq(0, 10, length=100)
      >>> values = [r.dchisq(x, degrees) for x in grid]
      >>> r.par(ann=0)
      >>> r.plot(grid, values, type='lines')
    • Working with numpy arrays
      import numpy as np
      import matplotlib.pyplot as plt
      # evenly sampled time at 200ms intervals
      t = np.arange(0., 5., 0.2)
      # red dashes, blue squares and green triangles
      plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
    • Even before talking about CGI
      import urllib
      fwcURL = "http://cbbl.imim.es"
      try:
          print "Going to Web for data"
          fwcall = urllib.urlopen(fwcURL).read()
          print "Successful"
          print "Will now print all of the data to screen"
          print "fwcall = ", fwcall
      except IOError:
          # urllib.urlopen reports network failures as IOError
          print "Could not obtain data from Web"
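The same fetch-with-fallback idea can be written against the Python 3 standard library, where the module is urllib.request and failures surface as urllib.error.URLError. This is a sketch, not the course's code: `fetch_text` is a hypothetical helper, and the demonstration uses file:// URLs in a temporary directory so that it runs offline.

```python
import pathlib
import tempfile
import urllib.error
import urllib.request


def fetch_text(url):
    """Return the decoded body at url, or None if it cannot be retrieved."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError as err:
        # URLError covers unreachable hosts and missing local files.
        print("Could not obtain data:", err.reason)
        return None


# Offline demonstration with file:// URLs (an assumption for illustration).
with tempfile.TemporaryDirectory() as tmp:
    page = pathlib.Path(tmp) / "page.html"
    page.write_text("<h1>cbbl</h1>", encoding="utf-8")
    found = fetch_text(page.as_uri())
    missing = fetch_text((pathlib.Path(tmp) / "absent.html").as_uri())
```

Catching a specific exception class keeps unrelated errors, such as a typo in a variable name, from being silently reported as a network problem.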
    • Even before talking about CGI
      >>> import urllib2
      >>> for line in urllib2.urlopen('http://tycho.usno.navy.mi
      ...     if 'EST' in line or 'EDT' in line:  # look for Eas
      ...         print line
      <BR>Nov. 25, 09:43:32 PM EST

      >>> import smtplib
      >>> server = smtplib.SMTP('localhost')
      >>> server.sendmail('soothsayer@example.org', 'jcaesar@exa
      ... """To: jcaesar@example.org
      ... From: soothsayer@example.org
      ...
      ... Beware the Ides of March.
      ... """)
      >>> server.quit()
    • #!/usr/bin/env python
      import cgi
      print "Content-Type: text/html\n"
      print """<HTML>
      <HEAD>
      <TITLE>Hello World</TITLE>
      </HEAD>
      <BODY>
      <H1>Greetings</H1>
      </BODY>
      </HTML>
      """
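A CGI response is a header block, a blank line, and then the body; the script above produces exactly that. The sketch below shows the same structure in Python 3 syntax using only urllib.parse and html, since the cgi module was removed from the standard library in Python 3.13. The `build_response` function and the `name` form field are hypothetical, chosen for illustration.

```python
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Return a complete CGI response (header, blank line, HTML body)."""
    params = parse_qs(query_string)
    # "name" is a hypothetical form field; default to "World" when absent.
    name = params.get("name", ["World"])[0]
    body = (
        "<html><head><title>Hello</title></head>"
        "<body><h1>Greetings, %s</h1></body></html>" % html.escape(name)
    )
    # The blank line after the header separates it from the body.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # A web server running the script via CGI sets QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")))
```

Escaping the user-supplied value with html.escape keeps form input from being interpreted as markup in the generated page.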
    • Interface design
      1 Encapsulation
      2 Generalization
      3 Interface design
      4 Refactoring
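The four steps listed on this slide can be followed on a small sequence-counting task. This is an illustrative sketch in Python 3 syntax; the function names `gc_count` and `count_bases` are hypothetical and not taken from the course material.

```python
# Step 0: loose statements that count G and C in one particular sequence.
seq = "ACTGCTAGCTAG"
gc = 0
for base in seq:
    if base in "GC":
        gc += 1


# Encapsulation: wrap the statements in a function.
def gc_count(seq):
    gc = 0
    for base in seq:
        if base in "GC":
            gc += 1
    return gc


# Generalization and interface design: the bases become a parameter,
# and a docstring documents the arguments and the return value.
def count_bases(seq, bases="GC"):
    """Return how many characters of seq belong to bases.

    seq:   sequence string, e.g. "ACTG"
    bases: characters to count (default "GC")
    """
    return sum(1 for base in seq.upper() if base in bases)


# Refactoring: gc_count now reuses the general function.
def gc_count(seq):
    """Return the number of G and C bases in seq."""
    return count_bases(seq, "GC")
```

After refactoring, callers of gc_count are unaffected, while new questions (for example counting A and T) need no new loop.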
    • Extending/embedding Python
      Python provides bindings to other languages, which makes it suitable for building large, mixed-language projects. Check http://docs.python.org/extending/index.html for general information. Check also http://pyobjc.sourceforge.net/ for a bridge between Python and Objective-C, needed for example when building framework-based software.
    • Glossary I
      problem solving: The process of formulating a problem, finding a solution, and expressing the solution.
      high-level language: A programming language like Python that is designed to be easy for humans to read and write.
      low-level language: A programming language that is designed to be easy for a computer to execute; also called "machine language" or "assembly language".
      portability: A property of a program that can run on more than one kind of computer.
      interpret: To execute a program in a high-level language by translating it one line at a time.
      compile: To translate a program written in a high-level language into a low-level language all at once, in preparation for later execution.
      source code: A program in a high-level language before being compiled.
      object code: The output of the compiler after it translates the program.
      executable: Another name for object code that is ready to be executed.
      prompt: Characters displayed by the interpreter to indicate that it is ready to take input from the user.
      script: A program stored in a file (usually one that will be interpreted).
      program: A set of instructions that specifies a computation.
      algorithm: A general process for solving a category of problems.
      bug: An error in a program.
      debugging: The process of finding and removing any of the three kinds of programming errors.
      syntax: The structure of a program.
      syntax error: An error in a program that makes it impossible to parse (and therefore impossible to interpret).
    • Glossary II
      exception: An error that is detected while the program is running.
      semantics: The meaning of a program.
      semantic error: An error in a program that makes it do something other than what the programmer intended.
      natural language: Any one of the languages that people speak that evolved naturally.
      formal language: Any one of the languages that people have designed for specific purposes, such as representing mathematical ideas or computer programs; all programming languages are formal languages.
      token: One of the basic elements of the syntactic structure of a program, analogous to a word in a natural language.
      parse: To examine a program and analyze the syntactic structure.
      print statement: An instruction that causes the Python interpreter to display a value on the screen.
      instance: A member of a set.
      loop: A part of a program that can execute repeatedly.
      encapsulation: The process of transforming a sequence of statements into a function definition.
      generalization: The process of replacing something unnecessarily specific (like a number) with something appropriately general (like a variable or parameter).
      interface: A description of how to use a function, including the name and descriptions of the arguments and return value.
      development plan: A process for writing programs.
      docstring: A string that appears in a function definition to document the function's interface.
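The glossary distinguishes three kinds of programming errors: syntax errors, exceptions, and semantic errors. The following sketch, in Python 3 syntax, triggers one of each; the function names `mean_wrong` and `mean` are hypothetical, chosen only to illustrate the definitions.

```python
# 1. Syntax error: the source cannot be parsed, so nothing runs.
try:
    compile("age = = 3", "<example>", "exec")
except SyntaxError as err:
    syntax_problem = type(err).__name__

# 2. Exception: the code parses, but fails while running.
try:
    ratio = 1 / 0
except ZeroDivisionError as err:
    runtime_problem = type(err).__name__


# 3. Semantic error: runs without complaint but computes the wrong thing.
def mean_wrong(values):
    return sum(values) / 2  # bug: divides by 2 instead of len(values)


def mean(values):
    return sum(values) / len(values)


print(syntax_problem, runtime_problem, mean_wrong([1, 2, 3]), mean([1, 2, 3]))
```

Only the first two kinds are reported by the interpreter; a semantic error has to be found by checking results against what was intended.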
    • This document's history
      1 2007: Original version by Michael A. Johnston
      2 2008: Modifications and examples added by JVF
      3 2010-: LaTeX2e version and extensions by JVF
    • Sources
      Style guide for Python code: http://www.python.org/dev/peps/pep-0008/
      Library: http://docs.python.org/library/
      http://www.thinkpython.com
      http://diveintopython.org/toc/index.html
      http://docs.python.org/tutorial/introduction.html
      http://openbookproject.net/thinkcs/python/english2e/
      http://www.penzilla.net/tutorials/python/
      http://www.awaretek.com/tutorials.html
      http://www.rexx.com/~dkuhlman/
      http://code.google.com/edu/languages/google-python-class/
    • Fufezan, C. and Specht, M. (2009). p3d – Python module for structural bioinformatics. BMC Bioinformatics, 10:258.