A dynamic object-oriented programming language.
Python is a programming language that lets you
work more quickly and integrate your systems more
effectively. You can learn to use Python and see
almost immediate gains in productivity and lower maintenance costs.
Who is using Python?
spider and search engine
Yahoo Maps, Yahoo Groups
Python Success Stories
– Star Wars! : http://www.youtube.com/watch?v=RqhUz2vh6lA
My experience with Python
– Web automation testing : MaxQ
– Crawling web news for vertical search : BeautifulSoup, lxml, mechanize
– Text/file processing – hadoop?
– scrapy (twisted) vs. gcrawler (gevent)
What can Python do?
• XML processing
• Web Application
• Off-line computation
• Operation scripts
• NLP
– has strong numeric processing capability : matrix
– Suitable for probability and machine learning code.
– NLTK : Natural Language Toolkit
• data analysis
• machine learning
• Big data : R: http://www.xmind.net/m/LKF2/
Python has a simple, minimal, clean syntax
Find the roots of a quadratic equation
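
The code for this slide did not survive, so the following is only a minimal sketch of what such an example might look like. The function name quadratic_roots is an assumption made here, and cmath is used so that a negative discriminant yields complex roots instead of an error.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a must be non-zero)."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    # cmath.sqrt accepts a negative discriminant and returns a complex number.
    root_of_discriminant = cmath.sqrt(b * b - 4.0 * a * c)
    return ((-b + root_of_discriminant) / (2.0 * a),
            (-b - root_of_discriminant) / (2.0 * a))


# x**2 - 3x + 2 = (x - 1)(x - 2), so the roots are 2 and 1.
roots = quadratic_roots(1, -3, 2)
print(roots)
```

The roots always come back as complex numbers; when the discriminant is non-negative their imaginary part is zero.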
Easy to get started!
• <<Dive into Python>> : http://www.diveintopython.net/toc/index.html
• Python standard libraries: http://docs.python.org/2/library/index.html
• PyPI : http://pypi.python.org/pypi
– There are currently 33961 packages
• PyCon : http://www.pycon.org/
• Practice, practice, practice
Getting started and Installation
• Windows : find the install package here
• Linux : Python generally comes installed with the operating
system. If not, try
– CentOS/Red Hat : yum install python
– Ubuntu : sudo apt-get install python2.7
– From source : wget http://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
(extract the archive, then ./configure && make && make install)
Using the Python interpreter
• """ (docstring)
• Variables are created when they are assigned.
The name is case sensitive.
• if __name__ == "__main__":
– __name__ is a built-in variable which evaluates to the name of the
current module.
– Is the module being run directly or being imported?
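
A small sketch of the points on this slide, assuming the code is saved in a file such as demo.py (the file name and the helper describe_run_mode are hypothetical, introduced only for illustration):

```python
"""Demo module: this triple-quoted string is the module docstring."""


def describe_run_mode(module_name):
    """Say whether a module with this __name__ is run directly or imported."""
    if module_name == "__main__":
        return "run directly"
    return "imported as " + module_name


# Variables are created on first assignment; names are case sensitive.
count = 1
Count = 2  # a different variable from `count`

if __name__ == "__main__":
    # This branch runs for `python demo.py`, but not for `import demo`.
    print(describe_run_mode(__name__))
    print("count=%d Count=%d" % (count, Count))
```

Running the file directly makes __name__ equal to "__main__"; importing it from another module sets __name__ to the module's own name, so the guarded block is skipped.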
• A module is a file containing Python definitions and
statements. The file name is the module name with the
suffix .py appended. Within a module, the module’s name is
available as the value of the global variable __name__.
– A module can contain executable statements as well as function
definitions. These statements are intended to initialize the module.
They are executed only the first time the module name is encountered
in an import statement.
– Each module has its own private symbol table, which is used as the
global symbol table by all functions defined in the module.
– When a module named spam is imported, the interpreter first
searches for a built-in module with that name. If not found, it then
searches for a file named spam.py in a list of directories given by the
variable sys.path.
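
The module behaviour described above can be observed with a throwaway module. This is a sketch only: the module name spam_demo, its contents, and the helper import_demo_module are made up for the demonstration, and the module is written to a temporary directory that is put on sys.path just long enough for the import.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Hypothetical module source: one executable statement plus one function.
SPAM_SOURCE = (
    "print('initializing ' + __name__)\n"
    "def greet():\n"
    "    return 'hello from ' + __name__\n"
)


def import_demo_module(module_name="spam_demo"):
    """Write module_name.py to a temp directory and import it twice."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, module_name + ".py"), "w") as handle:
        handle.write(SPAM_SOURCE)
    # The interpreter looks for the file in the directories on sys.path.
    sys.path.insert(0, directory)
    try:
        first = importlib.import_module(module_name)   # runs the module body
        second = importlib.import_module(module_name)  # reuses the loaded module
    finally:
        sys.path.remove(directory)
        shutil.rmtree(directory)
    return first, second


first, second = import_demo_module()
print(first is second)
print(first.greet())
```

The "initializing" line is printed only once, because the second import finds the module already loaded in sys.modules and does not execute its statements again.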
• Packages are a way of structuring Python's module
namespace by using "dotted module names".
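• The __init__.py files are required to make Python treat the
directories as containing packages; this is done to prevent
directories with a common name, such as string, from
unintentionally hiding valid modules that occur later on the
module search path.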
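
A minimal sketch of a package with a dotted module name. The names demo_pkg, echo, and build_demo_package are hypothetical and exist only for this illustration; the package is created in a temporary directory. Note that Python 3.3 and later can also import a directory without __init__.py as a namespace package, so the requirement above applies strictly to the Python 2 series used in these slides.

```python
import importlib
import os
import shutil
import sys
import tempfile


def build_demo_package(root, package="demo_pkg", module="echo"):
    """Create root/package/__init__.py and root/package/module.py."""
    package_dir = os.path.join(root, package)
    os.mkdir(package_dir)
    # An empty __init__.py is enough to mark the directory as a package.
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(package_dir, module + ".py"), "w") as handle:
        handle.write("def repeat(text):\n    return text + text\n")
    return package + "." + module  # the dotted module name


root = tempfile.mkdtemp()
sys.path.insert(0, root)
try:
    dotted_name = build_demo_package(root)
    echo = importlib.import_module(dotted_name)
finally:
    sys.path.remove(root)
    shutil.rmtree(root)

print(echo.__name__)
print(echo.repeat("ab"))
```

Importing demo_pkg.echo first imports the package demo_pkg (running its __init__.py) and then the submodule, so both end up in sys.modules.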
What can you do with Excel?
• 1. Read/write a plain CSV file
• 2. Use the csv module to do it
• 3. Search PyPI for Excel packages
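
As a sketch of option 2, the standard csv module can write and read data in a form Excel opens directly. The helper names rows_to_csv and csv_to_rows and the sample rows are assumptions for illustration, and the code is written for Python 3 (on Python 2.7 the csv module expects files opened in binary mode instead).

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows (sequences of values) to CSV text."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse CSV text back into a list of rows (every cell is a string)."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Cells containing commas or quotes are quoted automatically by the writer.
rows = [["name", "note"], ["Alice", "likes a,b"], ["Bob", 'said "hi"']]
text = rows_to_csv(rows)
print(text)
print(csv_to_rows(text) == rows)
```

Letting the csv module handle quoting is the main reason to prefer it over splitting lines on commas by hand.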