This is a document that attempts to teach the Python language. It is not a replacement for the official Python tutorial at http://docs.python.org/tutorial/index.html but adopts a more example-driven approach. This tutorial is peppered with exercises and practical sessions. I recommend that you try out the exercises by yourself even if they seem hard at first. After all, the only way to learn to program is to program.
The tutorial assumes some basic familiarity with programming in general and makes some slight references to C and C++ to illustrate some points.
It assumes that you have the Python programming language installed on your machine. You can obtain it from the official Python site (please select the appropriate format for your platform).
Emphasis is done like so. Language literals are typeset in a monospace format. Screen transcripts and ascii graphics are typeset inside a separate indented box using a monospace font. Comments inside these boxen are italicised and typeset using a slightly lighter colour. Exercises are typeset in dark grey boxen so that they stand out.
After a few initial sections, we will use code snippets instead of screen transcripts so you won't see the interpreter prompts.
Python is a very high level multi-paradigm programming language which emphasises programmer productivity and code readability.
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. - B.W. Kernighan
Fire off the interpreter from your command line. The command used is `python`.

```
sanctuary% python2.6
Python 2.6+ (r26:66714, Oct 22 2008, 09:25:02)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
```

The Python interactive interpreter can be used as a quick calculator. Parentheses can be used to alter default operator precedences.

```
>>> 2+3
5
>>> 0.5*5
2.5
>>> 2+3*5
17
>>> (2+3)*5
25
```
You can also use variables like you do in most other languages. There's no need to declare them before you use them. Variables don't have any sigils (like in Perl) or static type declarations (like in C).

```
>>> x=5
>>> x+3
8
>>> x/2.0
2.5
```
Trying to access the value of a variable before it is assigned one will cause the interpreter to print an error.
```
>>> t=t+1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 't' is not defined
```
So far, we've only seen numbers. Python can also deal with strings in a fashion quite similar to numbers. The results are quite intuitive.

```
>>> c1 = "Coimbatore"
>>> c1 + " Bangalore"
'Coimbatore Bangalore'
>>> c1*3
'CoimbatoreCoimbatoreCoimbatore'
>>> c2 = "Bangalore"
>>> c1 + c2
'CoimbatoreBangalore'
```
You can reassign a variable without worrying about its type.

```
>>> x=2
>>> x+2
4
>>> x="Bangalore"
>>> x+", Karnataka"
'Bangalore, Karnataka'
```
Python's `if` statement is similar to those in most other languages. It allows us to make a branching decision.

```
>>> x = 10
>>> if x<10:
...     print "x is less than 10"
... elif x>10:
...     print "x is greater than 10"
... else:
...     print "x is 10"
...
x is 10
```
Unlike many other contemporary languages, Python is not format free and relies on indentation to group statements together. So a block of statements (eg. the body of a function) which you would demarcate using `{` and `}` in C would be grouped together in Python by indenting them all by the same amount. In the example above, the indentation of the print statements makes them part of the body of the `if` and the `else` parts. The `elif` is a shorthand for `else if` which would be unnecessarily verbose.
It should also be understood that parts of the code which are not visited (because of a conditional) might contain runtime errors which are uncaught.
```
x = 2
if x==2:
    print x
else:
    print y
```

```
x = 2
if x==2:
    print x
else:
    x +
```
Just like a value can be associated with a name (eg. `x = 2`), it's also possible for a piece of logic to get tied to a name. Such an association is called a function.

Functions are similar to the mathematical entities by the same name. They take some inputs and return one or more outputs. They are defined using the `def` keyword.

The following code snippet creates a function that will return the square of its argument.

```
>>> def square(x):
...     rte = x*x
...     return rte
...
>>> square(5)
25
>>> x=square(7)
>>> print x
49
```
As you can see, the body of the function is indented so that it 'belongs' to the function. Python doesn't have braces.
Functions can take the place of literal expressions.
```
>>> print 1+square(7)
50
```
In Python, functions are first class objects and can be assigned and used like other objects. They can also be passed as arguments to other functions.
```
>>> other_name = square
>>> other_name(7)
49
```
Python function definitions are executable statements that may appear wherever it is legal for statements to appear. The functions will not get defined unless these statements are executed. As you can see in the example below, `foo` is not defined because the body of the `if` didn't execute.

```
>>> x = 2
>>> if x == 3:
...     def foo():
...         print "Hello"
...
>>> foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined
```
One line anonymous functions can be created using the `lambda` keyword.

```
>>> sq = lambda x:x*x
>>> sq(7)
49
```
Function arguments can either be positional or keyword. The former is similar to the C syntax and what we have been using so far (ie. the first argument goes to the first formal parameter etc.). The other way of doing it is to use keyword arguments. The following example illustrates this.

```
>>> def greet(greeting, person):
...     return "%s, %s"%(greeting,person)
...
>>> greet("Hello", "noufal")                      # Positional
'Hello, noufal'
>>> greet(greeting = "Hello", person = "Noufal")  # Keyword (same order as definition)
'Hello, Noufal'
>>> greet(person = "Noufal", greeting = "Hello")  # Keyword (different order)
'Hello, Noufal'
```
Functions can be written to accept default arguments. Consider the following example of a function called `power` which will raise its first argument to the power of the second, and to 2 if the second argument is unspecified.

```
>>> def power(n,pow=2):
...     return n**pow      # The ** operator is the exponentiation operator
...
>>> power(7)               # Returns 7**2
49
>>> power(7,3)             # Returns 7**3
343
```
Variables created inside a function have local scope. This means that you don't have to worry about the variables that exist outside a function when you're coding it.

```
>>> def average(a,b):      # Create a function that uses a local variable s
...     s = a+b
...     avg = s/2.0
...     return avg
...
>>> s = 10                 # Create a variable called s
>>> average(10,20)         # Call our function. s inside the function will be 30.
15.0
>>> s                      # But we still have it as 10
10
```
Variables are looked up first in the function local symbol table (locals) and if they're not found there, in the global symbol table (globals).
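A minimal sketch of this lookup order (the variable and function names here are purely illustrative):

```python
x = 10  # a global variable

def read_global():
    # x is never assigned inside this function, so the
    # lookup falls through from locals to globals
    return x + 1

def shadow_global():
    x = 99  # assignment creates a function local x; the global is untouched
    return x

print(read_global())    # 11
print(shadow_global())  # 99
print(x)                # still 10
```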
If you do want to refer to a variable that has been initialised outside the function, you need to use the `global` keyword to tell the interpreter so.

```
>>> def average(a,b):
...     global s           # Now we use the global s. Not a function local one
...     s=a+b
...     avg = s/2.0
...     return avg
...
>>> s=10
>>> average(10,20)
15.0
>>> s                      # The value has changed
30
```
It is possible (and recommended) to put a documentation string (or docstring for short) in the header of a function to describe what it does.

```
>>> def square(x):
...     "Returns x raised to the power of 2"
...     return x*x
...
>>> square(7)
49
>>> help(square)
Help on function square in module __main__:

square(x)
    Returns x raised to the power of 2
```
```
def f(x):
    return x + x

def g():
    return f(5)

g()      # ?

def f(x):
    return x * x

g()      # ?
```
```
def foo():
    def bar():
        print "Hello"

foo()
bar()
```
In order to facilitate reuse of code, it is possible to organise our functions into modules. These are nothing more than Python files. A large number of modules can be organised into a package. Here is an example
```
+--------------------------------------------------------------+
|  +--------------------+      +--------------------+          |
|  |                    |      |                    |          |
|  |  module:colour     |      |  module:brush      |          |
|  +--------------------+      +--------------------+          |
|  +--------------------+                                      |
|  |                    |                                      |
|  |  module:effects    |                                      |
|  +--------------------+                    package:graphics  |
+--------------------------------------------------------------+
```

The colour module would be referred to as graphics.colour.
Modules are loaded using the `import` keyword. A module is simply a Python file with a `.py` extension.
Form | Effect |
---|---|
import x | Loads x.py and allows access to its attributes via x |
from x import y | Loads x and puts x.y into the current namespace |
from x import * | Loads x and puts all its attributes into the current namespace |
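The three forms can be tried out with a standard library module such as `math` (used here purely as an illustration):

```python
import math                # access attributes via the module name
print(math.sqrt(16))       # 4.0

from math import sqrt      # puts only sqrt into the current namespace
print(sqrt(25))            # 5.0

from math import *         # puts all of math's attributes here; convenient
print(cos(0))              # 1.0 -- but it can clutter the namespace
```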
```
noufal@sanctuary% cat numeric.py
"""
This module contains numeric routines.

square : returns the square of the numeric argument
"""

def square(x):
    "Returns the square of the given number x"
    return x*x
```

```
>>> import numeric
>>> numeric.square(7)
49
>>> square(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'square' is not defined
```
A directory of modules can be made importable by putting an `__init__.py` file in the directory. This is a package.
```
.
+-- numeric
|   +-- __init__.py
|   +-- exp.py
```
```
>>> from numeric import exp
>>> exp.square(5)
25
```
Python provides many primitive data types.
From this section on, we will use direct code snippets instead of screen grabs of the interpreter sessions.
```
foo = "Hello"
foo = """This is
a paragraph of text
spread across
4 lines"""
```
Strings support the `+` operator for concatenation, the `*` operator for repetition, the `%` operator for substitution and the `[]` operator for slicing or indexing.
```
city = "coimbatore"
state = "tamil nadu"
city + "," + state      # String concatenation
'coimbatore,tamil nadu'
"-" * 50                # Quick way to create a divider line
'--------------------------------------------------'
"%s, %s"%(city,state)
'coimbatore, tamil nadu'
city[0]
'c'
city[1]
'o'
city[0:5]
'coimb'
city[-2]
'r'
```
```
 0      1      2      3      4      5      6
 +------+------+------+------+------+------+
 |      |      |      |      |      |      |
 |  P   |  Y   |  T   |  H   |  O   |  N   |
 +------+------+------+------+------+------+
-6     -5     -4     -3     -2     -1
```
String literals can take the `u` or `r` modifiers to indicate that they are unicode or raw respectively.
```
print "Tamil\nNadu"      # \n is interpreted as a newline
Tamil
Nadu
print r"Tamil\nNadu"     # \n is parsed as two separate characters.
Tamil\nNadu
print type("snake")      # type is a builtin that returns the type of an object
<type 'str'>
print type(u"snake")
<type 'unicode'>
```
Method | Use | Result |
---|---|---|
capitalize | "python".capitalize() | "Python" |
count | "quux".count('u') | 2 |
endswith | "quux".endswith("x") | True |
startswith | "quux".startswith("s") | False |
find | "trek".find('e') | 2 |
split | "graphics.canvas".split(".") | ["graphics","canvas"] |
strip | " Spaced out ".strip() | "Spaced out" |
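The entries in the table can be reproduced directly at the prompt:

```python
print("python".capitalize())          # Python
print("quux".count('u'))              # 2
print("quux".endswith("x"))           # True
print("quux".startswith("s"))         # False
print("trek".find('e'))               # 2
print("graphics.canvas".split("."))   # ['graphics', 'canvas']
print("  Spaced out  ".strip())       # Spaced out
```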
Lists are constructed and indexed using the `[` and `]` operators.

```
foo = ["Python", "Lisp", 0, 1.5]    # Construction
print foo[1]                        # Print the second element
Lisp
print foo[0:2]                      # Slicing (similar to strings)
['Python', 'Lisp']
print foo[0][0]                     # Print first character of the first element
P
```
Operation | Example | Result |
---|---|---|
+ | [1,2] + ["Hello"] | [1,2,"Hello"] |
* | [1,2]*3 | [1,2,1,2,1,2] |
append (in place, with x=[1,2,3]) | x.append(4) | [1,2,3,4] |
extend (in place, with x=[1,2,3]) | x.extend([4,5]) | [1,2,3,4,5] |
reverse (in place, with x=[1,2,3]) | x.reverse() | [3,2,1] |
sort (in place, with x=[3,2,1]) | x.sort() | [1,2,3] |
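Note that the methods marked 'in place' above mutate the list itself rather than building a new one:

```python
x = [1, 2, 3]
x.append(4)        # mutates x in place
x.extend([5, 6])
print(x)           # [1, 2, 3, 4, 5, 6]

x.reverse()
print(x)           # [6, 5, 4, 3, 2, 1]

x.sort()
print(x)           # [1, 2, 3, 4, 5, 6]

y = [1, 2] + ["Hello"]   # + on the other hand returns a brand new list
print(y)                 # [1, 2, 'Hello']
```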
```
foo = [3,2,1]
bar = foo.sort()
print bar
```
```
x = [1,2,3]
x[0] = x
```
Tuples are constructed using the `(` and `)` operators.

```
x=(1,2,3)
print x
(1, 2, 3)
```

A tuple can be unpacked into individual variables in a single statement.

```
a,b = (3,4)
print a
3
print b
4
```
Dictionaries map keys to values. They are constructed using the `{` and `}` operators and indexed using the `[]` operator.

```
foo = {'city' : 'Coimbatore', 'state' : 'Tamil Nadu', 'country' : 'India'}
foo['country']
'India'
foo.keys()
['city', 'state', 'country']
foo.items()
[('city', 'Coimbatore'), ('state', 'Tamil Nadu'), ('country', 'India')]
foo.values()
['Coimbatore', 'Tamil Nadu', 'India']
```
Python provides two looping constructs: one using `for`, commonly used for definite iteration, and the other using `while`, used for indefinite iteration.

```
for i in [1,2,3,4]:     # i is the loop variable
    print i
1
2
3
4
```

```
x = 0
while x<5:
    x = x + 1
    print x
1
2
3
4
5

while True:             # This is an infinite loop!
    x = x+1
```
The `break` keyword stops the loop and comes out of it. It's useful to terminate a loop prematurely. The `continue` keyword stops the current iteration and goes to the next one. The `pass` keyword is a do-nothing placeholder and is commonly used in loop and function stubs. It's similar to the `;` statement in C.
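A small sketch of all three keywords in action:

```python
# break: stop searching as soon as we find the first multiple of 7
found = None
for i in range(1, 100):
    if i % 7 == 0:
        found = i
        break
print(found)       # 7

# continue: skip the odd numbers and collect only the even ones
evens = []
for i in range(1, 10):
    if i % 2 == 1:
        continue
    evens.append(i)
print(evens)       # [2, 4, 6, 8]

# pass: a placeholder body that does nothing
def not_written_yet():
    pass
```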
```
for i in range(1,10):
    pass
print i
```
We skipped a useful bit of the language regarding functions in our earlier discussion because we didn't have knowledge of the primitives needed. We will cover it here.

Functions usually receive a fixed number of positional (and keyword) arguments. However, we can write functions that receive an arbitrary number of positional arguments using the `*` operator.
```
def sum(*items):
    acc = 0
    for i in items:
        acc += i
    return acc

sum(1,2,3,4,5,6)
21
sum(1,2,3)
6
```
The `items` variable will be a Python tuple that contains all the arguments.
Similarly, we can make a function that receives an arbitrary number of keyword arguments like so
```
def init(**params):
    print params.keys()

init(foo = 1, bar = 2)
['foo', 'bar']
```
As an exercise, write a function called `every` which will return `True` if all its arguments are `True` and `False` if not.
Output is done using the `print` keyword. It will output the string representation of its argument plus a newline to standard output. Input can be read using the `raw_input` function.
```
x = raw_input("Enter your name :")   # Will prompt the user for some input and block
Enter your name :Noufal
print "Hello %s"%x
Hello Noufal
```
Files are opened using the `file` (or `open`) constructor. It receives the name of the file followed by a mode specification string indicating whether the file is to be opened for reading, writing or appending and whether it is binary or textual. The file objects so created have `write`, `writelines`, `read` and `readlines` methods to put and get data.
```
f = open("/tmp/foo.txt","w")
f.write("This is a sample")
f.close()
print open("/tmp/foo.txt","r").read()
This is a sample
```
When errors occur, Python will `raise` an 'exception'. These can be caught and processed accordingly. Exceptions are of different kinds and will be discussed in detail later.

```
foo = {'name' : 'Noufal'}
try:
    print foo['age']
except KeyError:
    print 'No such key "age"'
No such key "age"
```
Python's typing system is dynamic but strong, and its approach to typing is called duck typing (from 'if it looks like a duck and talks like a duck, chances are that it's a duck').

This means that given an object, its semantics are determined by the interfaces it provides rather than any extra type information held by the object.
For example, if we have a function that doubles its argument like so

```
def double(x):
    return x*2
```

it will work perfectly for numbers. If we say something like `double(5)`, we will get back `10`. It will also work fine for strings. If we say `double("bam")`, we will get `bambam`. In a statically typed language like C, we would have to declare the type of `x` and thereby force the `double` function to accept only objects of the declared type. The upshot of this is that we would have to declare two functions, `double_int` and `double_string`.
In Python however, we don't care. If the object in question (the thing that `x` refers to) supports the `*` 'protocol', we just use it and don't really worry about the type. There's no need for a common abstract parent class which defines interfaces or anything of the sort.
As Alex Martelli (one of the senior Python programmers) said in a mailing list posting:

> In other words, don't check whether it IS-a duck: check whether it QUACKS-like-a duck, WALKS-like-a duck, etc, etc, depending on exactly what subset of duck-like behaviour you need to play your language-games with.

If `x` can be multiplied by a number, that's enough.
This simplifies a lot of details but moves responsibility to the
programmer. For example, there is an 'iteration' protocol (which
we'll discuss later) that the for
construct uses. This allows it
to iterate over anything that supports this protocol. A simple
example is shown below
```
for i in "python":     # Iteration over a string. Prints one character per line
    print i

for i in [1,2,3,4]:    # Iteration over a list. Prints one element per line
    print i

for i in (1,2,3,4):    # Iteration over a tuple. Prints one element per line
    print i

for i in {'language':'Python', 'creator':'Guido'}:   # Iteration over a dictionary. Prints one key per line
    print i

for i in 2:            # Integers don't support the iteration protocol.
    print i
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
```
As long as the objects we're dealing with support the iteration protocol, the looping construct will work fine. As you can see, it works for many of the primitive types but doesn't for integers.
The upshot of this is that in Python, we don't really check for types ('if this is an integer, do this, else do that'). We just go ahead and use that aspect of the object in question which we're interested in. If it doesn't support that, an exception gets raised and we deal with it appropriately. This provides an extremely high degree of flexibility. We can write 'generic' functions that transform their input without worrying about what the innards of the objects that we process are.
The downside is that since the type system is lazy and dynamic, the only way of catching errors upfront is to write detailed unit tests along with your program. The "if it compiles, it's good to go" philosophy will not work. We will discuss unit testing later.
Every language has its own stylistic trends. This section lists a bunch of them that are quite common. The list is not exhaustive but should give you an idea about how things are. A few small new language constructs are mentioned and we'll discuss them as we cover them.
```
foo = ["Perl", "C", "Ruby", "Python", "Java"]
for idx,lang in enumerate(foo):    # Iterating over a list of (index, element) tuples.
    if lang == "Python":
        print "Found at position %s"%idx
Found at position 3
```
```
names = ["Python", "Perl", "Java"]
leads = ["Guido", "Larry", "Gosling"]
for lang, lead in zip(names, leads):   # The zip function compresses n iterables into n-tuples.
    print "%10s | %10s"%(lang,lead)    # The string format specifiers have provisions for field widths.
    Python |      Guido
      Perl |      Larry
      Java |    Gosling
```
When a module is imported, the name of the module is available in a special variable called `__name__`. If the code is in the main entry point module, the `__name__` variable will contain the string `__main__`.
This allows us to write modules which can behave differently when run and when imported. The idiom is illustrated below
```
# This is the quux.py module
def main():
    # do_something
    pass

if __name__ == "__main__":
    main()
```
If this module is run from the command line, the `main` function will get called (since the value of `__name__` is `__main__`). If on the other hand it's imported, `__name__` will be `quux` and so the call to `main` will not occur.
It's common to include some tests with 'import only' modules that are run if you just run the module from the command line. A convenient way to test your modules.
The docstring for functions allows us to briefly describe the usage and purpose of a function. It's also possible to embed an example usage, cut and pasted from the interpreter prompt, into the docstring. Once this is done, Python has a standard module called `doctest` that allows us to 'test' these functions. A useful way of making sure that the examples in your docstrings are up to date.
```
# This is the foo.py module
def double(x):
    """
    Returns the double of its argument

    >>> double(5)
    10
    >>> double("Hello")
    'HelloHello'
    >>> double(3.5)
    7.0
    >>>
    """
    return x*2

if __name__ == "__main__":
    import doctest
    doctest.testmod()
```
```
noufal@sanctuary% python foo.py
noufal@sanctuary% # Edit the module to change 7.0 in the example to 7.1
noufal@sanctuary% python foo.py
**********************************************************************
File "foo.py", line 7, in __main__.double
Failed example:
    double(3.5)
Expected:
    7.1
Got:
    7.0
**********************************************************************
1 items had failures:
   1 of   3 in __main__.double
***Test Failed*** 1 failures.
```
While it's not possible to write doctests for all your functions, it's a useful habit to write them whenever you can. It's easy too since you just import your module, try your function and cut/paste the interpreter session into the docstring. This way you'll surely have some tests for your functions.
Empty strings, lists, tuples, dictionaries etc. evaluate to `False`. Hence, it's a common practice to use

```
if foo:
    process(foo)
```

rather than say

```
if len(foo) != 0:
    process(foo)
```
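A quick demonstration of which values are considered false:

```python
print(bool(""))     # False
print(bool([]))     # False
print(bool(()))     # False
print(bool({}))     # False
print(bool(0))      # False
print(bool("hi"))   # True
print(bool([0]))    # True; a non-empty list is truthy even if its elements aren't

foo = []
if not foo:
    print("foo is empty")
```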
While the pythonic way to iterate over a list is to use a `for` directly over it, sometimes we need the index of an element as well as the actual element. This can be done as follows

```
for i in range(len(foo)):   # range is a function that returns a list from 0 up to its argument
    print i, foo[i]
```

but it's more pythonic to say

```
for idx, i in enumerate(foo):
    print idx, i
```
It's syntactically valid but in bad form to say

```
if foo :
    print "Hello"
```

rather than

```
if foo:
    print "Hello"
```
It's possible to assign multiple variables from a list in a single neat shot.

```
foo = [1,2,3]
a,b,c = foo
a
1
b
2
c
3
```
An interesting side effect is to say

```
a,b = b,a
```

to swap two variables.
Suppose you're reading out strings from a file to create a single large string. The naive way of doing it would be like so.

```
strng = ''
for i in fp:    # fp is a file
    strng += i
print strng
```

Since strings are immutable, a new one is constructed for each addition operation. Instead, the pythonic way is to say

```
"".join(list(fp))   # Join the elements of the list together into a single string separated by ""
```
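The separator string can be anything; join puts it between the elements:

```python
lines = ["first\n", "second\n", "third\n"]
print("".join(lines))       # the three lines glued into one string in a single pass

print(",".join(["a", "b", "c"]))                  # a,b,c
print(" -> ".join(["start", "middle", "end"]))    # start -> middle -> end
```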
More of these snippets are available at David Goodger's Code like a Pythonista.
Functional programming is a programming paradigm which treats computation as the evaluation of mathematical functions. It emphasises immutable objects and the absence of mutable state.
The paradigm is well supported in Python. While not a panacea, functional programs are often more compact, understandable, faster and elegant than their procedural or object oriented counterparts.
While we won't delve into functional programming in depth, we will discuss some of the primitives Python provides that make this style of programming possible.

We have already encountered `lambda`, the keyword that allows us to create one liner functions.

The `map` function allows us to apply a function to every element of an iterable. Let's use our `double` function to double all elements of a list.
```
nat = [1,2,3,4,5]
map(double, nat)
[2, 4, 6, 8, 10]
```
As you can see, this is superior to the procedural method shown below in terms of concision
```
doubles = []
for i in nat:
    doubles.append(double(i))
print doubles
[2, 4, 6, 8, 10]
```
The `filter` function has a similar call signature as `map` but returns all elements of its input list which satisfy a certain predicate. For example, suppose we wanted a list of all even numbers from a list, we could do it as below.
```
foo = [10,13,20,19,152,1003]
filter(lambda x:x%2 == 0, foo)   # We create a quick 'even number checker' on the fly.
[10, 20, 152]
```
A combination which gives us the squares of all even numbers less than 20 is shown below
```
map(lambda x:x*x, filter(lambda x:x%2 == 0, range(1,20)))  # range(a,b) returns a list of numbers from a to b-1
[4, 16, 36, 64, 100, 144, 196, 256, 324]
```
The `reduce` function allows us to apply a function to the first two elements of a list, then to that result and the next element, and so on until the whole iterable is reduced to a single value. The following example finds the sum of the integers from 1 to 9.

```
reduce(lambda x,y:x+y, range(1,10), 0)
45
```
The first argument is the reduction function, the second argument is the iterable and the third is the initial value. Compare this to the procedural equivalent shown below.

```
acc = 0                  # Analogous to the third argument of reduce
for i in range(1,10):    # Analogous to the second argument of reduce
    acc += i             # Analogous to the first argument of reduce
print acc
45
```
Not only do we have an unnecessary variable `acc`, the whole thing is much larger. Incidentally, Python has a builtin function called `sum` which will compute the sum of its arguments.
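For example:

```python
print(sum([1, 2, 3, 4]))     # 10
print(sum(range(1, 10)))     # 45, the same result as the reduce example above
print(sum([1.5, 2.5], 10))   # 14.0; the optional second argument is the start value
```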
While these methods are fine, they can be a little hard to read if overused. Our example of squaring all even numbers was stretching it a little. Python being a language that emphasises readability has a neater notation for these things called list comprehensions.
The expressions are used to generate one list from another and mimic the mathematical set builder notation used for defining sets (set comprehensions).
A list comprehension to do our function of squaring all even numbers from 1 to 20 is shown below.
```
print [x*x for x in range(1,20) if x%2 == 0]
[4, 16, 36, 64, 100, 144, 196, 256, 324]
```
The general format is as follows (the `if` part at the end is optional). The transformation is the equivalent of `map` and the condition of `filter`.

```
[ transformation for var in iterable if condition ]
```
As you can see, this is much more readable than the `map`/`filter` combination.
These are very commonly used in Python and you should familiarise yourself with them.
Higher order functions are the programming language equivalent of the mathematical composition operation.

This is a typical functional scenario where the system is built from the bottom up (ie. create lots of small utilities and then compose them into larger structures and finally the application) as opposed to the top down method where the whole system is broken down into pieces which are subdivided till the components are small enough to be implemented.
Let us look at some examples.
Suppose we have a bunch of functions as follows which we use to compute 2x² for all even numbers between 1 and 10.
```
def double(x):
    "Doubles its argument"
    return x*2

def square(x):
    "Raises its argument to 2"
    return x*x

def evenp(x):
    "Returns True if x is even. False otherwise"
    return x%2 == 0

print [double(square(x)) for x in range(1,10) if evenp(x)]
[8, 32, 72, 128]
```
Suppose we want to count the number of times these functions have been called and put these numbers into a dictionary.
One naive way is to put a global dictionary into the program and then alter each of these functions to increment a count every time they are called. This is laborious since we have to manually modify every function which we want counted. When it's time for the counting logic to go, we have to manually remove the instrumentation.
Instead, we will define a function like so
```
fncounts = dict(double = 0, square = 0, evenp = 0)   # The dict constructor allows us to create dictionaries
                                                     # in a cleaner way than by using { and }.

def count(fn):
    def instrumented_fn(x):
        fncounts[fn.__name__] += 1   # function.__name__ is a special attribute that contains the
                                     # name of the function. This increments the corresponding
                                     # member of the dictionary.
        return fn(x)
    return instrumented_fn
```
And we instrument our functions like so
```
double = count(double)
square = count(square)
evenp = count(evenp)
```
Now when we're done with our loop, you can see what happens.
```
fncounts
{'evenp': 0, 'square': 0, 'double': 0}
print [double(square(x)) for x in range(1,10) if evenp(x)]
[8, 32, 72, 128]
fncounts
{'evenp': 9, 'square': 4, 'double': 4}
```
Similar to these are function closures which can loosely be defined as first class functions with free variables.
Consider a simple function to calculate fibonacci numbers. We want a function `fib(n)` which will calculate the nth fibonacci number.
```
def fib(n):
    "Returns the nth fibonacci number"
    assert(n>0)    # To make sure that we receive only numbers above 0
    if n == 1:
        return 0
    if n == 2:
        return 1
    return fib(n-2) + fib(n-1)
```
This is simple enough. Since it's a recursive definition, it would be interesting to know what a certain call to this function looks like. Let's write a function tracer.
```
def trace(fn):
    fn.indent = 0
    def traced_function(n):
        print "| "*fn.indent + "+-- %s(%s)"%(fn.__name__,n)
        fn.indent += 1
        ret = fn(n)
        print "| "*fn.indent + "+-- [%s]"%ret
        fn.indent -= 1
        return ret
    return traced_function
```
The function doesn't do much. It just keeps track of the nesting of function calls and draws a text graph of each call, its parameters and finally its return value. This gives us an idea of how deep the tree is.
Let us instrument our fibonacci function and use it to trace `fib(5)`.

```
fib = trace(fib)
fib(5)
+-- fib(5)
| +-- fib(3)
| | +-- fib(1)
| | | +-- [0]
| | +-- fib(2)
| | | +-- [1]
| | +-- [1]
| +-- fib(4)
| | +-- fib(2)
| | | +-- [1]
| | +-- fib(3)
| | | +-- fib(1)
| | | | +-- [0]
| | | +-- fib(2)
| | | | +-- [1]
| | | +-- [1]
| | +-- [2]
| +-- [3]
3
```
This is wasteful. You can see that fib(3) is being called twice (and its whole tree of descendants recomputed as well). We already know the value of fib(3) by the time the second call is made, so why can't we reuse it?
One way of doing this is to write a cached version of `fib`. Something like the version shown below.

```
cache = {}
def fib(n):
    "Returns the nth fibonacci number"
    if n in cache:
        return cache[n]
    assert(n>0)    # To make sure that we receive only numbers above 0
    if n == 1:
        return 0
    if n == 2:
        return 1
    ret = fib(n-2) + fib(n-1)
    cache[n] = ret
    return ret

fib=trace(fib)
fib(5)
+-- fib(5)
| +-- fib(3)
| | +-- fib(1)
| | | +-- [0]
| | +-- fib(2)
| | | +-- [1]
| | +-- [1]
| +-- fib(4)
| | +-- fib(2)
| | | +-- [1]
| | +-- fib(3)
| | | +-- [1]
| | +-- [2]
| +-- [3]
```
This solves our conundrum but we wrote a special purpose cached version of fib and introduced a global variable. It would be nicer if we could have a function that would create a cached version of anything we gave it.
Let's call that function `memoise`. Here it is.

```
def memoise(fn):
    cache = {}
    def memoised_fn(x):
        if x in cache:
            return cache[x]
        else:
            ret = fn(x)
            cache[x] = ret
            return ret
    memoised_fn.__name__ = fn.__name__   # To print the original function name in any trace output
    return memoised_fn
```
This is an example of a closure. The memoise function keeps some state encapsulated inside it (the `cache` variable). It's not visible globally but is shared across all invocations of `memoised_fn` so that we can stay away from recomputing values. The `memoised_fn` has a free variable (ie. `cache`).

It also has the added advantage that the cache persists between calls so that once we compute fib(n), we will get it instantly the next time.
With our original (uncached) functions, here is the call tree.

```
fib = trace(memoise(fib))
fib(5)
+-- fib(5)
| +-- fib(3)
| | +-- fib(1)
| | | +-- [0]
| | +-- fib(2)
| | | +-- [1]
| | +-- [1]
| +-- fib(4)
| | +-- fib(2)
| | | +-- [1]
| | +-- fib(3)
| | | +-- [1]
| | +-- [2]
| +-- [3]
```
You can see how we constructed a generic set of pieces to trace and memoise functions and used them to build up an efficient version of our original function which is unaware of all this being done to it.
This kind of usage is so common in Python that there is a special notation for it. After our 'function modifiers' are defined, we could have defined fib like so.
```
@trace
@memoise
def fib(n):
    "Returns the nth fibonacci number"
    assert(n>0)    # To make sure that we receive only numbers above 0
    if n == 1:
        return 0
    if n == 2:
        return 1
    return fib(n-2) + fib(n-1)
```
The `@` operation is called decoration and `trace` and `memoise` are called decorators. The above is equivalent to first defining fib and then using a `fib = trace(memoise(fib))` statement.

As an exercise, write a decorator which measures how long each call to a function takes; you can use the `datetime` module to measure time.
In Python you can unpack a list or a dictionary and provide them as positional or keyword arguments to a function.

In the following example, we have a function `greet` that expects 3 arguments viz. name, city and gender. It will, based on the inputs, generate an appropriate greeting message. Our inputs are read out from a file and we obtain them as strings of the form "name : city : gender".

Here is a program to create greetings for the people.

```
def greet(name, city, sex):
    """Returns a greeting designed for the person whose details are provided"""
    pronoun = dict(male = "him", female = "her")[sex]
    return "Presenting %s of %s! We welcome %s to our fair city."%(name.capitalize(), city.capitalize(), pronoun)

input = ["vladimir : moscow : male",
         "fathima : cairo : female",
         "john:London: male"]

for i in input:
    print greet(*[x.strip() for x in i.split(":")])
Presenting Vladimir of Moscow! We welcome him to our fair city.
Presenting Fathima of Cairo! We welcome her to our fair city.
Presenting John of London! We welcome him to our fair city.
```
The line `greet(*[x.strip() for x in i.split(":")])` splits the input string on the `:` character and then strips off the extra spaces which might be there. It then uses the `*` prefix to expand the list into positional arguments for the `greet` function.
The situation is similar for dictionaries, which can be unpacked into keyword arguments using the ** prefix. The important thing to understand here is that this mechanism makes it possible to dynamically construct argument lists for a function. When coupled with functions that can accept an arbitrary number of parameters, a lot of interesting possibilities arise.
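To make the two unpacking forms concrete, here is a small sketch (the describe function and its arguments are our own illustration, not part of the tutorial's examples):

```python
# Hypothetical helper to illustrate * and ** unpacking together.
def describe(name, city, gender, greeting="Hello"):
    return "%s, %s of %s (%s)" % (greeting, name, city, gender)

args = ["vladimir", "moscow"]                        # fills positional parameters
kwargs = {"gender": "male", "greeting": "Welcome"}   # fills keyword parameters

print(describe(*args, **kwargs))  # Welcome, vladimir of moscow (male)
```

The argument lists here are ordinary data structures, so they can be built up at runtime before the call is made.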
Most functions return a single value. Sometimes however, a function needs to return a series of values. This is often handled by running the function and creating a list which is then returned. This is okay but very often, we might not need the whole list. We might search for the first element that satisfies a condition and stop looking ahead after that. The computation of the whole list was unnecessary and wasteful.
A naive way to solve this would be to write a completely different version of the function which computes values one by one and returns when a certain condition is matched. This is not very nice since our conditions might change and the function shouldn't have to know about what we're doing with its return value.
Python solves this problem using the concept of generators which are functions that can be suspended in mid-execution and resumed later.
Imagine a log file which contains multiple lines of the following format.
time : user : number of MB transferred
Suppose I want to sum the amount of data transferred.
The first would be accomplished by a function like this
def calc_total(logfile):
    "Returns total data transferred"
    fp = open(logfile)
    total = 0
    for i in fp:
        date, user, amount = [x.strip() for x in i.split(":")]
        total += int(amount)
    fp.close()
    return total
Suppose I want to return the first person who transferred more than 1000 MB.
def get_leecher(logfile):
    "Return first person who transferred more than 1000 MB"
    fp = open(logfile)
    for i in fp:
        date, user, amount = [x.strip() for x in i.split(":")]
        if int(amount) > 1000:  # amount is a string and must be converted
            fp.close()
            return user
These two are special purpose and both of them need the list of
records in the file. Let's try to abstract that out with a
parse_log
function.
def parse_log(logfile):
    "Returns a list of records in the logfile as a (date, user, transfer) tuple"
    fp = open(logfile)
    records = []
    for i in fp:
        date, user, data = [x.strip() for x in i.split(":")]
        records.append([int(date), user, int(data)])
    return records
Our functions would then, instead of parsing the files themselves
say for i in parse_log(logfile):
and use the records directly.
Suppose our file had 1000000 records. The above function is fine for total counting since you need all the records anyway. For the second however, if our first leecher was the 250th person, it would have been a waste to find the rest of the people.
Instead of generating the whole list and returning it, we can alter parse_log to work as a generator instead of a regular function. This is done using the yield keyword instead of return. The function would look like this
def parse_log(logfile):
    "Yields the records in the logfile as (date, user, transfer) tuples"
    fp = open(logfile)
    for i in fp:
        date, user, data = [x.strip() for x in i.split(":")]
        yield [int(date), user, int(data)]
If we try to print the return value of the generator, the interpreter will tell us that it's a generator rather than a list. Generators have a .next method which returns the next value and raises a StopIteration exception when there are no more elements to produce. We shall use these contents in a file called foo.log to illustrate
100 : umar : 2345
120 : ali : 500
150 : zaid : 1024
170 : logan : 543
200 : scott : 213
records = parse_log("foo.log")
print records
<generator object parse_log at 0x14148c0>
sum([x[2] for x in records], 0)  # sum is a builtin function that
                                 # adds all the items in its first
                                 # argument and uses its second
                                 # argument as an initial value
4625
# The generator has now been 'used up' and needs to be reinitialised
# if we want to use it again.
records = parse_log("foo.log")
for date, name, data in records:  # Lines are read out from the file on demand
    if data > 1000:
        print name

Generators are very commonly used in Python and therefore, a shorthand way of creating them exists. A list comprehension bracketed using ( and ) instead of [ and ] creates a generator instead of a proper list. Our generator could therefore be written as follows. It's slightly different from our original version. Can you say how?
([x.strip() for x in y.split(":")] for y in open(logfile))
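To see the difference between the two bracket styles in isolation, compare a list comprehension with the equivalent generator expression (using a plain range rather than a log file; this toy example is our own):

```python
squares_list = [x * x for x in range(5)]  # built eagerly, all at once
squares_gen = (x * x for x in range(5))   # built lazily, on demand

print(squares_list)       # [0, 1, 4, 9, 16]
print(next(squares_gen))  # 0 -- one value produced on demand
print(list(squares_gen))  # [1, 4, 9, 16] -- consuming the rest
```

The generator expression does no work until something asks it for a value, which is exactly the behaviour we wanted from parse_log.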
Exercise: Write a generator cycle that, when given any iterable, will return values from the iterable over and over again forever. eg. If I say cycle([1,2,3]), it should return a generator that will produce values like this: 1,2,3,1,2,3,1,2,3...
Python fully supports object oriented programming but in a way that's a lot simpler than other languages like C++.
As in most cases, the simplest way to start is to take an example. Let us create a class to handle complex numbers.
class Complex(object):
    "Complex number class. Version 1"
    def __init__(self, real=0, imag=0):  # __init__ is the constructor method.
        self.real = float(real)
        self.imag = float(imag)
    def display(self):
        sign = self.imag > 0 and "+" or "-"
        print "%s %s %sj" % (self.real, sign, abs(self.imag))

t = Complex(3,4)
t.display()
3.0 + 4.0j
t = Complex(3,-2)
t.display()
3.0 - 2.0j
Small as it is, this example needs some explanation. Classes
are defined using the class
keyword. Objects are instances of
classes (eg. 3+4j is an object of the complex class). The
general syntax is as follows.
class classname (base class 0,base class 1,...):
We call our class Complex. The object inside the brackets asks Python to inherit this class from the basic object class so that the hierarchy is maintained. Skipping this will make your class an old style class, support for which has been dropped from Python 3.0 onwards. It's a legacy feature which you should not use anymore. Details of the differences between old and new style classes are beyond the scope of this document but those interested can visit http://www.python.org/doc/newstyle/.
Like functions, classes too can have a docstring.
Functions defined inside the class are class methods. In Python, unlike C++, all methods are public. Python eschews the need for a class to hide its innards and adopts a 'we are all consenting adults' outlook. While this may seem strange and dangerous to C++ or Java programmers, the gains this kind of construct gives Python (especially for introspection and documentation) are really great.
A point to note is that in Python, all class methods receive an extra first argument which holds the object through which the method was called. This is analogous to the this pointer in C++. This first argument is conventionally called self. This is not a language rule but an almost uniform convention and you would do well to abide by it. Accessing object level members is done through self. This 'extra' first argument is the only real difference between a class method and a regular function.
Special methods (eg. constructors etc.) have leading and trailing __. We will see other special methods later in this tutorial. __init__ refers to the constructor and it is the method called when an instance of the class is created.
The display method which we created allows us to output a printable version of the class so that we know what's happening.
This is all nice but it's quite useless to have just a class that can print itself. Let's make it do something useful like perform simple arithmetic (addition and subtraction).
class Complex(object):
    "Complex number class. Version 2 (with addition and subtraction)"
    def __init__(self, real=0, imag=0):  # __init__ is the constructor method.
        self.real = float(real)
        self.imag = float(imag)
    def display(self):
        sign = self.imag > 0 and "+" or "-"
        print "%s %s %sj" % (self.real, sign, abs(self.imag))
    def __add__(self, addend):  # Special method implementing addition
        return Complex(self.real + addend.real, self.imag + addend.imag)
    def __sub__(self, subtrahend):  # Special method implementing subtraction
        return Complex(self.real - subtrahend.real, self.imag - subtrahend.imag)

c1 = Complex(5,6)
c2 = Complex(1,2)
s = c1 + c2
d = c1 - c2
s.display()
6.0 + 8.0j
d.display()
4.0 + 4.0j
Now we have addition and subtraction. This is the way Python does operator overloading. In fact, when you do x+y, what's internally happening is a function call:
x.__add__(y)
So you can see that this kind of operation is intuitive. In fact, all the things that can be done to an object are implemented using such special methods.
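A tiny example of our own (a toy Money class, unrelated to Complex) makes the equivalence explicit:

```python
class Money(object):
    "A toy class to show that + is really a call to __add__."
    def __init__(self, amount):
        self.amount = amount
    def __add__(self, other):
        return Money(self.amount + other.amount)

a, b = Money(5), Money(10)
print((a + b).amount)       # 15
print(a.__add__(b).amount)  # 15 -- exactly what a + b does internally
```

Both lines perform the same operation; the + syntax is just a nicer spelling for the special method call.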
Adding the following two methods to our class would give us nice
string (str
) and programmer (repr
) representations.
def __str__(self):
    sign = self.imag > 0 and "+" or "-"
    return "%s %s %sj" % (self.real, sign, abs(self.imag))

def __repr__(self):
    return "Complex (%s, %s)" % (self.real, self.imag)

t = Complex(4,5)
print t  # Uses str(t)
4.0 + 5.0j
t  # Uses repr(t)
Complex (4.0, 5.0)
The string representation is what you get when you try to convert the object into a string using the str function. This is what print does internally. The repr function converts objects into a representation that's useful for programmers to understand what an object contains. Often, it's in a format which can be cut/pasted back into the interpreter.
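The standard library's own types follow the same convention; here the builtin datetime.date class (chosen by us as a convenient example) shows the two forms side by side:

```python
import datetime

d = datetime.date(2009, 6, 3)
print(str(d))   # 2009-06-03 -- the readable form, used by print
print(repr(d))  # datetime.date(2009, 6, 3) -- can be pasted back in
```

Note that the repr output is a valid Python expression that reconstructs an equal object.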
For the complete list of such special methods, please refer to http://www.python.org/doc/2.6/reference/datamodel.html#special-method-names
This concludes our introduction to classes with a note that this class is quite useless in real life since Python has an inbuilt complex primitive type.
Let us consider a class that models a storage device. We'll assume that it has the following attributes: a capacity and a speed. We'll also assume that it has the following methods: read, write and sync.
Now imagine two types of storage devices. A network storage device and a physical disk. They should all implement these basic features. The network device will also have an extra method to connect to the remote device. We can model it as follows.
                +-------------------+
                |                   |
                +-------------------+
                |   StorageDevice   |
                +-------------------+
                     /         \
                    /           \
                   /             \
   +----------------+       +----------------+
   |                |       |                |
   |                |       |                |
   +----------------+       +----------------+
   |  NetworkDevice |       |   LocalDevice  |
   +----------------+       +----------------+
Now we can write classes for these 3 blocks like so
class StorageDevice(object):
    "Provides a base class for StorageDevices."
    def __init__(self, capacity, speed):
        self.capacity = capacity
        self.speed = speed
    def __repr__(self):
        return "%s(capacity = %s, speed = %s)" % (self.__class__.__name__, self.capacity, self.speed)
    def write(self, data):
        # This makes this class an abstract base class. This method can't be called.
        raise NotImplementedError("Can't write to an abstract device")
    def read(self):
        raise NotImplementedError("Can't read from an abstract device")
    def sync(self):
        raise NotImplementedError("Can't sync an abstract device")

class NetworkDevice(StorageDevice):  # Notice the base class
    "Implements a NetworkDevice"
    def __init__(self, capacity, speed):
        super(NetworkDevice, self).__init__(capacity, speed)  # We'll discuss this. It's basically calling the base class constructor.
    def write(self, data):
        print "Wrote '%s' to the network device" % data
    def read(self):
        return "some data"
    def sync(self):
        print "Syncing disks"
        return True  # Boolean True. It's a builtin constant
    def ping(self):
        "Pings network server"
        print "Pinging remote server"
        return True

class LocalDevice(StorageDevice):  # Notice the base class
    "Implements a LocalDevice"
    def __init__(self, capacity, speed):
        super(LocalDevice, self).__init__(capacity, speed)
    def write(self, data):
        print "Wrote '%s' to the local device" % data
    def read(self):
        return "some local data"
    def sync(self):
        print "Syncing local disks"
        return True  # Boolean True. It's a builtin constant

t = StorageDevice(12,12)
t.write(312)
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
/home/noufal/<ipython console> in <module>()
/home/noufal/<string> in write(self=StorageDevice(capacity = 12, speed = 12), data=312)
NotImplementedError: Can't write to an abstract device
t = NetworkDevice(12,32)
s = LocalDevice(12,34)
for i in [t,s]:
    i.write("Hello")
    i.sync()
Wrote 'Hello' to the network device
Syncing disks
Wrote 'Hello' to the local device
Syncing local disks
print repr(t), repr(s)
NetworkDevice(capacity = 12, speed = 32) LocalDevice(capacity = 12, speed = 34)
Let's go over the things that are new here. First of all, the base
class is no longer object
for the LocalDevice
and
NetworkDevice
classes. We inherit from StorageDevice.
We raise a NotImplementedError
exception in the methods of the
base class so that the class can't be used as is. We tried to use it
and got a traceback.
In the constructor method, we call the super builtin. Its function is to return a proxy that delegates a method call back to the base class. Since it does the delegation at runtime, it's possible to create interesting inheritance patterns (diamond etc.) in Python.
Finally, you can see that in the loop, we don't care whether i is a NetworkDevice or a LocalDevice (or a book or something else that implements write). We use duck typing and just call the method we're interested in.
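To underline that no common base class is needed, here is a sketch of our own with two entirely unrelated classes that both happen to implement write:

```python
class Book(object):
    "Unrelated to StorageDevice, but it also implements write."
    def write(self, data):
        return "Scribbled '%s' in the margin" % data

class Wall(object):
    def write(self, data):
        return "Sprayed '%s' on the wall" % data

# Duck typing: the loop only cares that each object responds to write().
for thing in [Book(), Wall()]:
    print(thing.write("Hello"))
```

If it walks like a writable device and quacks like a writable device, the loop treats it as one.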
It's common practice to praise OOP for encapsulation, ie. how it prevents you from accessing the internal state of an object.
C++ ensures this by erecting a steel wall around the objects
private members only to have the novice programmer expose them
using methods like setUserName
and getUserName
. These are
called getters and setters and Python eschews them. Instead,
we use properties. We should carefully consider what we
expose and not how.
Essentially, a property is a way of altering the semantics of an attribute access. It allows us to replace the access of an attribute (say x) using the . operator with a function call but without changing any of the calling code. x is then called a managed attribute. This is done using the property builtin.
Let's create a simple class that represents a person. It has an age attribute that must be managed (ie. cannot be negative).
We do it like this.
class AgeException(Exception):
    pass

class Person(object):
    def setAge(self, value):
        if value > 0:
            self.__age = value
        else:
            raise AgeException("Age should be greater than 0")
    def getAge(self):
        return self.__age
    age = property(getAge, setAge, doc = "Age of the person")

t = Person()
t.age = 10
print t.age
10
t.age = 0
---------------------------------------------------------------------------
AgeException                              Traceback (most recent call last)
/home/noufal/notes/<ipython console> in <module>()
/home/noufal/notes/<string> in setAge(self=<__main__.Person object at 0x99677cc>, value=0)
AgeException: Age should be greater than 0
print t.age
10
The property builtin accepts 4 arguments in this order: the getter, the setter, the deleter and the docstring for the attribute. If any of these is left out, the corresponding operation is deemed illegal.
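Since Python 2.6, the same managed attribute can also be written using property in its decorator form. Here is an equivalent sketch of a Person class (this variant, including the ValueError and the _age backing attribute, is our own illustration):

```python
class Person(object):
    "Same idea as above, using the decorator form of property."
    def __init__(self):
        self._age = 0  # backing attribute for the managed age

    @property
    def age(self):
        "Age of the person"
        return self._age

    @age.setter
    def age(self, value):
        if value <= 0:
            raise ValueError("Age should be greater than 0")
        self._age = value

t = Person()
t.age = 10
print(t.age)  # 10
```

Calling code still reads and writes t.age with the plain dot operator; only the class knows a function call is happening underneath.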
In python, exceptions are also classes and they have a hierarchy. Behold!
Exception(*)
 |
 +-- SystemExit
 +-- StandardError(*)
      |
      +-- KeyboardInterrupt
      +-- ImportError
      +-- EnvironmentError(*)
      |    |
      |    +-- IOError
      |    +-- OSError(*)
      +-- EOFError
      +-- RuntimeError
      |    |
      |    +-- NotImplementedError(*)
      +-- NameError
      +-- AttributeError
      +-- SyntaxError
      +-- TypeError
      +-- AssertionError
      +-- LookupError(*)
      |    |
      |    +-- IndexError
      |    +-- KeyError
      +-- ArithmeticError(*)
      |    |
      |    +-- OverflowError
      |    +-- ZeroDivisionError
      |    +-- FloatingPointError
      +-- ValueError
      +-- SystemError
      +-- MemoryError

If we have an except statement that catches an exception of type X, we will also catch all exceptions whose base class is X. Therefore, if we catch LookupError, we will also catch KeyError and IndexError.
If we want to create a custom exception, we need to inherit it from one of the standard exceptions.
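Both points fit in a few lines; the lookup helper and the ParseError class below are our own illustrations:

```python
# Catching a base class also catches its subclasses.
def lookup(d, key):
    try:
        return d[key]
    except LookupError:  # catches KeyError and IndexError too
        return "missing"

print(lookup({"a": 1}, "b"))  # missing -- the KeyError was caught

# A custom exception simply inherits from an existing one.
class ParseError(Exception):
    pass

try:
    raise ParseError("bad input")
except Exception as e:  # caught here via the base class
    print(e)            # bad input
```

Because ParseError derives from Exception, any handler for Exception (or for ParseError itself) will catch it.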
In this section, we'll have a quick overview of a couple of standard modules.
The treatment is not meant to be comprehensive but to give you a bird's eye view of the richness of the standard library so that you know what you have before you rush off to reinvent the wheel.
Links to the official documentation follow each subsection.
These modules are used by import
ing them into your program and
using the functions, classes and other attributes they provide.
If you want to know all the attributes of a given object (including
those of a module), you can use the dir
function. Most of the
attributes have documentation so if you want, you can write a quick
documentation grabber like so.
def gen_doc(obj):
    "Rips out all documentation for a module and prints it out neatly"
    for i in dir(obj):
        attrib = getattr(obj, i)
        # We're not interested in the docs of integers and strings
        if not isinstance(attrib, str) and not isinstance(attrib, int):
            print i, "(", type(attrib), "):"
            print "-" * (len(i) + len(str(type(attrib))) + 3), "\n"
            doc = attrib.__doc__
            # If documentation is available, print it.
            if doc:
                # Truncate docs if they're too long
                if len(doc) < 90:
                    print " ", doc
                else:
                    print " ", doc[0:90], " [truncated...]"
            print "=" * 80
This introduces the getattr function which is used to access attributes of an object whose names you know (ie. the name is in a string). In other words,
t=Person()
print t.age
is the same as
attribute = "age"
t = Person()
print getattr(t, attribute)
It also introduces the isinstance function which is used to test if a certain object is of a certain type. It handles inheritance properly, ie. an instance of a subclass is also an instance of the base class.
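A quick sketch of both builtins (the Person class here is a minimal stand-in of our own):

```python
class Person(object):
    def __init__(self):
        self.age = 10

t = Person()
# getattr with a string name; an optional third argument is a default
print(getattr(t, "age"))           # 10
print(getattr(t, "height", None))  # None -- no such attribute, default used

# isinstance respects inheritance: bool is a subclass of int
print(isinstance(True, int))       # True
```

The default-value form of getattr is handy when probing objects whose attributes may or may not exist.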
For each of these modules, you should look at the documentation on the official Python page (http://docs.python.org/modindex.html) as well as the PyMOTW (Python Module Of The Week) page for the module by Doug Hellmann, the index of which is at http://www.doughellmann.com/PyMOTW/contents.html
Let's get started.
The sys
module contains things which affect the interpreter
operation. Here are some of them with descriptions.
Attribute | Description |
---|---|
sys.argv | Argument list (similar to C's argv) |
sys.path | List of paths which the interpreter will look for modules to import |
sys.exit() | Quits the interpreter |
sys.exitfunc | Function to call upon exit (useful for cleanup) |
sys.ps1, sys.ps2 | Interpreter prompts |
sys.stdin, sys.stdout, sys.stderr | Standard input, output and error file descriptors |
sys.version | Python version |
import sys
print sys.path
['', '/usr/lib/python2.5/site-packages/SQLAlchemy-0.5.1-py2.5.egg', '/usr/lib/python2.5/site-packages/rope-0.2pre5-py2.5.egg', '/usr/lib/python2.5/site-packages/RescueTimeUploader-0.0.0-py2.5.egg', '/usr/lib/python2.5', '/usr/lib/python2.5/plat-linux2', '/usr/lib/python2.5/lib-tk', '/usr/lib/python2.5/lib-dynload', '/usr/local/lib/python2.5/site-packages', '/usr/lib/python2.5/site-packages', '/usr/lib/python2.5/site-packages/Numeric', '/usr/lib/python2.5/site-packages/PIL', '/usr/lib/python2.5/site-packages/gst-0.10', '/var/lib/python-support/python2.5', '/usr/lib/python2.5/site-packages/gtk-2.0', '/var/lib/python-support/python2.5/gtk-2.0', '/usr/lib/site-python']
print sys.version
2.6+ (r26:66714, Oct 22 2008, 09:25:02)
[GCC 4.3.2]
sys.stderr.write("I am on stderr\n")
I am on stderr
The os module has functions and methods which provide cross platform access to operating system details. A few of the methods are described below
Attribute | Description |
---|---|
os.chdir() | Change working directory |
os.getlogin() | Get login name of user who owns the current controlling terminal |
os.getpid() | Get current process id |
os.environ | A dictionary containing the environment variables |
os.access() | Used to check if the current process can access a path |
os.chroot() | Issue a chroot system call |
os.stat() | Perform a stat on a path |
os.symlink() | Creates a symlink to a file |
os.execl() | Exec a program replacing the current process |
os.spawnl() | Spawn a program in another process |
os.system() | Execute system command (deprecated by the subprocess module) |
The functions along with the subprocess
module are commonly used
for glue applications.
import os
os.getlogin()
'noufal'
os.getcwd()
'/home/noufal/notes'
os.chroot("/etc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 1] Operation not permitted: '/etc'
os.stat("/etc/passwd")
posix.stat_result(st_mode=33188, st_ino=609499L, st_dev=2058L, st_nlink=1, st_uid=0, st_gid=0, st_size=1921L, st_atime=1241026999, st_mtime=1241026835, st_ctime=1241026835)
os.system("ls /boot")
abi-2.6.27-11-generic  config-2.6.27-7-generic  initrd.img-2.6.27-9-generic  System.map-2.6.27-9-generic  vmlinuz-2.6.27-14-generic
abi-2.6.27-14-generic  config-2.6.27-9-generic  lost+found  vmcoreinfo-2.6.27-11-generic  vmlinuz-2.6.27-7-generic
abi-2.6.27-7-generic  grub  memtest86+.bin  vmcoreinfo-2.6.27-14-generic  vmlinuz-2.6.27-9-generic
abi-2.6.27-9-generic  initrd.img-2.6.27-11-generic  System.map-2.6.27-11-generic  vmcoreinfo-2.6.27-7-generic
config-2.6.27-11-generic  initrd.img-2.6.27-14-generic  System.map-2.6.27-14-generic  vmcoreinfo-2.6.27-9-generic
config-2.6.27-14-generic  initrd.img-2.6.27-7-generic  System.map-2.6.27-7-generic  vmlinuz-2.6.27-11-generic
0
os.environ['LOGNAME']
'noufal'
The operator module exposes Python operators as functions. So, a statement like
x, y = 5, 10
print x < y
True
could be written as
import operator
x, y = 5, 10
operator.lt(x, y)  # operator.lt is the less than operator
True
This is useful to construct conditions on the fly at runtime.
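For instance, a comparison can be selected at runtime by name. The dispatch table below is our own illustration, not part of the operator module itself:

```python
import operator

# Map operator names (chosen by us) to the corresponding functions.
comparisons = {"<": operator.lt, ">": operator.gt, "==": operator.eq}

def check(x, op_name, y):
    "Apply the comparison named op_name to x and y."
    return comparisons[op_name](x, y)

print(check(5, "<", 10))  # True
print(check(5, ">", 10))  # False
```

Because the operators are ordinary functions, they can be stored in data structures and passed around like any other value.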
The re module provides regular expression support which we can use to parse textual data à la Perl.
The module mainly makes available the compile
function which can
be used to create compiled regular expressions.
An example is shown below.
import re

scan_text = """Once upon a midnight dreary, while I pondered, weak and weary,
Over many a quaint and curious volume of forgotten lore--
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
"""
rexp = re.compile(r".*,\s*(\S+)\s+(\S+)\s*,.*")  # rexp is a regular expression object now.
                                                 # Tries to find two words bracketed by commas.
s = rexp.search(scan_text)  # s is a search result
print s.groups()  # Prints the groups matched by the brackets in the regexp
('nearly', 'napping')
Python's datetime module allows us to handle dates and times in the Python object space without doing string parsing. Here is a simple example where we create a datetime and advance it by 2 weeks.
import datetime

now = datetime.datetime.now()
two_weeks = datetime.timedelta(weeks=2)  # A timedelta is used to calculate temporal distance.
t = now + two_weeks
print t.strftime("%d %B %Y")
17 June 2009  # Your output will vary depending on when you run this program.
The logging module allows us to create a robust system where information about the program is logged during its runtime.
import sys
import logging

logging.basicConfig(stream = sys.stdout,
                    format = "%(levelname)s : %(asctime)s : %(message)s",
                    level = logging.WARNING)
logging.debug("Hello?")  # Doesn't appear since we fixed it to WARNING and above
logging.info("Hello?")   # Doesn't appear since we fixed it to WARNING and above
logging.warning("Hello?")
WARNING : 2009-06-03 01:45:20,344 : Hello?
logging.critical("Hello?")
CRITICAL : 2009-06-03 01:45:23,090 : Hello?
This example oversimplifies the module. In reality, we'd make a logger that has multiple handlers (eg. Debug only to file, warnings and above to a separate file. info and above to screen and file etc.)
Refer to the complete docs for details.
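As a small sketch of that multi-handler idea (the logger name "myapp" and the in-memory io.StringIO stream are our own choices for illustration; real setups would use file handlers), a dedicated handler can be attached to a named logger:

```python
import logging
from io import StringIO

buf = StringIO()                        # stand-in for a log file
log = logging.getLogger("myapp")        # a named logger of our own
log.setLevel(logging.INFO)

handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
log.addHandler(handler)

log.debug("invisible")   # below INFO, filtered out
log.warning("disk full")

print(buf.getvalue())    # WARNING:disk full
```

Several such handlers, each with its own level and formatter, can be attached to the same logger to route messages to different destinations.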
Unittest is a module that allows us to write tests for our programs. It's based on the xUnit framework designed by Kent Beck.
Let us assume we have the program shown below
# average.py
class AverageError(Exception):
    pass

def average(*nos):
    if not nos:
        raise AverageError("Nothing to average")
    if not all([isinstance(x, int) or isinstance(x, float) for x in nos]):
        raise AverageError("Can only average ints or floats")
    return float(sum(nos, 0)) / len(nos)
It returns the average of a list of numbers we give it and has some basic error checking capability.
The tests for the module can be written as below
# average_tests.py
import unittest
import average

class AverageTest(unittest.TestCase):
    def testNullInput(self):
        "Tests if the function raises AverageError on no inputs"
        self.assertRaises(average.AverageError, average.average)

    def testStringInput(self):
        "Tests if the function raises AverageError on bad (String) inputs"
        self.assertRaises(average.AverageError, average.average, "hello", 1, 2, 3)

    def testAverageComputation(self):
        "Tests if the computations are correct"
        avg_from_lib = average.average(2, 3, 4, 5)
        true_average = float(2 + 3 + 4 + 5) / 4
        self.assertEqual(avg_from_lib, true_average)

if __name__ == "__main__":
    unittest.main()
We save these two files into the same directory and run the tests like so.
% python average_tests.py -v
Tests if the computations are correct ... ok
Tests if the function raises AverageError on no inputs ... ok
Tests if the function raises AverageError on bad (String) inputs ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.002s

OK
A test case, by definition, is a class that has been derived from unittest.TestCase. In it, we can define methods whose names start with the letters test. Each one will be executed and then one of the validator methods (methods that check for a condition or the lack of it) should be called. The -v flag makes the runner print out verbose details. As you can see, the function behaved as we expected so all the tests pass. If we corrupt something (eg. let's drop the if not nos condition), we'd get an error and the test would fail like so.
% python average_tests.py -v
Tests if the computations are correct ... ok
Tests if the function raises AverageError on no inputs ... ERROR
Tests if the function raises AverageError on bad (String) inputs ... ok

======================================================================
ERROR: Tests if the function raises AverageError on no inputs
----------------------------------------------------------------------
Traceback (most recent call last):
  File "average_tests.py", line 9, in testNullInput
    self.assertRaises(average.AverageError, average.average)
  File "/usr/lib/python2.5/unittest.py", line 320, in failUnlessRaises
    callableObj(*args, **kwargs)
  File "/tmp/average.py", line 8, in average
    return float(sum(nos,0))/len(nos)
ZeroDivisionError: float division

----------------------------------------------------------------------
Ran 3 tests in 0.014s
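Besides unittest.main(), tests can also be loaded and run programmatically; the tiny self-contained case below (our own, unrelated to average.py) shows the pieces involved:

```python
import unittest

class SmallTest(unittest.TestCase):
    "A tiny test case of our own, run programmatically."
    def test_addition(self):
        self.assertEqual(2 + 2, 4)

# unittest.main() takes over the process; a script can instead
# build a suite by hand and run it with a runner.
suite = unittest.TestLoader().loadTestsFromTestCase(SmallTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

The returned TestResult object records how many tests ran and which failed, which is handy when embedding tests in larger tools.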
Itertools offers us a range of generator constructors. These are useful to construct interesting generators based on existing iterables.
Here are some simple examples. To see more powerful examples, please visit the official documentation pages.
itertools.cycle
allows us to create an infinitely looping
iterator from a finite list.
c = [1,2,3]
import itertools
c_cycle = itertools.cycle(c)
c_cycle.next()
1
c_cycle.next()
2
c_cycle.next()
3
c_cycle.next()
1
c_cycle.next()
2
c_cycle.next()
3
c_cycle.next()
1
itertools.chain
allows us to tie together multiple iterators into
a single one. Here's an example of flattening a list of lists using
it.
c= [[1,2,3],[4,5,6],[7,8,9]]
list(itertools.chain(*c))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Since these are generators, they are evaluated lazily and are usually a good choice when you have to loop over things.
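The laziness means even an infinite iterator like cycle is safe to work with, as long as we only take a finite slice out of it; itertools.islice does exactly that:

```python
import itertools

# islice takes a finite slice out of a (possibly infinite) iterator,
# so we can consume cycle without looping forever.
c = itertools.cycle([1, 2, 3])
first_seven = list(itertools.islice(c, 7))
print(first_seven)  # [1, 2, 3, 1, 2, 3, 1]
```

Without islice, list(c) would never terminate; with it, only seven values are ever produced.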
Please refer to the standard documentation for more useful examples.
The following documents are worth reading
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Noufal Ibrahim is a Python enthusiast and a small contributor to the language and to a couple of other Python projects. He's enthusiastic about spreading the language in the educational and corporate sectors.
He keeps a personal site at nibrahim.net.in and is usually found hanging out on IRC on the freenode network as Khmar.
Many of the examples have been borrowed from the excellent Python tutorial by Anand Chitipothu.
It has been generated using the otherworldly publishing system (and other things) for Emacs called Org mode.
Date: 2009-06-03 14:47:27 IST
HTML generated by org-mode 6.21b in emacs 23