Python & Perl
Lecture 06

Department of Computer Science
Utah State University
Python lecture 06

  1. Python & Perl, Lecture 06
     Department of Computer Science, Utah State University
  2. Outline
     ● Data Abstraction: Building Huffman Trees with Lists and Tuples
     ● List Comprehension
  3. Data Abstraction: Building Huffman Trees with Lists and Tuples
  4. Background
     ● In information theory, coding refers to methods that represent data in terms of bit sequences (sequences of 0's and 1's)
     ● Encoding is a method of taking data structures and mapping them to bit sequences
     ● Decoding is a method of taking bit sequences and outputting the corresponding data structure
  5. Example: Standard ASCII & Unicode
     ● Standard ASCII encodes each character as a 7-bit sequence
     ● Using 7 bits allows us to encode 2^7 = 128 possible characters
     ● Unicode has three encoding forms: UTF-8 (8-bit code units, one to four per character), UTF-16 (16-bit code units, one or two per character), and UTF-32 (one 32-bit code unit per character)
     ● UTF stands for Unicode Transformation Format
     ● Python 2.X's Unicode support: “Python represents Unicode strings as either 16- or 32-bit integers, depending on how the Python interpreter was compiled.”
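A small sketch of the numbers on this slide. It is written for Python 3 (the slides use Python 2.X), and the helper name `utf_sizes` and the sample characters are illustrative choices, not part of the lecture.

```python
# Python 3 sketch: how many codes 7 bits give, and how many bytes
# each UTF encoding needs for a few sample characters.

def utf_sizes(ch):
    """Return the byte lengths of ch in UTF-8, UTF-16 and UTF-32.

    The '-le' codec variants are used so that no byte-order mark
    is added to the encoded bytes.
    """
    return [len(ch.encode(enc)) for enc in ('utf-8', 'utf-16-le', 'utf-32-le')]

ascii_codes = 2 ** 7                   # number of distinct 7-bit sequences
a_bits = format(ord('A'), '07b')       # 7-bit ASCII code of 'A'

print(ascii_codes)
print(a_bits)
for ch in ('A', 'é', '€'):
    print(ch, utf_sizes(ch))
```

UTF-8 is itself a variable-length code: the three sample characters take one, two, and three bytes respectively, while UTF-32 always takes four.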
  6. Two Types of Codes
     ● There are two types of codes: fixed-length and variable-length
     ● Fixed-length codes (e.g., ASCII, Unicode) encode every character in terms of the same number of bits
     ● Variable-length codes (e.g., Morse, Huffman) encode characters in terms of variable numbers of bits: more frequent symbols are encoded with fewer bits
  7. Example: Fixed-Length Code
     ● Code table:
         A – 000    B – 001
         C – 010    D – 011
         E – 100    F – 101
         G – 110    H – 111
     ● AADF = 000000011101
     ● The encoding of AADF is 12 bits
  8. Example: Variable-Length Code
     ● Code table:
         A – 0       B – 100
         C – 1010    D – 1011
         E – 1100    F – 1101
         G – 1110    H – 1111
     ● AADF = 0010111101
     ● The encoding of AADF is 10 bits
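The two code tables can be compared directly with a table lookup. This is a minimal Python 3 sketch; the names `FIXED`, `VARIABLE`, and `encode_with_table` are made up for illustration and do not appear in the lecture.

```python
# Code tables copied from the fixed-length and variable-length examples.
FIXED = {'A': '000', 'B': '001', 'C': '010', 'D': '011',
         'E': '100', 'F': '101', 'G': '110', 'H': '111'}
VARIABLE = {'A': '0', 'B': '100', 'C': '1010', 'D': '1011',
            'E': '1100', 'F': '1101', 'G': '1110', 'H': '1111'}

def encode_with_table(message, table):
    """Concatenate the code of every symbol in message."""
    return ''.join(table[symbol] for symbol in message)

fixed_bits = encode_with_table('AADF', FIXED)
variable_bits = encode_with_table('AADF', VARIABLE)

print(fixed_bits, len(fixed_bits))        # 12 bits with the fixed-length code
print(variable_bits, len(variable_bits))  # 10 bits with the variable-length code
```

The variable-length code wins here only because A, the frequent symbol, has a one-bit code; a message made mostly of E to H would come out longer than with the fixed-length table.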
  9. End of Character in Variable-Length Code
     ● One of the challenges in variable-length codes is knowing where one character ends and the next one begins
     ● Morse uses a special character (separator code)
     ● Prefix coding is another solution: no character's code is a prefix of any other character's code, so the end of each code can be recognized without a separator
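The prefix property can be checked mechanically. The sketch below (Python 3) uses a hypothetical helper `is_prefix_free` that compares every pair of codes; it is a brute-force check meant for small tables, not an optimized one.

```python
def is_prefix_free(codes):
    """Return True if no code in codes is a prefix of a different code."""
    codes = list(codes)
    return not any(a != b and b.startswith(a)
                   for a in codes for b in codes)

# Codes of the variable-length example (A, B, C, D, E, F, G, H).
variable_codes = ['0', '100', '1010', '1011', '1100', '1101', '1110', '1111']

print(is_prefix_free(variable_codes))      # the example code is a prefix code
print(is_prefix_free(['0', '01', '11']))   # '0' is a prefix of '01'
```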
  10. Huffman Code
     ● Huffman code is a variable-length code that takes advantage of the relative frequencies of characters
     ● Huffman code is named after David Huffman, the researcher who discovered it
     ● A Huffman code is represented as a binary tree whose leaves are individual characters and their frequencies
     ● Each non-leaf node holds the set of characters in all of its subnodes and the sum of their relative frequencies
  11. Huffman Tree Example ([0] = left branch, [1] = right branch)
     {A, B, C, D, E, F, G, H}: 17
       [0] A: 8
       [1] {B, C, D, E, F, G, H}: 9
           [0] {B, C, D}: 5
               [0] B: 3
               [1] {C, D}: 2
                   [0] C: 1
                   [1] D: 1
           [1] {E, F, G, H}: 4
               [0] {E, F}: 2
                   [0] E: 1
                   [1] F: 1
               [1] {G, H}: 2
                   [0] G: 1
                   [1] H: 1
  12. Using Huffman Tree to Encode/Decode Characters
     ● With the tree on the previous slide, the encodings are:
         A is encoded as 0
         B is encoded as 100
         C is encoded as 1010
         D is encoded as 1011
         E is encoded as 1100
         F is encoded as 1101
         G is encoded as 1110
         H is encoded as 1111
  13. Building The Huffman Tree
  14. Simple Huffman Tree
     {A, B, D, C}: 8
       A: 4
       {B, D, C}: 4
         B: 2
         {D, C}: 2
           D: 1
           C: 1
  15. Constructing Leaves
     ### a leaf is a tuple whose first element is a symbol
     ### represented as a string and whose second element is
     ### the symbol's frequency
     def make_leaf(symbol, freq):
         return (symbol, freq)

     def is_leaf(x):
         return (isinstance(x, tuple) and len(x) == 2 and
                 isinstance(x[0], str) and isinstance(x[1], int))
  16. Constructing Leaves
     ### return the character (symbol) of the leaf
     def get_leaf_symbol(leaf):
         return leaf[0]

     ### return the frequency of the leaf's character
     def get_leaf_freq(leaf):
         return leaf[1]
  17. Constructing Huffman Trees
     ### A non-leaf node (internal node) is represented as
     ### a list of four elements:
     ### 1. left branch
     ### 2. right branch
     ### 3. list of symbols
     ### 4. combined frequency of symbols
     [left_branch, right_branch, symbols, frequency]
  18. Accessing Huffman Trees
     def get_leaf_symbol(leaf):
         return leaf[0]

     def get_leaf_freq(leaf):
         return leaf[1]

     def get_left_branch(huff_tree):
         return huff_tree[0]

     def get_right_branch(huff_tree):
         return huff_tree[1]
  19. Accessing Huffman Trees
     def get_symbols(huff_tree):
         if is_leaf(huff_tree):
             return [get_leaf_symbol(huff_tree)]
         else:
             return huff_tree[2]

     def get_freq(huff_tree):
         if is_leaf(huff_tree):
             return get_leaf_freq(huff_tree)
         else:
             return huff_tree[3]
  20. Constructing Huffman Trees
     ### A Huffman tree is constructed from its left branch, which can
     ### be a Huffman tree or a leaf, and its right branch, another
     ### Huffman tree or a leaf. The new tree holds the symbols of the
     ### left branch followed by those of the right branch, and the sum
     ### of the frequencies of the left branch and the right branch
     def make_huffman_tree(left_branch, right_branch):
         return [left_branch,
                 right_branch,
                 get_symbols(left_branch) + get_symbols(right_branch),
                 get_freq(left_branch) + get_freq(right_branch)]
  21. MAKE_HUFFMAN_TREE Example
     ht01 = make_huffman_tree(make_leaf('A', 4),
                              make_huffman_tree(make_leaf('B', 2),
                                                make_huffman_tree(make_leaf('D', 1),
                                                                  make_leaf('C', 1))))

     {A, B, D, C}: 8
       A: 4
       {B, D, C}: 4
         B: 2
         {D, C}: 2
           D: 1
           C: 1
  22. MAKE_HUFFMAN_TREE Example
     Python data structure that represents the Huffman tree below:
     [('A', 4),
      [('B', 2),
       [('D', 1), ('C', 1), ['D', 'C'], 2],
       ['B', 'D', 'C'], 4],
      ['A', 'B', 'D', 'C'], 8]

     {A, B, D, C}: 8
       A: 4
       {B, D, C}: 4
         B: 2
         {D, C}: 2
           D: 1
           C: 1
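The example above nests the make_huffman_tree calls by hand. The slides do not spell out an automatic construction, so the following Python 3 sketch uses the standard greedy Huffman procedure (repeatedly merge the two lowest-frequency trees) as an assumption. `build_huffman_tree` and the tie-breaking counter are hypothetical additions, and `is_leaf` is simplified to a tuple check.

```python
import heapq
from itertools import count

def make_leaf(symbol, freq):
    return (symbol, freq)

def is_leaf(x):
    return isinstance(x, tuple)          # simplified leaf test

def get_symbols(tree):
    return [tree[0]] if is_leaf(tree) else tree[2]

def get_freq(tree):
    return tree[1] if is_leaf(tree) else tree[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

def build_huffman_tree(leaves):
    """Greedy merge of the two least frequent trees; leaves must be non-empty."""
    tie = count()  # unique counter so heap entries never compare trees directly
    heap = [(get_freq(leaf), next(tie), leaf) for leaf in leaves]
    heapq.heapify(heap)
    while len(heap) > 1:
        _, _, first = heapq.heappop(heap)    # lowest frequency
        _, _, second = heapq.heappop(heap)   # second lowest frequency
        merged = make_huffman_tree(first, second)
        heapq.heappush(heap, (get_freq(merged), next(tie), merged))
    return heap[0][2]

leaves = [make_leaf('A', 4), make_leaf('B', 2),
          make_leaf('D', 1), make_leaf('C', 1)]
tree = build_huffman_tree(leaves)
print(get_symbols(tree), get_freq(tree))
```

With these four leaves in this order, the greedy merge happens to produce the same nested structure as ht01. Different tie-breaking rules can yield a differently shaped tree with the same total encoded length.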
  23. Customizing sort()
     def leaf_freq_comp(leaf1, leaf2):
         return cmp(get_leaf_freq(leaf1), get_leaf_freq(leaf2))

     huff_leaves = [make_leaf('A', 8), make_leaf('C', 1),
                    make_leaf('B', 3), make_leaf('D', 1),
                    make_leaf('F', 1), make_leaf('E', 1),
                    make_leaf('H', 1), make_leaf('G', 1)]
     print huff_leaves
     huff_leaves.sort(leaf_freq_comp)
     print huff_leaves

     OUTPUT:
     [('A', 8), ('C', 1), ('B', 3), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1)]
     [('C', 1), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1), ('B', 3), ('A', 8)]
  24. Customizing sort()
     def leaf_symbol_comp(leaf1, leaf2):
         return cmp(get_leaf_symbol(leaf1), get_leaf_symbol(leaf2))

     huff_leaves2 = [make_leaf('A', 8), make_leaf('C', 1),
                     make_leaf('B', 3), make_leaf('D', 1),
                     make_leaf('F', 1), make_leaf('E', 1),
                     make_leaf('H', 1), make_leaf('G', 1)]
     print huff_leaves2
     huff_leaves2.sort(leaf_symbol_comp)
     print huff_leaves2

     OUTPUT:
     [('A', 8), ('C', 1), ('B', 3), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1)]
     [('A', 8), ('B', 3), ('C', 1), ('D', 1), ('E', 1), ('F', 1), ('G', 1), ('H', 1)]
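The comparison-function style above is specific to Python 2.X: Python 3 removed `cmp()` and the positional comparison argument of `sort()`. As a hedged sketch of the Python 3 equivalents, the block below uses key functions and, for code that must keep a comparison function, `functools.cmp_to_key`. The leaves are written as plain tuples to keep the block self-contained.

```python
from functools import cmp_to_key
from operator import itemgetter

huff_leaves = [('A', 8), ('C', 1), ('B', 3), ('D', 1),
               ('F', 1), ('E', 1), ('H', 1), ('G', 1)]

# Key functions: sort by frequency (index 1) or by symbol (index 0).
by_freq = sorted(huff_leaves, key=itemgetter(1))
by_symbol = sorted(huff_leaves, key=itemgetter(0))

def leaf_freq_comp(leaf1, leaf2):
    """Old-style comparison: negative, zero, or positive, like cmp()."""
    return (leaf1[1] > leaf2[1]) - (leaf1[1] < leaf2[1])

# cmp_to_key adapts a comparison function to the key= interface.
by_freq_cmp = sorted(huff_leaves, key=cmp_to_key(leaf_freq_comp))

print(by_freq)
print(by_symbol)
```

Python's sort is stable, so leaves with equal frequency keep their original relative order, which matches the output on the first sort() slide.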
  25. Encoding & Decoding Messages with Huffman Trees
  26. Sample Huffman Tree ([0] = left branch, [1] = right branch)
     {A, B, C, D, E, F, G, H}: 17
       [0] A: 8
       [1] {B, C, D, E, F, G, H}: 9
           [0] {B, C, D}: 5
               [0] B: 3
               [1] {C, D}: 2
                   [0] C: 1
                   [1] D: 1
           [1] {E, F, G, H}: 4
               [0] {E, F}: 2
                   [0] E: 1
                   [1] F: 1
               [1] {G, H}: 2
                   [0] G: 1
                   [1] H: 1
  27. Symbol Encoding
     1. Given a symbol s and a Huffman tree ht, set current_node to the root node and encoding to an empty list (you can also check if s is in the root node's symbol list and, if not, signal an error)
     2. If current_node is a leaf, return encoding
     3. Check if s is in current_node's left branch or right branch
     4. If in the left, add 0 to encoding, set current_node to the root of the left branch, and go to step 2
     5. If in the right, add 1 to encoding, set current_node to the root of the right branch, and go to step 2
     6. If in neither branch, signal an error
  28. Example
     ● Encode B with the sample Huffman tree
     ● Set current_node to the root node
     ● B is in current_node's right branch, so add 1 to encoding & recurse into the right branch (current_node is set to the root of the right branch, {B, C, D, E, F, G, H}: 9)
     ● B is in current_node's left branch, so add 0 to encoding & recurse into the left branch (current_node is {B, C, D}: 5)
     ● B is in current_node's left branch, so add 0 to encoding & recurse into the left branch (current_node is B: 3)
     ● current_node is a leaf, so return 100 (the value of encoding)
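The symbol-encoding steps can be sketched as a loop over the tree representation from the earlier slides (tuples for leaves, four-element lists for internal nodes). This is a Python 3 sketch under those assumptions; `encode_symbol`, `sample_tree`, and the simplified `is_leaf` are illustrative and not taken from the lecture.

```python
def is_leaf(node):
    return isinstance(node, tuple)       # simplified leaf test

def get_symbols(node):
    return [node[0]] if is_leaf(node) else node[2]

def get_freq(node):
    return node[1] if is_leaf(node) else node[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

def encode_symbol(s, ht):
    """Return the list of bits (0 = left, 1 = right) leading to symbol s."""
    if s not in get_symbols(ht):                      # optional check from step 1
        raise ValueError('symbol %r is not in the tree' % s)
    current_node, encoding = ht, []
    while not is_leaf(current_node):                  # step 2
        left, right = current_node[0], current_node[1]
        if s in get_symbols(left):                    # step 4
            encoding.append(0)
            current_node = left
        elif s in get_symbols(right):                 # step 5
            encoding.append(1)
            current_node = right
        else:                                         # step 6
            raise ValueError('symbol %r is in neither branch' % s)
    return encoding

# The sample Huffman tree, built bottom-up.
sample_tree = make_huffman_tree(
    ('A', 8),
    make_huffman_tree(
        make_huffman_tree(('B', 3), make_huffman_tree(('C', 1), ('D', 1))),
        make_huffman_tree(make_huffman_tree(('E', 1), ('F', 1)),
                          make_huffman_tree(('G', 1), ('H', 1)))))

print(encode_symbol('B', sample_tree))
```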
  29. Message Encoding
     ● Given a sequence of symbols message and a Huffman tree ht
     ● Concatenate the encoding of each symbol in message from left to right
     ● Return the concatenation of encodings
  30. Example
     ● Encode ABBA with the sample Huffman tree
     ● Encoding for A is 0
     ● Encoding for B is 100
     ● Encoding for B is 100
     ● Encoding for A is 0
     ● Concatenation of encodings is 01001000
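One possible way to implement message encoding, sketched in Python 3: walk the tree once to collect a symbol-to-code table, then concatenate table entries. This precomputed table is a design variation rather than the per-symbol search described in the slides; `build_code_table`, `encode_message`, and `sample_tree` are hypothetical names.

```python
def is_leaf(node):
    return isinstance(node, tuple)       # leaves are (symbol, freq) tuples

def get_symbols(node):
    return [node[0]] if is_leaf(node) else node[2]

def get_freq(node):
    return node[1] if is_leaf(node) else node[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

def build_code_table(node, prefix=''):
    """Map every symbol under node to its bit string (0 = left, 1 = right)."""
    if is_leaf(node):
        return {node[0]: prefix}
    table = build_code_table(node[0], prefix + '0')
    table.update(build_code_table(node[1], prefix + '1'))
    return table

def encode_message(message, ht):
    """Concatenate the codes of the symbols in message, left to right.

    A symbol that is not in the tree raises KeyError.
    """
    table = build_code_table(ht)
    return ''.join(table[symbol] for symbol in message)

sample_tree = make_huffman_tree(
    ('A', 8),
    make_huffman_tree(
        make_huffman_tree(('B', 3), make_huffman_tree(('C', 1), ('D', 1))),
        make_huffman_tree(make_huffman_tree(('E', 1), ('F', 1)),
                          make_huffman_tree(('G', 1), ('H', 1)))))

print(encode_message('ABBA', sample_tree))
```

Building the table costs one traversal of the tree, after which each symbol is a dictionary lookup instead of a search through the branches' symbol lists.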
  31. Message Decoding
     1. Given a sequence of bits message and a Huffman tree ht, set current_node to the root and decoding to an empty list
     2. If current_node is a leaf, add its symbol to decoding and set current_node to ht's root
     3. If current_node is ht's root and message has no more bits, return decoding
     4. If no more bits in message & current_node is not a leaf, signal error
     5. If message's current bit is 0, set current_node to its left child, read the bit, & go to step 2
     6. If message's current bit is 1, set current_node to its right child, read the bit, & go to step 2
  32. Example
     ● Decode 0100 with the sample Huffman tree
     ● Read 0, go left to A: 8, add A to decoding & reset current_node to the root
     ● Read 1, go right to {B, C, D, E, F, G, H}: 9
     ● Read 0, go left to {B, C, D}: 5
     ● Read 0, go left to B: 3
     ● Add B to decoding & reset current_node to the root
     ● No more bits & current_node is the root, so return AB
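The decoding steps can be sketched as a single pass over the bits. The Python 3 code below assumes the message is a string of '0' and '1' characters and that the root of ht is an internal node; `decode_message` and `sample_tree` are illustrative names.

```python
def is_leaf(node):
    return isinstance(node, tuple)       # leaves are (symbol, freq) tuples

def get_symbols(node):
    return [node[0]] if is_leaf(node) else node[2]

def get_freq(node):
    return node[1] if is_leaf(node) else node[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

def decode_message(bits, ht):
    """Decode a string of '0'/'1' characters; ht's root must be internal."""
    current_node, decoding = ht, []
    for bit in bits:
        if bit == '0':
            current_node = current_node[0]     # go to the left child
        elif bit == '1':
            current_node = current_node[1]     # go to the right child
        else:
            raise ValueError('not a bit: %r' % bit)
        if is_leaf(current_node):
            decoding.append(current_node[0])   # emit the leaf's symbol
            current_node = ht                  # restart at the root
    if current_node is not ht:
        # bits ran out in the middle of a code
        raise ValueError('message ends inside a code')
    return ''.join(decoding)

sample_tree = make_huffman_tree(
    ('A', 8),
    make_huffman_tree(
        make_huffman_tree(('B', 3), make_huffman_tree(('C', 1), ('D', 1))),
        make_huffman_tree(make_huffman_tree(('E', 1), ('F', 1)),
                          make_huffman_tree(('G', 1), ('H', 1)))))

print(decode_message('0100', sample_tree))
```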
  33. List Comprehension
  34. List Comprehension
     ● List comprehension is a syntactic construct in some programming languages for building lists from list specifications
     ● List comprehension derives its conceptual roots from the set-former (set-builder) notation in mathematics
         [Y for X in LIST]
     ● List comprehension is available in other programming languages such as Common Lisp, Haskell, and OCaml
  35. Set-Former Notation Example
     $\{\, 4 \cdot x \mid x \in \mathbb{N},\ x^2 < 100 \,\}$
     ● $4 \cdot x$ is the output function
     ● $x$ is the variable
     ● $\mathbb{N}$ is the input set
     ● $x^2 < 100$ is the predicate
  36. Set-Former Notation Examples
     ● $\{\, x \in \{a, b\}^* \mid |x| \le 3 \,\}$ is the set of all strings over $\{a, b\}$ whose length is 0, 1, 2, or 3.
     ● $\{\, a^n b^n \mid n \ge 1 \,\}$ is the set of non-empty strings over $\{a, b\}$ such that a's precede b's and the number of a's is equal to the number of b's.
     ● $\{\, xy \mid x \in \{a, b\},\ y \in \{aa, cc\} \,\}$ is the set of strings where a or b is followed by aa or cc.
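The three sets above translate almost directly into list comprehensions. This is a Python 3 sketch with made-up variable names; because the second set is infinite, only its members for n = 1, 2, 3 are generated, which is an arbitrary cut-off chosen for illustration.

```python
from itertools import product

# All strings over {a, b} of length 0, 1, 2, or 3.
short_strings = [''.join(p) for n in range(4) for p in product('ab', repeat=n)]

# a^n b^n for n = 1, 2, 3 (a finite slice of an infinite set).
an_bn = ['a' * n + 'b' * n for n in range(1, 4)]

# xy where x is a or b and y is aa or cc.
pairs = [x + y for x in ('a', 'b') for y in ('aa', 'cc')]

print(len(short_strings))   # 1 + 2 + 4 + 8 strings
print(an_bn)
print(pairs)
```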
  37. For-Loop Implementation
     ### building the list of the set-former example with a for-loop
     >>> rslt = []
     >>> for x in xrange(201):
             if x ** 2 < 100:
                 rslt.append(4 * x)

     >>> rslt
     [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
  38. List Comprehension Equivalent
     ### building the same list with list comprehension
     >>> s = [4 * x for x in xrange(201) if x ** 2 < 100]
     >>> s
     [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
  39. For-Loop
     ### building list of squares of even numbers in [0, 10]
     ### with for-loop
     >>> rslt = []
     >>> for x in xrange(11):
             if x % 2 == 0:
                 rslt.append(x ** 2)

     >>> rslt
     [0, 4, 16, 36, 64, 100]
  40. List Comprehension Equivalent
     ### building the same list with list comprehension
     >>> [x ** 2 for x in xrange(11) if x % 2 == 0]
     [0, 4, 16, 36, 64, 100]
  41. For-Loop
     ### building list of squares of odd numbers in [0, 10]
     >>> rslt = []
     >>> for x in xrange(11):
             if x % 2 != 0:
                 rslt.append(x ** 2)

     >>> rslt
     [1, 9, 25, 49, 81]
  42. List Comprehension Equivalent
     ### building list of squares of odd numbers in [0, 10]
     ### with list comprehension
     >>> [x ** 2 for x in xrange(11) if x % 2 != 0]
     [1, 9, 25, 49, 81]
  43. List Comprehension with For-Loops
  44. For-Loop
     >>> rslt = []
     >>> for x in xrange(6):
             if x % 2 == 0:
                 for y in xrange(6):
                     if y % 2 != 0:
                         rslt.append((x, y))

     >>> rslt
     [(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
  45. List Comprehension Equivalent
     >>> [(x, y) for x in xrange(6) if x % 2 == 0
                 for y in xrange(6) if y % 2 != 0]
     [(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
  46. List Comprehension with Matrices
  47. List Comprehension with Matrices
     ● List comprehension can be used to scan rows and columns in matrices
     >>> matrix = [
             [10, 20, 30],
             [40, 50, 60],
             [70, 80, 90]
         ]
     ### extract all rows
     >>> [r for r in matrix]
     [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
  48. List Comprehension with Matrices
     >>> matrix = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
     ### extract column 0
     >>> [r[0] for r in matrix]
     [10, 40, 70]
  49. List Comprehension with Matrices
     >>> matrix = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
     ### extract column 1
     >>> [r[1] for r in matrix]
     [20, 50, 80]
  50. List Comprehension with Matrices
     >>> matrix = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
     ### extract column 2
     >>> [r[2] for r in matrix]
     [30, 60, 90]
  51. List Comprehension with Matrices
     ### turn matrix columns into rows
     >>> rslt = []
     >>> for c in xrange(len(matrix)):
             rslt.append([matrix[r][c] for r in xrange(len(matrix))])

     >>> rslt
     [[10, 40, 70], [20, 50, 80], [30, 60, 90]]
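The column-to-row loop above can also be written as a single nested list comprehension, or with the built-in `zip`. A short Python 3 sketch, where `transposed` and `transposed_zip` are illustrative names:

```python
matrix = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# Outer comprehension walks the column indices, inner one walks the rows.
transposed = [[row[c] for row in matrix] for c in range(len(matrix[0]))]

# zip(*matrix) pairs up the i-th elements of all rows, i.e. the columns.
transposed_zip = [list(column) for column in zip(*matrix)]

print(transposed)
print(transposed_zip)
```

Using `len(matrix[0])` for the column count keeps the comprehension correct for non-square matrices, provided all rows have the same length.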
  52. List Comprehension with Matrices
     ● List comprehension can work with iterables (e.g., dictionaries)
     ### dictionaries are unordered in Python 2.X, so the order of
     ### the items in the result may differ
     >>> dict = {'a' : 'A', 'bb' : 'BB', 'ccc' : 'CCC'}
     >>> [(item[0], item[1], len(item[0]+item[1])) for item in dict.items()]
     [('a', 'A', 2), ('ccc', 'CCC', 6), ('bb', 'BB', 4)]
  53. List Comprehension
     ● If the expression inside [ ] is a tuple, parentheses are a must
     >>> cubes = [(x, x**3) for x in xrange(5)]
     >>> cubes
     [(0, 0), (1, 1), (2, 8), (3, 27), (4, 64)]
     ● Sequences can be unpacked in list comprehension
     >>> sums = [x + y for x, y in cubes]
     >>> sums
     [0, 2, 10, 30, 68]
  54. List Comprehension
     ● for-clauses in list comprehensions can iterate over any sequences:
     >>> rslt = [c * n for c in 'math' for n in (1, 2, 3)]
     >>> rslt
     ['m', 'mm', 'mmm', 'a', 'aa', 'aaa', 't', 'tt', 'ttt', 'h', 'hh', 'hhh']
  55. List Comprehension & Loop Variables
     ● In Python 2.X, the loop variables used in list comprehension for-clauses (and in regular for-loops) remain defined after execution. In Python 3, this still holds for regular for-loops, but list comprehension variables are local to the comprehension.
     >>> for i in [1, 2, 3]: print i
     1
     2
     3
     >>> i + 4
     7
     >>> [j for j in xrange(10) if j % 2 == 0]
     [0, 2, 4, 6, 8]
     >>> j * 2
     18
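The session above shows Python 2.X behaviour. The Python 3 sketch below checks both cases; the variable names `comp_var` and `leaked` are made up for the demonstration, and the block assumes no variable called `comp_var` exists beforehand.

```python
# A regular for-loop variable is still visible after the loop in Python 3.
for i in [1, 2, 3]:
    pass
after_loop = i + 4

# A list comprehension has its own scope in Python 3, so its loop
# variable is not defined afterwards.
evens = [comp_var for comp_var in range(10) if comp_var % 2 == 0]
try:
    comp_var
    leaked = True
except NameError:
    leaked = False

print(after_loop)
print(evens)
print(leaked)
```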
  56. When To Use List Comprehension
     ● For-loops are easier to understand and debug
     ● List comprehensions may be harder to understand
     ● List comprehensions are usually faster than equivalent for-loops that append to a list
     ● List comprehensions are worth using to speed up simpler tasks
     ● For-loops are worth using when the logic gets complex
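The speed claim is easy to measure on a given machine with the standard `timeit` module. The Python 3 sketch below times the squares-of-even-numbers example in both styles; the function names, input size, and repetition count are arbitrary choices, and the timings will vary between interpreters and machines.

```python
import timeit

def squares_loop(n):
    """Squares of even numbers below n, built with a for-loop."""
    rslt = []
    for x in range(n):
        if x % 2 == 0:
            rslt.append(x ** 2)
    return rslt

def squares_comp(n):
    """The same list, built with a list comprehension."""
    return [x ** 2 for x in range(n) if x % 2 == 0]

# Run each version 200 times on the same input and report total time.
loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comp(1000), number=200)

print('for-loop:           %.4f s' % loop_time)
print('list comprehension: %.4f s' % comp_time)
```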
  57. Reading & References
     ● www.python.org
     ● http://docs.python.org/library/stdtypes.html#typesseq
     ● doc.python.org/howto/unicode.html
     ● Ch 02, M. L. Hetland. Beginning Python: From Novice to Professional, 2nd Ed., APRESS
     ● Ch 02, H. Abelson and G. Sussman. Structure and Interpretation of Computer Programs, MIT Press
     ● S. Roman. Coding and Information Theory, Springer-Verlag
