Data structures¶

Arrays¶

In Python the easiest way to implement a 2D array is a list of lists. For a 3D array: list of lists of lists and so on...

M = [[1, 2], [3, 4]]

You can access the emelents with multiple bracket [ ] indexing:

M[0][1]

2

M = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[0, 9], [0, 0]]]  # 3x2x2 array
M[2][0][1]

9

Write a function that prints a 2D array in a tabular like format:

1   2
3   4

def array_print(M):
    for i in range(len(M)):
        for j in range(len(M[i])):
            print M[i][j], "\t",   # TAB character
        print

array_print([[1,2,3],[4,5,6],[7,8,9]])

1 	2 	3 	
4 	5 	6 	
7 	8 	9

array_print([[1,2,3, 3.5],[4,5,6, 6.5],[7,8,9, 9.5]])

1 	2 	3 	3.5 	
4 	5 	6 	6.5 	
7 	8 	9 	9.5

Embedded data structures¶

Let's say that a store would like to set up a discount system for regular customers. The store's database has the name of the customers (a unique string) and their shopping costs so far. The customers are stored in a list, every element of the list is a pairs: their name and their list of shopping costs. For example one customer entry would look like this:

["Anna", [54, 23, 12, 56, 12, 71]]

The discounts are calculated in the following way:

Total shoppings > 200: $10\%$

Total shoppings > 500: $15\%$

Total shoppings > 1000: $20\%$

Otherwise no discount ($0\%$).

Write a function that calculates the discount for every customer.

The input is the list of customer entries
The output is the list of the discounts in a length 2 list: name and the discount
For example a list of these: ["Anna", 10]

How to begin? Break down to subtasks!

Decide the discount from the total shopping cost ( 228 -> 10 )
calculate the total shopping cost (sum the list of costs)
Do this for every customer and return the results in a result list

There are two ways to achive this (design pattern):

top-down: first write the final function assuming that the smaller subtask (functions) are already done, then do the smaller tasks
bottom-up: write the smaller subtasks first, then work your way up to the final task

# top-down

def discount(customers):
    result = []
    for customer in customers:
        result.append( calculate_discount(customer) )
    return result


def calculate_discount(customer):
    name = customer[0]
    total_cost = 0
    for shopping in customer[1]:
        total_cost += shopping
    return [name, discount_from_total(total_cost)]


def discount_from_total(total):
    if total > 1000:
        return 20
    if total > 500:
        return 15
    if total > 200:
        return 10
    return 0

discount([["Anna", [54, 23, 12, 56, 12, 71]], 
            ["Bill", [11, 3, 12, 1, 12, 55]],
            ["Hagrid", [111, 545, 343, 56, 12, 66]], 
            ["Not_a_wizard", [54, 222, 65, 56, 43, 71]]])

[['Anna', 10], ['Bill', 0], ['Hagrid', 20], ['Not_a_wizard', 15]]

tuple¶

We have already seen strings and lists which are similar to the tuple.

An n-tuple can be created with a comma separeted list in a parenthesis or with the tuple() function:

t = (1, 5, 6, 2, 1)
print t[2]
type(t)

6

tuple

l = [1, 2, 3]
t = tuple(l)
print t

(1, 2, 3)

for loop as before

for e in t:
    print e

1
2
3

But you cannot assign individual elements (like the string)

t[1] = 4

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-141c76cb54a2> in <module>()
----> 1 t[1] = 4

TypeError: 'tuple' object does not support item assignment

Parenthesis is also used for grouping operations ( $2*(3+4)$ ) but that is different from the parenthesis of the tuple. Also in some cases you don't have to write the parenthesis at all.

x = 2, 3, 4
print x

(2, 3, 4)

x, y = 2, 3
print x
print y

2
3

1-tuples are different than a single object in a parenthesis, you can construct a 1-tuple with an ending comma inside a parenthesis.

print type((1))
print type((1,))

<type 'int'>
<type 'tuple'>

Mutable and immutable types¶

The tuple is almost identical to the list except that it is immutable: you cannot assign a single element. The list is mutable.

You cannot change a tuple once created, except creating a new one, like in case of strings:

s = ("h", "e", "l", "l", "o")
print s[2]
s = ("h", "a", "l", "l", "o")

string = "puppy"
string2 = string[:2] + "ff" + string[4:]
print string2

l
puffy

for e in s:
    print e,

h a l l o

s[1] = "e"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-73feb1f8575e> in <module>()
----> 1 s[1] = "e"

TypeError: 'tuple' object does not support item assignment

Dictionary¶

A dictionary is a series of pairs, we call them key-value pairs. A dictionary can have any type of key and any type of value, but the key have to be immutable.

You can contruct dictionaries with a curly bracket { } or with the dict() function.

d = {"puppy": 5}
type(d)

dict

A value can be anything, mutable and immutable.

d = {}
d["cat"] = [1, 5]

You can add a pair to the dictionary by assigning the value to the bracketed key.

d["new key"] = "new value"
print d

{'new key': 'new value', 'cat': [1, 5]}

You can access the elements by their keys in a bracket.

d["cat"]

[1, 5]

You can use mixed keys with different types (as long as immutable):

d = dict()
d[5] = 7
d[(1, 5)] = "puppy"
d[(1, 5)] = "snake"
d["cat"] = 3.14
print d

{(1, 5): 'snake', 5: 7, 'cat': 3.14}

A for loop iterates over the keys:

for key in d:
    print key, ":", d[key]

(1, 5) : snake
5 : 7
cat : 3.14

The order of the pairs are not strinct, they may seem random. This is because of the storing algorithm of the dict.

The dictionary uses a so-called hash function to calculate where to put the elements. Their actual order is not relevant, it can even change during your program. But the key-value pairs are kept togeather.

You can observe the hash values yourself with the hash() function:

print hash((1, 5))
print hash(5), hash(0), hash(False), hash(True)
print hash((5,))
print hash("puppy")
print hash("puppz")
print hash("muppy")
print hash("puppy, cat, python, galahad ...")

3713081631939823281
5 0 0 1
3430023387570
1778334363375243195
1778334363375243192
1523879377499237882
5497814539177767733

Hash function¶

The dictionary uses the hash function, but a similar function is common in both theoretical and applied computer science. There are advanced algorithm courses about hash functions.

What is a hash function:

assign a natural number to every piece of data (the value has a fixed width bit representation: 64, 128, 256 ... )
- it is a function so it assigns the same value to the same thing (every time).
Sort-of unique meaning that there are a few different data with the same hash value
it should be calculated quickly

There may be extra requirements in other applications (cryptographic hash function):

infeasable to invert (find out the data from the hash), not impossible but rather slow
(pseudo)random and non-continuous
infeasable to mimic a given data (find a piece of data to match a given hash value)

Applications¶

checksum (file hash)
indexing, fast access (dict)
pseudo random number generators (cryptography)

Iterating and functions of containers¶

A data type is iterable if it can be iterated over with a for loop. Example:

list
string (characters)
tuple
dict

for x in """iterable""": """do stuff"""

There are some useful functions that can be used on any iterables:

# repetition
print "puppy "*3
print (1, 2, 3)*3
print [1, 2, 3]*3

puppy puppy puppy 
(1, 2, 3, 1, 2, 3, 1, 2, 3)
[1, 2, 3, 1, 2, 3, 1, 2, 3]

# concatenation (except for dict)
(1, 2) + (2, 4, 6)

(1, 2, 2, 4, 6)

# logical "and"
print all((False, True, True, True))
print all((0, 1, 1, 1))

# logical "or"
any((0, 1, 1, 1))

True

# "transpose"
zip([1, 2, 3], [11, 12, 13])

[(1, 11), (2, 12), (3, 13)]

# sum (for numbers, not for strings)
sum((1, 2, 3))

6