Let us consider a list of real numbers and you want to find an element in the list. Suppose that the numbers are all different. If you iterate over the elements to find a particular one then it takes as much steps as the length of the list. This is what we call
time complexity.
But you can do better if the list is already sorted. The advantage of the binary search is that it takes $O(\log(n))$ time but you have to sort your numbers first.
The best performance is when you sort your numbers and don't modify (insert or erase) them later. In this case you can search in them efficiently.
Binary search
import sys
def binary_search(l, x):
a = 0
b = len(l)-1
while a <= b:
i = (a + b)/2
print >>sys.stderr, i,
if l[i] < x:
a = i+1
elif l[i] > x:
b = i-1
else:
print >>sys.stderr
return i
print >>sys.stderr
return None
l = [3,10,-1,3.14,20,2]
l.sort()
print l
print binary_search(l,10)
This search takes at most $\log_2(n)$ steps, since the possible places get halved at every step.
for i in range(-1,11):
print binary_search(range(10),i)
In this algoritm the numbers have to be pre-sorted. You have to be careful when instering a new element.
Write a class for storing a sorted list. The class will keep the order of the elements by insert a new element to its correct place.
class OrderedList(object):
def __init__(self, elements=[]):
self.l = sorted(elements)
def search(self, x):
a = 0
b = len(self.l)-1
while a <= b:
i = (a + b)/2
if self.l[i] < x:
a = i+1
elif self.l[i] > x:
b = i-1
else:
return i
return None
def insert(self, x):
a = 0
b = len(self.l)-1
while a <= b:
i = (a + b)/2
if self.l[i] < x:
a = i+1
elif self.l[i] > x:
b = i-1
else:
return
self.l.insert(a, x)
def erase(self, x):
index = self.search(x)
if index:
del self.l[index]
def __iter__(self):
return iter(self.l)
def __repr__(self):
return str(self.l)
l = OrderedList([1, -1, 2])
print l
l.insert(5)
print l
l.erase(2)
print l
l.insert(3.14)
l.insert(0)
print l
print l.search(1)
A binary tree is a directed graph where every node have at most two outgoing edges.
We will implement a tree for storing numbers in a similar way than in the previous ordered list.
class Node(object):
def __init__(self, data):
self.data = data
self.left = None
self.right = None
def insert(self, data):
if self.data > data:
if self.left is None:
self.left = Node(data)
else:
self.left.insert(data)
elif self.data < data:
if self.right is None:
self.right = Node(data)
else:
self.right.insert(data)
This implementation keeps the tree sorted in a way that smaller numbers always go to the left and greater numbers to the right.
If a node has no outgoing edge then it is marked by None
.
If you want to insert an elements then do the following.
None
then insert a new node (leaf) there.insert
method to that node recursively.No we add a to_list()
method which retrieves the elements in order.
to_list
on the left branch (recursive)to_list
on the right branch (recursive)The result is the list of ordered elements.
class Node(object):
def __init__(self, data):
self.data = data
self.left = None
self.right = None
def insert(self, data):
if self.data > data:
if self.left is None:
self.left = Node(data)
else:
self.left.insert(data)
elif self.data < data:
if self.right is None:
self.right = Node(data)
else:
self.right.insert(data)
def to_list(self):
s = []
if self.left is not None:
s += self.left.to_list()
s.append(self.data)
if self.right is not None:
s += self.right.to_list()
return s
root = Node(5)
root.insert(4)
root.insert(1)
root.insert(7)
print root.to_list()
This is good for storing the elements ordered and also you can find and add elements with $O(\log(n))$ time complexity.
Although this technique is slow if the tree is very uneven (unbalanced). The actual time complexity is the depth (height) of the tree. Look this!
root = Node(5)
root.insert(4)
root.insert(3)
root.insert(2)
root.insert(1)
There are self-balancing binary trees (AVL trees) and you will learn about them on Algorithm theory class.
Now we implement the find(data)
method, it is quite straight forward.
class Node(object):
def __init__(self, data):
self.data = data
self.left = None
self.right = None
def insert(self, data):
if self.data > data:
if self.left is None:
self.left = Node(data)
else:
self.left.insert(data)
elif self.data < data:
if self.right is None:
self.right = Node(data)
else:
self.right.insert(data)
def to_list(self):
s = []
if self.left is not None:
s += self.left.to_list()
s.append(self.data)
if self.right is not None:
s += self.right.to_list()
return s
def find(self, data):
if data == self.data:
return self
elif data < self.data and self.left is not None:
return self.left.find(data)
elif self.right is not None:
return self.right.find(data)
else:
return None
root = Node(5)
root.insert(4)
root.insert(1)
root.insert(7)
print root.find(4)
print root.find(4).to_list()
A very useful application of binary trees are expression trees where the nodes are operations and the leaves are numbers.
If a node is an operation, then store the operation in self.data
and the two branches are the two operands.
The leaves are numbers without outgoing edges.
class Node:
pass
def calculate(node):
if node.data == "+":
return calculate(node.left) + calculate(node.right)
elif node.data == "*":
return calculate(node.left) * calculate(node.right)
else:
return node.data
def write(node):
if node.data == "+" or node.data == "*":
return "(" + write(node.left) + ")" + node.data + "(" + write(node.right) + ")"
else:
return str(node.data)
x0 = Node()
x0.data = "+"
x1 = Node()
x1.data = "*"
x2 = Node()
x2.data = 7
x3 = Node()
x3.data = 5
x1.left = x2
x1.right = x3
x4 = Node()
x4.data = -3
x0.left = x1
x0.right = x4
print calculate(x0)
print write(x0)