• Non ci sono risultati.

Binary Search Trees

7.1 Creating Binary Search Trees

We now know how to do efficient lookup in a binary search tree by exploiting the ordering of the labels and the invariants of the tree. How do we go about obtaining such a tree?

One possibility would be to simply check whether a given tree satisfies the binary search tree invariant. We can code up the invariant as a boolean valued-function that determines whether a given tree is a binary search tree or not.

To do so, we first create two helper functions that can determine whether all of the nodes of a tree are less than (or greater than) a particularintvalue:

1A complete, balanced binary tree of height h has 2h − 1 nodes, or, conversely if there are n nodes, then the tree height is log2n.

2The assumption that the tree is nearly full and balanced is a big one—to ensure that this is the case more sophisticated techniques (e.g. red–black trees) are needed. See CIS 121 if you’re interested in such datastructures.

(* helper functions for writing is_bst *)

(* (tree_less t n) is true when all nodes of t are strictly less than n

*)

let rec tree_less (t:tree) (n:int) : bool = begin match t with

| Empty -> true

| Node(lt, x, rt) ->

x < n && (tree_less lt n) && (tree_less rt n) end

(* (tree_gtr t n) is true when all nodes of t are strictly greater than n

*)

let rec tree_gtr (t:tree) (n:int) : bool = begin match t with

| Empty -> true

| Node(lt, x, rt) ->

x > n && (tree_gtr lt n) && (tree_gtr rt n) end

Theis_bstfunction uses these two helpers to directly encode the binary search tree invariant as a program:

(* determines whether t satisfies the bst invariant *) let rec is_bst (t:tree) : bool =

begin match t with

| Empty -> true

| Node(lt, x, rt) ->

is_bst lt && is_bst rt &&

(tree_less lt x) && (tree_gtr rt x) end

This solution isn’t very satisfactory, though. It is very unlikely that some tree we happen to obtain from somewhere actually satisfies the binary search tree in-variant. Moreover, checking that the tree satisfies the invariant is pretty expensive.

A better way to construct a binary search tree, is to start with a simple binary search tree like Empty, which trivially satisfies the invariant, and then insert or

deletenodes as desired to obtain a new binary search tree.

Each operation must preserve the binary search tree invariant: given a binary search tree t as input, insert t n should produce a new binary search tree that contains the same set of elements astbut additionally containsn. Ifthappens to already containn, then the resulting tree is justtitself.

How can we implement such an insertion function? The key idea is that insert-ing a new element is just like searchinsert-ing for it—if we happen to find the element

we’re trying to insert, then the result is just the input tree. On the other hand, if the input tree does not already contain the element we’re trying to insert, then the search will find a Empty tree, and we just need to replace that empty tree with a new leaf node containing the inserted element. In code:

(* ASSUMES: t is a binary search tree.

Inserts n into the binary search tree t, yielding a new binary search tree *) let rec insert (t:tree) (n:int) : tree =

begin match t with

| Empty ->

(* element not found, create a new leaf *) Node(Empty, n, Empty)

| Node(lt, x, rt) ->

if x = n then t

else if n < x then Node (insert lt n, x, rt) else Node(lt, x, insert rt n)

end

The code forinsertexactly mirrors that oflookup, except that it returns a tree at each stage rather than simply searching for the element. We have to check that the resulting tree maintains the binary search tree invariant, but this is easy to see, since we only ever insert a nodento the left of a nodexwhennis strictly less than

x. (And similarly for insertion into the right subtree.)

Deletion is more complex, because there are several cases to consider. If the node we are trying to delete is not already in the tree, then delete can simply return the original tree. On the other hand, if the node is in the tree, there are three possibilities. First: the node to be deleted is a leaf. In that case, we simply remove the leaf node by replacing it withEmpty. Second, the node to be deleted has exactly one child. This case too is easy to handle: we just replace the deleted node with its child tree. The last case is when the node to be deleted has two non-empty subtrees. The question is how delete the node while still maintaining the binary search tree invariant.

This requires a bit of cleverness. Observe that the left subtree must be non-empty, so it by definition contains a maximal element, call itm, that is still strictly less than n, the node to be deleted. Note also that m is strictly less than all of the nodes in the right subtree of n. Both of these properties follow from the binary search tree invariant. We can therefore promotemto replacen in the resulting tree, but we have to also removem(which is guaranteed to be a leaf) at the same time.

Putting all of these observations together gives us the following code:

(* returns the maximum integer in a NONEMPTY bst t *) let rec tree_max (t:tree) : int =

begin match t with

| Empty -> failwith "tree_max called on empty tree"

| Node(_,x,Empty) -> x

| Node(_,_,rt) -> tree_max rt end

(* returns a binary search tree that has the same set of nodes as t except with n removed (if it's there) *) let rec delete (n:int) (t:tree) : tree =

begin match t with

| Empty -> Empty

| Node(lt,x,rt) ->

if x = n then

begin match (lt,rt) with

| (Empty, Empty) -> Empty

| (Node _, Empty) -> lt

| (Empty, Node _) -> rt

| _ -> let m = tree_max lt in Node(delete m lt, m, rt) end

else

if n < x then Node(delete n lt, x, rt) else Node(lt, x, delete n rt)

end