# Creating Binary Search Trees

## Binary Search Trees

### 7.1 Creating Binary Search Trees

We now know how to do efficient lookup in a binary search tree by ex-ploiting the ordering of the labels and the invariants of the tree. How do we go about obtaining such a tree?

One possibility would be to simply check whether a given tree satisfies the binary search tree invariant. We can code up the invariant as a boolean valued-function that determines whether a given tree is a binary search tree or not.

To do so, we first create two helper functions that can determine whether all of the nodes of a tree are less than (or greater than) a particular

intvalue:

(* helper functions for writing is_bst *)

(* (tree_less t n) is true when all nodes of t are strictly less than n

*)

let rec tree_less (t:tree) (n:int) : bool = begin match t with

| Empty -> true

| Node(lt, x, rt) ->

x < n && (tree_less lt n) && (tree_less rt n) end

(* (tree_gtr t n) is true when all nodes of t are strictly greater than n

*)

let rec tree_gtr (t:tree) (n:int) : bool = begin match t with

| Empty -> true

| Node(lt, x, rt) ->

x > n && (tree_gtr lt n) && (tree_gtr rt n) end

The is_bstfunction uses these two helpers to directly encode the bi-nary search tree invariant as a program:

121 if you’re interested in such datastructures.

(* determines whether t satisfies the bst invariant *) let rec is_bst (t:tree) : bool =

begin match t with

| Empty -> true

| Node(lt, x, rt) ->

is_bst lt && is_bst rt &&

(tree_less lt x) && (tree_gtr rt x) end

This solution isn’t very satisfactory, though. It is very unlikely that some tree we happen to obtain from somewhere actually satisfies the bi-nary search tree invariant. Moreover, checking that the tree satisfies the invariant is pretty expensive.

A better way to construct a binary search tree, is to start with a simple binary search tree like Empty, which trivially satisfies the invariant, and theninsertordeletenodes as desired to obtain a new binary search tree.

Each operation must preserve the binary search tree invariant: given a binary search tree t as input, insert t n should produce a new binary search tree that contains the same set of elements as t but additionally containsn. Ifthappens to already containn, then the resulting tree is just

titself.

How can we implement such an insertion function? The key idea is that inserting a new element is just like searching for it—if we happen to find the element we’re trying to insert, then the result is just the input tree. On the other hand, if the input tree does not already contain the element we’re trying to insert, then the search will find a Emptytree, and we just need to replace that empty tree with a new leaf node containing the inserted element. In code:

(* ASSUMES: t is a binary search tree.

Inserts n into the binary search tree t, yielding a new binary search tree *) let rec insert (t:tree) (n:int) : tree =

begin match t with

| Empty ->

(* element not found, create a new leaf *) Node(Empty, n, Empty)

| Node(lt, x, rt) ->

if x = n then t

else if n < x then Node (insert lt n, x, rt) else Node(lt, x, insert rt n)

end

The code forinsertexactly mirrors that oflookup, except that it returns a tree at each stage rather than simply searching for the element. We have to check that the resulting tree maintains the binary search tree invariant, but this is easy to see, since we only ever insert a node n to the left of a nodexwhennis strictly less thanx. (And similarly for insertion into the right subtree.)

Deletion is more complex, because there are several cases to consider.

If the node we are trying to delete is not already in the tree, then delete can simply return the original tree. On the other hand, if the node is in the tree, there are three possibilities. First: the node to be deleted is a leaf.

In that case, we simply remove the leaf node by replacing it withEmpty. Second, the node to be deleted has exactly one child. This case too is easy to handle: we just replace the deleted node with its child tree. The last case is when the node to be deleted has two non-empty subtrees. The question is how delete the node while still maintaining the binary search tree invariant.

This requires a bit of cleverness. Observe that the left subtree must be non-empty, so it by definition contains a maximal element, call itm, that is still strictly less thann, the node to be deleted. Note also thatmis strictly less than all of the nodes in the right subtree ofn. Both of these properties follow from the binary search tree invariant. We can therefore promote m to replacen in the resulting tree, but we have to also remove m(which is guaranteed to be a leaf) at the same time.

Putting all of these observations together gives us the following code:

(* returns the maximum integer in a NONEMPTY bst t *) let rec tree_max (t:tree) : int =

begin match t with

| Empty -> failwith "tree_max called on empty tree"

| Node(_,x,Empty) -> x

| Node(_,_,rt) -> tree_max rt end

(* returns a binary search tree that has the same set of nodes as t except with n removed (if it’s there) *) let rec delete (n:int) (t:tree) : tree =

begin match t with

| Empty -> Empty

| Node(lt,x,rt) ->

if x = n then

begin match (lt,rt) with

| (Empty, Empty) -> Empty

| (Node _, Empty) -> lt

| (Empty, Node _) -> rt

| _ -> let m = tree_max lt in Node(delete m lt, m, rt) end

else

if n < x then Node(delete n lt, x, rt) else Node(lt, x, delete n rt)

end