# Partial Functions: option types

Consider the problem of computing the maximum integer from a list of integers.

At first sight, such a problem seems simple—we simply look at each element of the list and retain the maximum. We can start to program this functionality very straightforwardly by using the by-now-familiar list recursion pattern:

let rec list_max (l:int list) : int = begin match l with

| [] -> (* what to do here? *)

| x::tl -> max x (list_max tl) end

Unfortunately, this program doesn’t have a good answer in the case that the list is empty—there is no “maximum” element of the empty list.

What can we do? One possibility is to simply have the list_maxfunction fail whenever it is called on an empty list. This solution requires that we handle three separate cases, as shown below:

let rec list_max (l:int list) : int = begin match l with

| [] -> failwith "list_max called on []"

| x::[] -> x

| x::tl -> max x (list_max tl) end

The cases cover the empty list, the singleton list, and a list of two-or more ele-ments. We have to separate out the singleton list as a special case because it is the first length for which thelist_maxfunction is well defined.

This solution is OK, and can be very appropriate if we happen to know by external reasoning thatlist_max will never be called on an empty lists. We saw such use of failure in the tree_maxfunction used to find the maximal element of

a nonempty binary search tree in §6. However, what if we can’t guarantee that the

list_max function won’t be called on an empty list? How else could we handle this possibility without failing, and thereby aborting the program1

The problem is that list_maxis an example of a partial function—it isn’t well-defined for all possible inputs. Other examples of partial functions that you have encountered are the integer division operation, which isn’t defined when dividing by0, and theMap.findfunction, which can’t return a good value in the case that a key isn’t in its set of key–value bindings.

It turns out that we can do better than failing, and, moreover, we already have all the tools needed. The idea is to create a datatype that explicitly represents the absence of a value. We call this datatype the'a optiondatatype, and it is defined like this:

type 'a option =

| None

| Some of 'a

There are only two cases: either the value is missing, represented by the con-structor None, or the value is present, represented by Some v. As with any other datatype, we use pattern matching to determine which case occurs.

Option types are very useful for representing the lack of a particular value. For a partial function like list_max, we can alter the return type to specify that the result is optional. Doing so leads to an implementation like this:

(* NOTE: this version has a different return type! *) let rec list_max (l:int list) : int option =

begin match l with

| [] -> None (* indicates partiality *)

| x::tl -> begin match (list_max tl) with

| None -> Some x

| Some m -> Some max x m end

end

As you can see, this implementation handles the same three cases as in the one that uses failwith; the difference is that after the recursive call we must explic-itly check (by pattern matching against the result) whether list_max tl is well defined.

At first blush, this seems like a rather awkward programming style, and the need to explicitly check forNoneversusSome vonerous. However, it is often

possi-1OCaml, like Java and other modern languages, supports exceptions that can be caught and handled to prevent the program from aborting in the case of a failure. Exceptions are another way of dealing with partiality; we will cover them later in the course.

ble to write the program in such a way such checks can be avoided. For example, here is a cleaner way to write the samelist_maxfunction by usingfold:

let rec list_max (l:int list) : int option = begin match l with

| [] -> None

| x::tl -> Some (fold max x tl) end

The expressionfold max x tltakes the maximum element from among those intlandx, which is always well-defined.

It is also worth pointing out that because the type 'a option is distinct from the type'a, it is never possible to introduce a bug by confusing them—OCaml will force the programmer to do a pattern match before being able to get at the'avalue in an'a option.

For example, suppose we wanted to find the sum of the maximum values of each of two lists. We can write this program as:

let sum_two_maxes (l1:int list) (l2:int list) : int option = begin match (list_max l1, list_max l2) with

| (None, None) -> None

| (Some m1, None) -> Some m1

| (None, Some m2) -> Some m2

| (Some m1, Some m2) -> Some (m1 + m2) end

Here we are forced to explicitly think about what to do in the case that both lists are empty—here we make the choice to make sum_two_maxes itself return an int option. The option types prevent us from mistakenly trying to naively do (list_max l1) + (list_max l2), which would result in a program crash (or worse) if permitted.

In languages like C, C++, and Java that have anullvalue which can be given any type, it is an extremely common mistake to conflatenull, which should mean

“the lack of a value,” with “an empty value” of some type. For example, one might try to represent the empty list asnull. However, such conflation very often leads to so-called “null pointer exceptions” that arise because some part of the program treats anullas though it has type'a, when it is really meant to be the Noneof an

'a option.

There is a big difference between “the absence of a list” and “an empty list”—

it makes sense to insert an element into the empty list, for example, but it never makes sense to insert an element into “the absence of a list.”

Sir Tony Hoare, Turing-award winner and a researcher scientist at Microsoft, invented the idea of null in 1965 for a language called ALGOL W. He calls it his

“billion-dollar mistake” because of the amount of money the software industry has spent fixing bugs that arise due to unfortunate confusion of the'aand'a option types. Option datatypes provide a simple solution to this problem.

We will see that option types also play a crucial role when we study linked, mutable datastructures in §16.