Introducing functional programming

Roger Bailey <rb@doc.ic.ac.uk>
  1. Functions in conventional languages
  2. Programming with functions
  3. A simple Hope example — conditionals
  4. Using functions that weve defined
  5. A more interesting example — repetition
  6. Another way of using functions
  7. Other kinds of data

Functions in conventional languages

In a language like Pascal, a function is a piece of packaged program for performing standard operations like finding square roots. To obtain the square root of a positive number stored in a variable x, we write:

sqrt (x)

at the point in the program where we want the value, such as:

writeln (1.0 + sqrt (x));

this is called an application of the function. The value represented by x is called the argument or actual parameter. In this context, the canned program computes the square root of x, 1.0 is added to it and the result is then printed.

We can also define our own functions specifying how the result is computed using ordinary Pascal statements. Here's a function that returns the greater of its two argument values:

function max (x, y : INTEGER) : INTEGER;
begin
if x > y
    then max := x
    else max := y
end;

The identifiers x and y are called formal parameters. They're used inside the definition to name the two values that will be supplied as arguments when the function is applied. We can use max anywhere we need a value, just like sqrt. Here's how we might use max to filter out negative values on output:

writeln (max (z, 0));

A more interesting case is when the actual parameter is a function application itself or involves one. We can use max to find the largest of three numbers by writing:

max (a, max (b, c))

Combining functions together like this is called composition. The expression is evaluated inside-out because the outer application of max can't be evaluated until the value of its second argument is known. The inner application of max is therefore evaluated first using the values of b and c and the result is used as the actual parameter of the outer application.

Another way of combining functions together is to define more powerful ones using simpler ones as building blocks. If we often need to find the largest of three numbers we might define:

function MaxOf3 (x, y, z : INTEGER) : INTEGER;
begin
    MaxOf3 := max (x, max (y, z))
end;

and apply it by writing:

MaxOf3 (a, b, c)

Programming with functions

Pascal is called an imperative language because programs written in it are recipes for doing something. If our programs consist only of functions, we can concentrate on what the results are and ignore how they're computed. Forget that sqrt is a piece of code and think of sqrt ( x ) as a way of writing a value in your program, and you'll get the idea. You can think of MaxOf3 like this as well if you ignore the way it works inside. By defining a toolkit of useful functions and combining them together like this, we can build powerful programs that are quite short and easy to understand.

In Pascal, functions can only return simple data objects such as numbers or characters, but real programs use big data structures and can't easily be written using these functions. In Hope, functions can return any type of value, including data structures equivalent to Pascal's arrays and records and much more. Programming in Hope has the flavour of simply writing down the answer by writing an expression that defines it. This will contain one or more function applications to define smaller parts of the answer. These functions won't usually be built in like sqrt, so we'll have to define them ourselves, but we'll still think of them as definitions of data objects, and not as algorithms for computing them.

A simple Hope example — conditionals

Let's see how we can define max in Hope. Like Pascal, it's a strongly-typed language, which means we must tell the compiler about the types of all objects in our programs so it can check that they're used consistently. The function definition comes in two parts. First we declare the argument and result types:

dec max : num # num -> num ;

dec is a reserved word: you can't use it as a name. max is the name of the function being defined. Names consist of upper and lower case letters (which are distinct) and digits, and must start with a letter. The current fashion is to use lower case. The layout isn't significant and you can separate symbols with any number of blanks, tabs and newlines for clarity, as in this example. Symbols need only be separated when they might otherwise be confused as one, such as dec and max.

The next part of the declaration gives the types of the arguments (read the symbol : as takes a). Non-negative integers are of the predefined type num (in lower case). # is read as and a; (or you can use the reserved word X). -> is read as yields. The semicolon marks the end of the declaration, which tells the compiler that max takes two numbers as arguments and returns a single number as its result.

The result of a function is defined by one or more recursion equations. max needs only one equation to define it:

--- max ( x, y ) <=  if x > y then x else y ;

Read the symbol --- as the value of. The expression max ( x, y ) is called the left-hand side of the equation. It defines x and y as formal parameters, or local names for the values that will be supplied when the function is applied. Parameter names are local to the equation, so x and y won't be confused with any other x or y in the program. The symbol <= is read as is defined as.

The rest of the equation (called the right-hand side) defines the result. It's a conditional expression. The symbols if, then and else are reserved words. Pascal's conditional statement chooses between alternative actions, but Hope's conditional expression chooses between alternative values, in line with our view that function applications are ways of writing values rather than recipes for computing them. If the value of the expression x > y is true, the value of the whole conditional expression is the value of x, otherwise it's the value of y. The alternative values can be defined by any Hope expressions.

When the value of a function is defined by more than one expression like this, they are evaluated in an unspecified order. On a suitable computer, such as the Imperial College ALICE machine, it's even possible to evaluate both expressions and the test in parallel and throw away one of the values according to the result of the test.

Using functions that we've defined

A Hope program is just a a single expression containing one or more function applications composed together. It's evaluated immediately and the result and its type are printed on the screen. Here's a simple program that uses max, with its output on the line below:

max ( 10, 20 ) + max ( 1, max ( 2,3 ) ) ;
23 : num

The rules for evaluating the expression are the same as those of Pascal: function arguments are evaluated first, the functions are applied, and finally other operations are performed in the usual order of priority.

We can also use existing functions to define new ones. Here's the Hope version of MaxOf3:

dec MaxOf3 : num # num # num -> num ;
--- MaxOf3 ( x, y, z ) <= max ( x, max ( y, z ) ) ;

A more interesting example — repetition

Just as Pascal's conditional statement is replaced by Hope's conditional value, so the repetitive statement is replaced by the repetitive value. Here's a Pascal function that multiplies two numbers using repeated addition:

function mult (x, y : INTEGER) : INTEGER;
var prod : INTEGER;
begin
    prod := 0;

    while y > 0 do
    begin
        prod := prod + x;
        y := y - 1
    end;

    mult := prod
end;

It's hard to be sure this function does enough additions (it took me three tries to get it right) and this seems to be a general problem with loops in programs. A common way of checking imperative programs is to simulate their execution. If we do this for input values of 2 and 3, we'll find that prod starts with the value 0 and gets values of 2, 4 and 6 on successive loop iterations, which suggests the definition is correct.

Hope doesn't have any loops, so we must write all the additions that the Pascal program performed in a single expression. It's much easier to see that this has the right number of additions:

dec mult : num # num -> num ;
--- mult ( x, y ) <= 0 + x + x + ...

or would be if we knew how many times to write + x. The hand simulation suggests we need to write it y times, which is tricky when we don't know the value of y. What we do know is that for a given value of y, the expressions:

mult ( x, y )

and

mult ( x, y-1 ) + x

will have the same number of + x terms if written out in full. The second one always has two terms, whatever the value of y, so we'll use it as the definition of mult:

--- mult ( x, y ) <= mult ( x, y-1 ) + x ;

On the face of it we've written something ridiculous, because it means we must apply mult to find the value of mult. Remember however that this is really shorthand for 0 followed by y occurrences of + x. When y is zero, the result of mult is also zero because there are no + x terms. In this case mult isn't defined in terms of itself, so if we add a special test for it, the definition terminates. A usable definition of mult is:

--- mult ( x, y ) <= if y = 0 then 0 else mult ( x, y-1 ) + x ;

Functions that are defined using themselves like this are called recursive. Every Pascal program using a loop can be expressed as a recursive function in Hope. All recursive definitions need one case (called the base case) where the function isn't defined in terms of itself, just as Pascal loops need a terminating condition.

Another way of using functions

Hope allows us to use a function with two arguments like mult as an infix operator. We must assign it a priority and use it as an operator everywhere including the equations that define it. The definition of mult when used as an infix operator looks like this:

infix mult : 8 ;
dec mult : num # num -> num ;
--- x mult y <= if y = 0 then 0 else x mult ( y - 1 ) + x ;

A bigger number in the infix declaration means a higher priority. The second argument of mult is parenthesised because its priority of 8 is greater than the built-in subtraction operation. Most of Hope's standard functions are supplied as infix operators.

Other kinds of data

Hope provides two other primitive data types. A truval (truth value) is equivalent to a Pascal Boolean and has values true and false. We've already seen the expression x > y defining a truth value. > is a standard function whose type is num # num -> truval. We can use truth values in conditional expressions and combine them together with the standard functions and, or and not.

Single characters are of type char, with values 'a', 'b' and so on. Characters are most useful as components of data structures such as character-strings.