Published on 18 August 2024 by Jing Huang.
These notes are based on the book Type Theory and Formal Proof by Rob Nederpelt and Herman Geuvers.
Lambda calculus encapsulates a formation of the basic aspects of functions. Untyped lambda calculus means we do not consider the types in this chapter.
Expressions in the lambda calculus are called lambda terms. We assume the existence of an infinite set \(V=\{x,y,z,\ldots\}\) of so-called variables while the below inductive definition establishes how the set \(\Lambda\) of all lambda terms are constructed:
An inductive definition of \(\Lambda\) means that the above definitions are the only ways to construct elements of \(\Lambda\). An alternative and shorter way is to define \(\Lambda\) via abstract syntax:
The syntactical identity of two lambda terms can be denoted with the symbol \(\equiv\).
The subterms of a given lambda term form a multiset:
They follow reflexivity and transitivity. A proper subterm of \(M\) is \(L\) if \(L\not\equiv M\).
The expression have the following precedence and associativity:
Variable occurrences in a lambda term can be divided into three categories: free occurrences, bound occurrences and binding occurrences.
Binding variables are those occurred immediately after a \(\lambda\). In the construction of a lambda term, the single variable and the variables that aren’t bound by the binding variables are free variables. However, during the process of abstraction, an occurrence of \(x\) which is free in \(M\) becomes bound in \(\lambda x.M\); that is, abstraction of \(x\) over \(M\) binds all free occurrences of \(X\) in \(M\).
Then we have the following recursive rule, in which \(\mathcal F(L)\) denotes the set of free variables in the lambda term \(L\).
The lambda term is closed if \(\mathcal F(M)=\emptyset\). A closed set is also called a combinator. The set of all closed lambda terms is denoted by \(\Lambda^0\).
Functions in the lambda notation have the property that the name of the binding variable is not essential; the name is only used to express the transformation or procedure from input to output. Thus the relation alpha conversation or alpha equivalence is defined to describe this process formally. It’s based on the possibility of renaming binding (and bound) variables.
\(M^{x\to y}\) denote the result of replacing every free occurrence of \(x\) in \(M\) by \(y\). The relation \(\lambda x.M=_\alpha\lambda y.M^{x\to y}\) is called renaming.
In the renaming of \(\lambda x.M\) to \(\lambda y.M^{x\to y}\), it is prevented that the new binding variable \(y\) binds old free \(y\)s; and that any old binding \(y\) binds a new \(y\).
Alpha conversation follows compatibility, reflectivity, symmetry and transitivity. If \(M\) and \(N\) are said to be alpha convertible or alpha equivalent, then \(M\) is called an alpha variant of \(N\).
We can abstract from the names of the bound (and binding) variables, by considering lambda terms modulo alpha equivalence.
The suffix \([x:=N]\) not in lambda terms, are meant to be meta-notation called substitution, which follow:
The subterm of the form \((\lambda x.M)N\) is called a redex (from reducible expression) and subterm \(M[x:=N]\) is called the contractum (of the redex).
Note that in general, the substitutions \(M[x:=N][y:=L]\) does not describe the same lambda term as \(M[y:=L][x:=N]\) because of the left associativity (\(M[x:=N][y:=L]\equiv M[y:=L][x:=N[y:=L]]\land x\notin\mathcal F(L)\)).
We define the relation \(\to_\beta\) to be one-step beta reduction since one redex is replaced by its contractum. The further beta reduction performed after the one-step reduction can be defined using \(\twoheadrightarrow_\beta\) which generally take zero or more step. We can prove that \(\twoheadrightarrow_\beta\) is reflective and transitive.
Similar to alpha conversation, we can define beta conversation or beta equivalence. If \(M=_\beta N\) and there is an \(n\geq 0\) and there are terms \(M_0\) or \(M_n\) such that \(M_0\equiv M\), \(M_n\equiv N\) for all \(i\) such that \(0\leq i<n\):
\(\twoheadrightarrow_\beta\) extends \(\to_\beta\) to multi-steps, while \(=_\beta\) further extends \(\twoheadrightarrow_\beta\) in both directions. Thus \(=_\beta\) is reflective, symmetric and transitive.
A reduction path from \(M\) is an infinite or finite sequence \(N_0,N_1,N_2,\ldots\) that is connected using one-step beta reduction that starts from \(M\).
Suppose that for a given lambda term \(M\), we have \(M\twoheadrightarrow_\beta N_i\) and \(M\twoheadrightarrow_\beta N_2\), then there is a lambda term \(N_3\) such that \(N_1\twoheadrightarrow_\beta N_3\) and \(N_2\twoheadrightarrow_\beta N_3\). This is the Church-Rosser Theorem.
Every lambda term \(L\) has a fixed point, which can be constructed using a fixed point combinator. One instance is:
Then we have \(YL\) which is a fixed point of \(L\), since \(L(YL)=_\beta YL\), which can be shown as follows:
In order to get a firmer hold on the desired behavior of functions, we introduce types. Functions are been classified to a certain collection, and certain restrictions on the input values are given.
The simple types introduced in this chapter prevents the anomalies but are also in several senses too restrictive. We will enlarge the expressivity of the system in subsequent chapters.
We first define an infinite set of type variables \(\mathbb{V}=\{\alpha,\beta,\gamma,\ldots\}\) and then define the set of simple types \(\mathbb{T}\):
It can be expressed using abstract syntax as \(\mathbb{T}=\mathbb{V}\mid\mathbb{T}\to\mathbb{T}\).
The Greek letters \(\alpha,\beta,\gamma,\ldots\) and variants are used for type variables belonging to \(\mathbb{V}\). Arbitrary simple types are denoted using \(\sigma,\tau,\ldots\). Their outermost parentheses may be omitted and for arrow-types, they are right-associative.
Type variables are abstract representations of basic types, such as \(nat\) for natural numbers or \(list\) for lists. Arrow types represent function types, such as \(nat\to real\) which abstracts a function which input is a natural number and the returned result is a real number.
We also have statements which allow us to express something like term \(M\)
has type \(\sigma\)
in the form \(M:\sigma\). We assume we
have infinitude of variables available for each type \(\sigma\), thus we assume that each
variable \(x\) has a unique type: if \(x:\sigma\) and
\(x:\tau\), then \(\sigma\equiv\tau\).
By adding typing to the basic construction principles of lambda calculus, we have:
There are two side conditions which have to be satisfied in the typing of an application \(MN\): the left-hand side \(M\) of the application must have a function type \(\sigma\to\tau\) while the right-hand side \(N\) of the application must match with the input type \(\sigma\); the output type will then be \(\tau\). For abstraction \(\lambda x.M\), we just need the types of \(x\) and \(M\).
A term \(M\) is called typable if there is a type \(\sigma\) such that \(M:\sigma\).
To type a lambda term, we should first type its variables. We can give types to variables in two ways:
We denote the types of bound variables immediately after their introduction following a \(\lambda\) while the types of free variables are given in a so-called context (basis):
The definition of lambda terms have to be modified to incorporate type information. This new set of pre-typed \(\Lambda_\mathbb{T}\) is defined by \(\Lambda_\mathbb{T}=V\mid(\Lambda_\mathbb{T}\Lambda_\mathbb{T})\mid(\lambda V:\mathbb{T}.\Lambda_\mathbb{T})\).
A judgment has the form \(\Gamma\vdash M:\sigma\), with \(\Gamma\) a context and \(M:\sigma\) a statement.
Below we give three derivation rules to form a so-called derivation system for Church’s \(\lambda_\to\) in the premiss-conclusion format.
A pre-typed term \(\lambda_\to\) is called legal if there exist context \(\Gamma\) and type \(\rho\) such that \(\Gamma\vdash M:\rho\).
This is one of the three problems connected with judgments in type theory: to find an appropriate context and type if the term is indeed legal.
For example, we want to show \(M\equiv\lambda y:\alpha\to\beta.\lambda z:\alpha.yz\) is valid. Hence, our task is to find a type \(\rho\) such that \(\Gamma\vdash M:\rho\). First we determine the context \(\Gamma\) which \(\Gamma\equiv\emptyset\) suffices since there are no free variables in \(M\). Then we need to find \(\rho\).
Thus we can conclude that \(\lambda y.\alpha\to\beta.\lambda z:\alpha.yz\) is valid since we find \(\rho\) which is \((\alpha\to\beta)\to(\alpha\to\beta)\).
The name type checking is straightforward, it means to check the validity of a full judgment.
As an example, we can construct a derivation for \(x:\alpha\to\alpha,y:(\alpha\to\alpha)\to\beta \vdash(\lambda z:\beta.\lambda u:\gamma.z)(yz):\gamma\to\beta\).
Hence we have succeeded in giving a proper deviation of the judgment.
This section presents the third of the general problem in type theory, namely to find an appropriate term of a certain type, in a certain context. A term which belongs to a certain type is called an inhabitant of that type.
To find an inhabitant of a certain type, we can think of the type as proposition, while each inhabitant codes a proof of this proposition.
For instance, to find the term of \(A\to B\to A\), we can follow:
The above process is generally called the PAT-interpretation, where PAT means both
propositions-as-types
and proofs-as-terms
.
We give a number of definitions about context below:
An important property concerning the free variables occurring in a judgment is that if \(\Gamma\vdash L:\sigma\), then \(\mathcal F(L)\subseteq\mathcal D(\Gamma)\). This can be proved by induction on the deviation of the judgment \(\mathcal J\equiv\Gamma\vdash L:\sigma\).
We continue with three other properties which are trivial and can also be proved by induction.
The Generation Lemma, which says precisely how a certain judgment can be generated, is shown below:
It’s obvious that, in order to build a legal term, its subterms should be legal too. This is the Subterm Lemma.
Another important property which applies for Church’s \(\lambda_\to\) is that, a term may have at most one type. This ensures that the type, if exists, must be unique. Therefore we also have the uniqueness of types.
In order to be able to treat substitution, an operation at the heart of beta reduction, in \(\lambda_\to\), we have to append type information. We then have the Substitution Lemma: assume \(\Gamma',x:\sigma,\Gamma''\vdash M:\tau\) and \(\Gamma'\vdash N:\sigma\); then \(\Gamma',\Gamma''\vdash M[x:=N]:\tau\). This means that if we substitute all occurrences of context variable by a term of the same type, the result type is unchanged which is intuitively understandable.
We also have to adjust the definition for beta reduction to the (pre-typed) terms of \(\Lambda_\mathbb{T}\):
The Church-Rosser Theorem is still valid as types clearly play no role in the reduction process.
The Subject Reduction states that beta reduction does not affect typability, and even does not change the term’s type: if \(\Gamma\vdash L:\rho\) and if \(L\twoheadrightarrow_\beta L'\), then \(\Gamma\vdash L':\rho\). Proof can be done by inducting the one basic and three compatibility cases of beta reduction.
Finally, one can prove that there are no infinite reduction sequences in \(\lambda_\to\), or every legal term is strongly normalization. This is the String Normalization Theorem or Termination Theorem.
In Church’s \(\lambda_\to\), we only encounter abstraction and application on the term level, which is first order, as abstraction and application are over terms. In the present chapter, second order operations, or terms depending on types, are introduced.
This is called the second order lambda calculus, written as \(\lambda2\) for short.
To construct a function that can handle terms of different types, we can consider an arbitrary type, and use this type as another abstraction for our function \(\lambda\alpha:*.\lambda x:\alpha.x\).
This function receives a type variable \(\alpha\), while the \(*\) symbol denotes the type of all types. The function acts as a term depending on a type, thus is second ordered or polymorphic.
To type this polymorphic term, we introduce a new binder, the \(\Pi\) binder. We can then express the type of the polymorphic function created earlier as \(\Pi\alpha:*.\alpha\to\alpha\).
By an obvious extension of the alpha conversation, we obtain:
Since we allow second order abstraction, second order application and \(\Pi\) types, our derivation system for \(\lambda_\to\) has to be extended.
The second order application rule is intuitive, that two types of arbitrary type can be interchanged. This abstraction rule also corresponds to our expectations, as presented in the previous section where the \(\Pi\) types are introduced.
To describe the complete system, we have to first extend our definition of types. The abstract syntax for \(\lambda2\) types is \(\mathbb{T}2=\mathbb{V}\mid(\mathbb{T}2\to\mathbb{T}2)\mid(\Pi\mathbb{V}:*.\mathbb{T}2)\) while \(\mathbb{V}\) is the set of type variables.
Then we extend our set of pre-typed lambda terms so that second order abstraction and application are allowed:
Note that now we have two classes of variables: object variables \(V\) and type variables \(\mathbb{V}\). We also have first order application \((\Lambda_{\mathbb{T}2}\Lambda_{\mathbb{T}2})\) and second order application \((\Lambda_{\mathbb{T}2}\mathbb{T}2)\) as well as first order abstraction \((\lambda V:\mathbb{T}2.\Lambda_{\mathbb{T}2})\) from object variables and second order abstraction \((\lambda\mathbb{V}:*.\Lambda_{\mathbb{T}2})\).
The convention is similar to that in untyped and simply typed lambda calculus:
The notation of declaration should as well be extended to allow second order declarations:
The rule that all variables must be declared before they can be used motivates the following recursive definition of the \(\lambda2\) context, which the new definition of the domain of a context is combined:
Conforming with the new notion of context, we adapt the variable rule and define the formation rule which defines a properly formed \(\lambda2\) type. We assume \(\Gamma\) is a \(\lambda2\) context.
Finally, we define the legality of \(\lambda2\): a term \(M\) in \(\Lambda_{\mathbb{T}2}\) is called legal if there exists a \(\lambda2\) context \(\Gamma\) and a type \(\rho\) in \(\mathbb{T}2\) such that \(\Gamma\vdash m:\rho\).
The definition of alpha conversion is adapted and extended accommodating \(\Pi\) types:
We also extend the beta reduction in an obvious way matching the extensions of alpha conversation:
Most properties for \(\lambda_\to\) defined in the previous chapter still applies, except the Permutation Lemma, which holds only if the permuted context is a \(\lambda2\) context.
In this chapter, we discuss the way to construct generalized types, by abstracting types. This results in the system \(\lambda\underline{\omega\!}\,\).
We can handle the abstraction of types of terms similar to the way we handle arbitrary types: types like \(\beta\to\beta\), \(\gamma\to\gamma\), \((\beta\to\gamma)\to(\beta\to\gamma)\) can be generalized using a function \(\lambda\alpha:*.\alpha\to\alpha\) with type as value. This is a so-called type constructor, with type \(*\to*\) itself. Proper constructor are the type constructors which are not types.
We name the type of the type constructors consisting of \(*\) alone and of \(*\) symbols with arrows in between kind. They belong to set \(\mathbb{K}\) with abstract syntax \(\mathbb{K}=*\mid(\mathbb{K}\to\mathbb{K})\). Outermost parentheses may be omitted, and the kinds are right-associative. For the type of all kinds, a new symbol \(\square\) is introduced.
We use the word sort or symbol \(s\) for either \(*\) or \(\square\); by definition, the set of sorts is \(\{*,\square\}\).
So now there are four levels in our syntax:
First, we formalize the fact that the super-type \(*\) is of type \(\square\) using the sort rule.
The variable rule is used to establish that all declarations occurring in a context is derivable in that context. In \(\lambda_\to\), the set of permissible types was given beforehand. In \(\lambda2\), we restrict the types of the variables to the recursively defined \(\lambda2\) context. For the more complicated \(\lambda\underline{\omega\!}\,\), we combine it with the construction of the context proper.
The new approach is that we only extend a context with a declaration \(x:A\) only when the type \(A\) is permissible, which is either a type or a kind.
This rule plays a double role due to the two possibilities of \(s\).
When defining the variable rule, the type of the derivable term is restricted to level 2 and 3, since one prerequisite is that the type of the derivable term must be of type sort, which is level 4. The solution to this limitation is the addition of the weakening rule, which states the fact that the context of a judgment can be weakened by adding new declarations provided that the types of the new declarations are well-formed.
Assuming that we have derived the judgment \(\Gamma\vdash A:B\), then we may weaken the context by adding an arbitrary declaration which type is well-formed at the end. In fact, the Thinning Lemma still applies for \(\lambda\underline{\omega\!}\,\), which means insertion of new declaration is allowed at an arbitrary place, however the weakening rule is easy to express and turns out to be sufficient.
Now that we are able to derive for example \(\alpha:*,\beta:*\vdash\alpha:*\) by proof tree.
For the construction of typing statements in a context, we have the formation rule. In \(\lambda2\), the rule was based on a set \(\mathbb{T}2\) of \(\lambda2\) types. While types are more complex in \(\lambda\underline{\omega\!}\,\), the possibility of double roles makes things become easier.
The application and abstraction rules differ from the ones in the previous section because of the extended type in \(\lambda\underline{\omega\!}\,\) and the lack of \(\Pi\) types. Thus the type identifier is changed and a second premiss which the term must be well-formed is added.
The two rules both have a double role again, since \(s\in\{*,\square\}\).
We have applied beta reduction and conversation only on terms before. While the type becomes more complex in \(\lambda\underline{\omega\!}\,\), one will want an other derivation rule which acts on the types.
This rule is called the conversion rule, similar to the subject reduction but is intended for types.
The system \(\lambda\underline{\omega\!}\,\) satisfies the majority of the nice properties from the previous systems, while only the conversion rule requires a slight modification of the Uniqueness of Type Lemma, that types need no longer be literally unique, but they are unique up to conversation.
In the previous three chapters, we have met \(\lambda_\to\), the basic system which terms depending on terms; \(\lambda2\), terms depending on types; and \(\lambda\underline{\omega\!}\,\), types depending on types. In this chapter, we will treat the last extension \(\lambda P\) which types are dependent on terms.
A type depending on a term has general format \(\lambda x:A.M\), where \(M\) is a type, \(x\) is a term variable, and \(A\) is a type. This abstraction then depends on \(x\), and is often called type-valued function or simply type constructor.
For instance, let \(V_n=\{\langle v_1,\ldots,v_n\rangle\mid v_i\in\mathbb{N}\}\), a set of all natural number vectors of length \(n\), then \(\lambda n:nat.V_n\) maps \(n\) to the set of all vectors of length \(n\). It has type \(nat\to*\).
Furthermore, we can turn this type-valued function into a predicate. Take \(P_n\) to be
the proposition \(n\) is a prime number
, then \(\lambda
n:nat.P_n\) is the logical predicate to be a prime
. This is the PAT-interpretation introduced with
\(\lambda_\to\), the first step to a fruitful treatment of proof in formal logic and
mathematics.
The derivation rules of \(\lambda P\) has a great resemblance to the rules of \(\lambda\underline{\omega\!}\,\), so we present them directly.
The main differences with respect to \(\lambda\underline{\omega\!}\,\) are:
The formation rule is also called the product rule, since it enables the construction and typing of a \(\Pi\) type. Martin-Löf calls a \(\Pi\) type the cartesian product of a family of types. So \(\Pi\) types can both be seen as a generalization of the cartesian product (consider \(A=\{a_1,a_2\}\), then \(\Pi x:A.B\) is the same as \(B[x:=a_1]\times B[x:=a_2]\)) and as a generalization of the function space (if \(x\notin\mathcal F(B)\), then \(\Pi x:A.B\) is just \(A\to B\)).
We can code minimal predicate logic which only has implication and universal qualification as logic operations with basic entities propositions, sets and predicates over sets in \(\lambda P\).
We first investigate the coding of the basic entities of minimal predicate logic and apply the full PAT interpretation in the appropriate cases.
An example of minimal predication logic is given. Let \(S\) be a set and \(Q\) a binary predicate over \(S\), what we want to prove is \(\forall_{x\in S}\forall_{y\in S}(Q(x,y))\Rightarrow\forall_{u\in S}(Q(u,u))\). We define a local environment \(\Delta=\{S:*,Q:S\to S\to*\}\) for this.
This is identical to a term finding process, and the proposition is proved since we succeeded in finding an inhabitant for the type representing the proposition.