OCaml's GADTs for proofs
<return to the blog>First, what is a GADT?
GADTs (or Generalized Algebraic Data Types) are a stronger form of algebraic data type (ADT).
But don’t get scared by the complicated name: an algebraic data type is just an enum — a type with a list of constructors, each possibly carrying some data.
ADTs are everywhere. From the option type to lists, they are one of the main selling points of functional programming languages.
Consider the definition of the option type in OCaml:
type 'a option =
| None
| Some of 'a
Here, the type 'a option is parameterized by a type 'a.
The constructors are:
None: of type'a optionSome: takes a value of type'aand returns a'a option
Notice something important: at the end, both constructors return the same type 'a option.
This is a restriction of standard ADTs.
But what if different constructors could return different types?
That is exactly what GADTs allow.
type _ option =
| None : unit option
| Some : 'a -> 'a option
Here, the underscore _ means that the type parameter is not fixed globally.
Each constructor can refine it independently:
Nonehas typeunit optionSomehas type'a -> 'a option
This means that the type now carries more precise information.
For example, there is no value of type int option corresponding to “empty”: the only empty value is None, which has type unit option.
How to abuse that?
OCaml is surprisingly permissive with GADTs: as long as constructors are typed using arrows that eventually return your type, it accepts them.
But since deductive rules are also a set of rules of the shape "if X then Y", we can encode simple proof systems using GADTs.
Let’s build a simple predicate le representing $\le$ on natural numbers.
We encode natural numbers using a unary encoding (nested pairs).
The rules are:
- Reflexivity: for all $x$, $x\le x$
- Successor: if $x \le y$ then $x \le y+1$
type (_,_) le =
| Refl : ('a,'a) le (* Read "'a <= 'a" *)
| Succ : ('a,'b) le -> ('a,('b * 'b)) le (* Read "'a <= 'b implies 'a <= 'b+1" *)
This is already a proof system:
values of type (a,b) le are proofs that $a \le b$.
For example, a proof that $1 \le 2$ would just be :
let one = ((),()) (* The type is a 1-nested tuple, representing the number 1 *) in
let one_le_two = Succ (Refl one) (* The type is (1,2) le *)
Building bigger proofs
Let’s now build a more interesting example: addition.
We define natural numbers and a predicate add such that
(x,y,z) add means $x + y = z$.
Types zero and 'a succ are only used to define a type that take 0 and 1 type parameter respectively.
We don't care about the constructor at all, and we could've used unit and list instead.
type zero = _Invisible1
type 'a succ = _Invisible2 of 'a
type _ is_nat =
| IsZ : zero is_nat
| IsS : 'a is_nat -> 'a succ is_nat
type (_,_,_) add =
| AddZ : (zero, 'b, 'b) add
| AddS : ('a, 'b, 'c) add -> ('a succ, 'b, 'c succ) add
The rules correspond exactly to the typical inductive definition of addition:
- $\forall b, 0+b=b$
- If $a+b=c$ then $(a+1)+b=c+1
We can now write functions that construct proofs using GADTs.
The add_Z lemma proove that $\forall x\in \mathbb{N}, x+0=x$ by induction (=recurrence) by matching on $x$.
The add_succ_r lemma proove that $\forall a,b,c \in \mathbb{N}, a+b=c \implies a+(b+1)=c+1$.
let rec add_Z : type x. x is_nat -> (x, zero, x) add = function
| IsZ -> AddZ
| IsS n -> AddS (add_Z n)
let rec add_succ_r : type a b c.
(a,b,c) add -> (a,b succ,c succ) add = function
| AddZ -> AddZ
| AddS p -> AddS (add_succ_r p)
And now, the fun part: proving commutativity.
let rec add_comm : type a b c.
a is_nat -> b is_nat -> (a,b,c) add -> (b,a,c) add
= fun a b p ->
match a, p with
| IsZ, AddZ ->
add_Z b
| IsS a', AddS p' ->
add_succ_r (add_comm a' b p')
Conclusion
GADTs let us encode logical systems directly in OCaml, but only as predicates.
Modules and functors might allow us to capture some form of type-level transformations, but without true type reduction or dependent types, this remains quite heavily constrained.
OCaml is definitely one of the object-oriented programming languages of all time.