Coda's Blog | Set Habitation Problem

Set Habitation Problem

Representing infinite sets

In OCaml, one possible way to represent an infinite subset of a type 'a is as a predicate that always terminates:

type 'a set = 'a -> bool

This predicate is the characteristic function of the set. In that framework, a value s : 'a set is a function which tells us whether an element belongs to the set. We'll here only consider always terminating and deterministic functions.
For example, the set of all even integers can be written as:

let even : int set =
  fun n -> n mod 2 = 0

Set operations can also be written directly as operations on the output boolean: for example, union, intersection, complement are:

let union (a : 'x set) (b : 'x set) : 'x set =
  fun x -> a x || b x

let inter (a : 'x set) (b : 'x set) : 'x set =
  fun x -> a x && b x

let compl (a : 'x set) : 'x set =
  fun x -> not (a x)

The Set Habitation Problem

For a given type X, we will consider the Set Habitation Problem on X that test if a set is non-empty:

Given a s : X set, is there some x : X such that s x = true?

The Set Habitation Problem is as hard as the Set Emptyness Problem (which test if a set is empty), as solving one gives us an algorithm for the second one, just by negating the result. But since we'll show that when the habitation is decidable, such a x : X is actually computable, we'll mostly consider the Set Habitation Problem.
For example, if the type X is finite (a finite amouts of possible values), the Set Habitation Problem on X is decidable. Denote by $x_1,\ldots,x_n$ all the values of type X. Then we can write:

let is_nonempty_X (s : X set) : bool =
  s x1 || s x2 || ... || s xn

Infinite Types

Infinite types are really weird, and the decidability of the Set Emptiness Problem depends on the structure of the type. Some infinite types behave badly, some constructions preserve decidability, and function types, surprisingly, only care about the image set.

The `int` case

For int, there is no terminating algorithm which decides, for every predicate int set, whether it is nonempty.
Suppose by contradiction that such a function exists:

val is_nonempty_int : int set -> bool

We'll show that the halting problem would be decidable. For this, we consider the following program

let halting_solver (f : unit -> unit) : bool =
  let stopped_after_n_steps : int set =
    fun n -> is_stopped_after_n_steps f n
  in
  is_nonempty_int stopped_after_n_steps

Where

is_stopped_after_n_steps : (unit -> unit) -> int -> bool

is the computable function that tests whether a function f f terminates after at most n steps (or seconds) of computation. This function is computable by just simulating f for n steps (or seconds).
Then halting_solver f returns true iff stopped_after_n_steps is nonempty iff there exists some n such that f terminates after at most n steps, iff f terminates.
So halting_solver would decide the halting problem, which is a contradiction. Therefore:

The Set Habitation Problem on int is undecidable.

Product types, decidable case

If either A or B is empty, the set habitation problem on A * B is decidable (a set is always empty). We'll now suppose that both are not empty.
We'll show that the Set Habitation Problem on A * B is decidable iff the Set Habitation Problem is decidable on both A and B.
First, let's do the decidable case. Suppose we have two types A and B such that we can decide nonemptiness for sets of A and for sets of B:

val is_nonempty_A : A set -> bool
val is_nonempty_B : B set -> bool

Then we can decide nonemptiness for sets of pairs A * B using the following program :

let is_nonempty_product (s : ('a * 'b) set) : bool =
  is_nonempty_A (fun a ->
    is_nonempty_B (fun b ->
      s (a,b)))

The idea is that testing if $S \subseteq A \times B$ is empty is the same as asking whether there exists $a \in A$ such that there exists a $b \in B$ such than $(a,b) \in B$, and both "exists" are predicate over set.
Formally, $$ \begin{align*} S \neq \emptyset &\iff \{a \in A | \exists b \in B, (a,b) \in S\} \neq \emptyset \\ &\iff \{a \in A | \{b \in B | (a,b) \in S\} \neq \empty\} \neq \emptyset \\ &\iff \mathtt{is\_nonempty\_A} \{a \in A | \{b \in B | (a,b) \in S\} \neq \empty\} \\ &\iff \mathtt{is\_nonempty\_A} \{a \in A | \mathtt{is\_nonempty\_B} \{b \in B | (a,b) \in S\}\} \end{align*} $$

Product types, undecidable case

Since we consider non empty types $A,B$, let $a \in A$ and $b \in B$. Suppose that the set habitation problem is undecidable on A or on B. We'll show by contraposate that it is undecidable for A * B. Suppose that the habitation problem is decidable on A * B by a function val is_nonempty_AB : (A * B) set -> bool, and consider the following two programs :

let is_nonempty_A (setA: A set) : bool =
  is_nonempty_AB (fun (a,_) -> setA a)

let is_nonempty_A (setB: B set) : bool =
  is_nonempty_AB (fun (_,b) -> setB b)

Then, the set habitation problem is decidable on both A and B. Let $S \subseteq A$, we have $$ \begin{align*} S \neq \emptyset &\iff \{(a,b) \in A \times B | a \in A \} \neq \emptyset\\ &\iff \texttt{is\_nonempty\_AB} \{(a,b) \in A \times B | a \in A \} \end{align*} $$ Which is the set tested on the program. The same reasoning apply for $S' \subseteq B$.
This alongside the previous section gives us the proof of the following, for $A,B$ non empty types :

The Set Habitation Problem on A * B is decidable iff the Set Habitation Problem is decidable on both A and B.

Function types

Given two non-empty types A and B, we'll show that the Set Habitation Problem on A -> B is decidable iff it is for B. Surprisingly, it absolutly not depends on A

The example of `int set set`

The suprising result is that, although the Set Habitation Problem is undecidable on int set, it is decidable on int set set.
Let's first try to build an intuition on what a value of type int set set actually is. Here is an example of two sets :

let intsetset_ex1 is = if is 42 then if is 53 then false else is 78 && is 45 else false;;
let intsetset_ex2 is = if is 42 then if is 53 then false else is 78 && is 53 else false;;

Notice that intsetset_ex1 is a non-empty set as the function that maps $42 \mapsto \top; 53 \mapsto \bot; 78 \mapsto \top; 45 \mapsto \top;$ is inside, but intsetset_ex2 is empty as $f(53)$ need to both be true the first time it is called and false the second.
The intuition is that the decision tree of int set set might actually be computable. Indeed, a value of int set set is a function val set : (int -> bool) -> bool that take as input a function val f: (int -> bool) and evaluates it multiple times until it decide to return true or false. Consider its call tree : $$ \begin{array}{c} f(a_0) \\[0.8em] \begin{array}{ccccc} & \scriptstyle \top\swarrow & & \searrow\scriptstyle \bot & \end{array} \\[-0.2em] \begin{array}{ccccc} f(a_1) & & & & f(a_2) \end{array} \\[0.8em] \begin{array}{ccccccccc} \scriptstyle \top\swarrow & & \searrow\scriptstyle \bot & & & \scriptstyle \top\swarrow & & \searrow\scriptstyle \bot \end{array} \\[-0.2em] \begin{array}{ccccccccc} \dots & & \dots & & & & \dots & & \dots \end{array} \end{array} $$ Since we know that set is always terminating, it means that there is no infinite branch. Since the argument would only return true or false, it is a finite branching tree with no infinite branch, so it is a finite tree. We can therefore just map out this tree and search in it for a leaf with $\top$ !
But how can we explore this tree exaustively? Well, the idea is to explore it recursively with the following algorithm for val is_nonempty_intsetset iss:

If iss doesn't call its argument we have an empty tree : in this case the set is empty iff iss f (for any function) is true.
Otherwise, grab the first call made to f in iss f. Suppose it's $n \in \N$. We now construct two subset : one where $f(n) = \mathtt{true}$ for its first call to the argument and one where $f(n) = \mathtt{false}$. We can just evaluate recursively the or of "is the two non-empty".

This gives us the following program. The offset function take a function and a n and y and map $f(n) = y$ while leaving the rest unchanged. It assure that all arguments of the left (or right) subtree goes on the left (or right).

let offset f n0 y =
  fun n ->
    if n = n0 then y else f n
;;

let rec is_empty_intsetset (iss:int set set) : bool = 
  let exception FirstCall of int in
  let get_first_call = (fun x -> raise (FirstCall x)) in 
  try 
    iss get_first_call
  with FirstCall x ->
    (* Redefine two `iss` that offset the argument *)
    let subtree_left (f:int set) = iss (offset f x true) in
    let subtree_right (f:int set) = iss (offset f x false) in
    is_empty_intsetset subtree_left || is_empty_intsetset subtree_right;;

The functional uncomputable case

We show that if Set Emptiness Problem is uncomputablee on B then it also is on A -> B. Recall that both A and B are assumed to be non empty, and let val a : A be a value of A.
Suppose that the following program decide the Set Habitation Problem on A -> B:

val is_nonempty_AB : (A -> B) set -> bool

Then, consider the following program for the Set Habitation Problem on B:

let is_nonempty_B (b_set : B set) : bool =
   is_nonempty_AB (fun ab_set -> b_set (ab_set a))

The algorithm is correct as, for $S \subseteq B$, we have $$ S \neq \emptyset \iff \{ f : A \to B | f(a) \in B \} \neq \emptyset $$

In particular, because the Set Habitation Problem is undecidable on int, the type bool -> int or (int -> int) -> (int -> int) also has undecidable Set Habitation Problem.

The general decidable case

We'll try to generalise the int set set = (int -> bool) set case to a (A -> B) set where the Set Habitation Problem on B is decidable. Suppose that we have a function to decide the Set Habitation Problem on B:

val is_nonempty_B : B set -> bool

The main difference with the int set set case is the decision tree can branch infinitely and therefore be infinite. But we already have an oracle (is_nonempty_B) to test if a (possibly infinite) set of good branches to test is empty or not, so we can delegate the "exaustive search" to it !
Therefor, consider the following program:

let offset f a0 b =
  fun a ->
    if a = a0 then b else f a

let rec is_empty_AB (set_AB:(A -> B) set) : bool = 
  is_empty_B (fun b ->
    let exception FirstCall of A in
    let get_first_call = (fun a -> raise (FirstCall a)) in 
    try 
      set_AB get_first_call
    with FirstCall a ->
      let new_AB_set (f:A -> B) = set_AB (offset f a b)
      in is_empty_AB new_AB_set);;

If Set Emptiness is decidable for B, then Set Emptiness is decidable for A -> B.

Conclusion

The set Habitation problem being decidable depends on the structure of the type:

Type	Set Habitation Problem	Reason
Empty type `X`	Decidable	A set is always empty
Finite type `X`	Decidable	Try every element of `X`
`int` (also `string`, `float`)	Undecidable	The set $\{n \in \N \| f\space \mathrm{terminates\ in}\space n\space \mathrm{steps}\}$ is undecidable
`A * B`	Decidable iff `A` and `B` are decidable	Undecidability by projection, decidability by the set of $a$ such than the set of $b$ is non-empty
`A -> B`	Decidable iff `B` is decidable (for nonempty A)	Undecidability by the constant functions, decidability by a search of the important children of the call tree

Infinite objects are fake. Laziness is real.

References

The original haskell blog post I built upon
Thanks a lot to Tito Nguyen for making me discover this wonderful problem
Thanks also to Seo Jin for the moral support and help :D