January 2019

Simple, generic, type-safe substitution in Agda

Formalizing substitution in Agda can get tedious if you need a case for every term former in your language. In this post I show how to avoid this tedium by implementing substitution generically. However, the technique I present is limited to simple type systems for λ-calculi, and does not handle polymorphism, linearity, dependence, sequent calculi, or other fancy stuff.

For lack of a better name, I call this technique premise signatures. Previously, I also showed how to encode binders in Agda by using functions as contexts. In principle these techniques are independent, but they work well together, so I will assume you have read my previous post and use functions as contexts here without explanation.

I haven't seen anyone else use premise signatures. I'd love to know if I'm reinventing a well-known technique. It reminds me a little of logical frameworks like LF and Twelf, but premise signatures are much simpler and less powerful than LF's syntax representation.

Without further ado, let's read some Agda! If you want to follow along in your editor, you can find this entire blog post as an Agda file here. First, some boring set-up:

open import Data.Bool using (Bool; true; false; if_then_else_)
open import Data.Empty using (⊥)
open import Data.Sum using (_⊎_; inj₁; inj₂; [_,_])
open import Function using (id; _∘_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

infix  1 _⊢_ _⊩_ _⊢⋆_
infixr 2 _∧_
infixr 3 _▷_
infixr 4 _∷_ _∪_ _!_
infixr 5 _=>_ _,_
infixr 6 _×_ _+_

Our type language will be richer this time, to demonstrate that adding features to the language won't complicate our definition of substitution.

data Type : Set where
  void unit : Type
  _=>_ _×_ _+_ : (A B : Type) -> Type

We use functions as contexts:

Cx : Set1
Cx = Type -> Set

∅ : Cx
∅ _ = ⊥

_∪_ : (Γ Δ : Cx) → Cx
(Γ ∪ Δ) A = Γ A ⊎ Δ A

singleton : Type → Cx
singleton A B = A ≡ B

_∷_ : Type → Cx → Cx
A ∷ Γ = singleton A ∪ Γ

_⊇_ : (Γ Δ : Cx) → Set
Γ ⊇ Δ = ∀{a} → Δ a → Γ a

Now we come to the meat of the subject. A value of type Sig ─ a premise signature ─ represents the types of the arguments to a term former, or equivalently, the premises of a particular typing rule.

data Sig : Set1 where
  -- No arguments/premises.
  nil : Sig

  -- A pair of arguments or conjunction of premises.
  _∧_ : (P Q : Sig) → Sig

  -- A binder; a premise P beneath a context Γ of newly
  -- bound variables.
  _▷_ : (Γ : Cx) (P : Sig) → Sig

  -- A term of a given type.
  term : (A : Type) → Sig

Note that in term, we don't provide a context; the conclusion's typing context is implicitly threaded through all premises. This is the crucial choice that lets substitution be implemented generically. It's also why this particular definition of Sig can't handle linear or modal logics.

For example, let's consider the rules for pairs and functions.

Γ ⊢ M : A   Γ ⊢ N : B
────────────────────── pair
 Γ ⊢ (M,N) : A × B

 Γ,x:A ⊢ M : B
───────────────── lambda
Γ ⊢ λx.M : A → B

We can represent these rules' premises like so:

pair-premise λ-premise : (A B : Type) → Sig
pair-premise A B = term A ∧ term B  
λ-premise A B = singleton A ▷ term B

Note that we aren't representing whole rules here, just their premises. To represent a typing rule, we also need its conclusion. For our purposes, the conclusion of a typing rule is characterized by the type of the term it produces; the context is left implicit. We represent inference rules, or term formers, as constructors of the type P ⊩ A, where P is the premise signature and A the type of the conclusion, i.e. of the term formed. To demonstrate that our definition of substitution really is generic, we'll encode a moderately complex term syntax:

data _⊩_ : Sig → Type → Set where
  lam : ∀{A B} → singleton A ▷ term B ⊩ A => B
  app : ∀{A B} → term (A => B) ∧ term A ⊩ B
  pair : ∀{A B} → term A ∧ term B ⊩ A × B
  proj : ∀{A B} (left : Bool)
      → term (A × B) ⊩ (if left then A else B)
  inj : ∀{A B} (left : Bool)
      → term (if left then A else B) ⊩ A + B
  case : ∀{A B C}
       → term (A + B)
         ∧ singleton A ▷ term C
         ∧ singleton B ▷ term C
         ⊩ C
  duh : nil ⊩ unit            -- unit introduction
  wat : ∀{A} → term void ⊩ A  -- void elimination
  -- General recursion at any type, because why not?
  fix : ∀{A} → singleton A ▷ term A ⊩ A

Now, at last, we can define the type of well-typed terms in a given context. In fact, we'll do more than that: we'll define the type of well-typed arguments to a term former ─ in other words, of proofs of the premises of an inference rule.

data _⊢_ (Γ : Cx) : (P : Sig) → Set1 where
  -- It's easy to prove nothing.
  tt : Γ ⊢ nil

  -- To prove a pair of premises, prove them both.
  _,_ : ∀{P Q} → Γ ⊢ P → Γ ⊢ Q → Γ ⊢ P ∧ Q

  -- To prove something under a binder, introduce new
  -- variables to the context.
  bind : ∀{Δ P} → Δ ∪ Γ ⊢ P → Γ ⊢ Δ ▷ P

  -- To supply a term of type A, use a term former and
  -- prove its premises.
  _!_ : ∀{P A} (form : P ⊩ A) (args : Γ ⊢ P) → Γ ⊢ term A

  -- Or, you can use a variable from the context.
  var : ∀{A} → Γ A → Γ ⊢ term A

Let's see some examples of terms written in this syntax:

-- x : unit ⊢ x : unit
ex1-var : singleton unit ⊢ term unit
ex1-var = var refl

-- ∅ ⊢ λx.x : unit => unit
ex2-lam : ∅ ⊢ term (unit => unit)
ex2-lam = lam ! bind (var (inj₁ refl))

-- x : unit ⊢ (x,()) : unit × unit
ex3-pair : singleton unit ⊢ term (unit × unit)
ex3-pair = pair ! var refl , (duh ! tt)

-- ∅ ⊢ fix x. x : A
ex4-rec : (A : Type) → ∅ ⊢ term A
ex4-rec A = fix ! bind (var (inj₁ refl))

Okay, now that we have our notion of term syntax (or, I suppose, term former argument syntax), let's implement renaming for it. This follows the same pattern seen in my earlier article.

rename : ∀{Γ Δ P} → Γ ⊇ Δ → Δ ⊢ P → Γ ⊢ P
rename σ tt = tt
rename σ (M , N) = rename σ M , rename σ N
rename σ (bind M) = bind (rename (Data.Sum.map id σ) M)
rename σ (form ! M) = form ! rename σ M
rename σ (var x) = var (σ x)

Note that we never had to mention any of our term formers! Okay, but maybe that was just a fluke. Let's move on to simultaneous substitutions.

_⊢⋆_ : (Γ Δ : Cx) → Set1
Γ ⊢⋆ Δ = ∀{a : Type} → Δ a → Γ ⊢ term a

Again, we need a lemma for substituting under a binder that leaves the newly bound variable(s) alone:

∪/⊢⋆ : ∀{Γ Δ Ψ} → Γ ⊢⋆ Δ → Ψ ∪ Γ ⊢⋆ Ψ ∪ Δ
∪/⊢⋆ σ = [ var ∘ inj₁ , rename inj₂ ∘ σ ]

Finally, substitution:

subst : ∀{Γ Δ P} → Γ ⊢⋆ Δ → Δ ⊢ P → Γ ⊢ P
subst σ tt = tt
subst σ (P , Q) = subst σ P , subst σ Q
subst σ (bind P) = bind (subst (∪/⊢⋆ σ) P)
subst σ (form ! P) = form ! subst σ P
subst σ (var x) = σ x

As in rename, there is no mention of either types or term formers: we don't need individual cases for lam, app, pair, proj, fix, etc. This is a generic implementation of type-safe substitution for any simple type theory!

This generality should not be overstated, however. I have managed to extend this technique to handle a modal type theory similar to Pfenning-Davies S4, but not to polymorphic, linear, or dependent types. It's also specific to λ-calculi (or natural deduction style) rather than sequent calculi; it cannot represent rules that analyse the types in the context.