Library GenuineShiftReset

The Proved Program of The Month - Type-safe printf via delimited continuations.



In this development we define the generalized monad typeclass and one particular instance: the continuation monad with variable input and output types. The latter lets us define shift and reset delimited control operators with effect typing, answer-type modification, and polymorphism. As an application of these operators, we build a type-safe sprintf.

This is joint fun with Oleg Kiselyov who implemented this in Haskell (http://okmij.org/ftp/Haskell/ShiftResetGenuine.hs, http://www.haskell.org/pipermail/haskell/2007-December/020034.html)

which itself is based on the lambda-sr-let calculus presented in:

AK07, Polymorphic delimited continuations, Asai and Kameyama, APLAS 2007, http://logic.cs.tsukuba.ac.jp/~kam/paper/aplas07.pdf

The type-safe sprintf and its typing problems (resolved in this development) are described by Kenichi Asai in http://pllab.is.ocha.ac.jp/~asai/papers/tr07-1.ps.gz

Brought to Coq and documented by Matthieu Sozeau(at lri.fr). Reviewed and corrected by Oleg Kiselyov.

This file can be processed using the latest coq version available (svn checkout svn://scm.gforge.inria.fr/svn/coq/trunk) and requires first to get and compile the Haskell prelude port available at: darcs get http://www.lri.fr/~sozeau/repos/coq/monads

Life is much easier without arguments getting in the way !

Set Implicit Arguments.
Unset Strict Implicit.

The generalized monad GMonad

A generalized monad (or parameterized monad, or indexed monad) is a monad parameterized by two types, input and output that can change during computations. However two monadic computations compose only if the output of the first one is the input of the other one. The laws are just the same.

In more detail, the generalized monads are described in Robert Atkey, Parameterised Notions of Computation, Msfp2006 http://homepages.inf.ed.ac.uk/ratkey/param-notions.pdf see also http://www.haskell.org/pipermail/haskell/2006-December/018917.html

Class GMonad (m : Type -> Type -> Type -> Type) :=
  gunit : forall {i a} (x : a), m i i a ;

  gbind : forall {i o' o a b}, m o' o a -> (a -> m i o' b) -> m i o b ;

  gbind_gunit_left : forall i o a b (x : a) (f : a -> m i o b),
    gbind (gunit x) f = f x ;

  gbind_gunit_right : forall i o a (x : m i o a), gbind x gunit = x ;

  gbind_assoc : forall i o' o'' o a b c (x : m o'' o a) (f : a -> m o' o'' b) (g : b -> m i o' c),
    gbind x (fun a : a => gbind (f a) g) = gbind (gbind x f) g.

We define overloaded notations for all generalized monads, resembling the usual monad notations.

Notation "T >>== U" := (gbind T U) (at level 20).
Notation "x <-- T ;; E" := (gbind T (fun x : _ => E)) (at level 30, right associativity).
Notation "'ret' t" := (gunit t) (at level 20).

The join of a generalized monad.

Definition gjoin [ GMonad m ] {I O O' A B} (e1 : m O' O (A -> m I O' B)) (x : A) : m I O B :=
  f <-- e1 ;; f x.

We can trivially embed any monad in a general monad, ignoring the extra input/output types. The proofs of the laws reduce to the proofs for the monad, hence they are automatically solved.

Require Import Prelude.Monad.

Program Instance [ Monad m ] => monad_gmonad : GMonad (fun input output => m) :=
  gunit i a x := unit x ;
  gbind i out out' a b m f := bind m f.

The continuation monad cont.

A cont I O A object is a function, which takes a continuation with the answer-type I consuming objects of the type A. The result, the final answer, is an object of the type O. In general, I is not the same as O, hence our cont I O A permits answer-type modification. See AK07 for more explanations. The continuation monad is fully polymorphic in the object, input and output type, hence the "final answer" type O of the continuation is not fixed, and neither is I. We will define combinators that are indeed instantiated at different types in a single expression.

In lambda-sr-let calculus of AK07, a cont I O A is the type of terms t whose evaluation may incur a control effect. The types appear in judgments of the form G; I |- t : A; O. The judgment asserts that the term t has type A and that the answer type is transformed from I to O by t. The correspondence to Danvy and Filinski's function space with answer types is as follows: a function A / I -> B / O is represented here as (A -> cont I O I) -> cont B B O. This is the type of functions from A to B, which also transform the answer type of the whole term in its context from the type I to the type O. Frequently A is instantiated with (), hence we can change the notation to: I ---> B / O for computations which produce B objects while transforming the answer type from I to O. We'll see how it gets instantiated for sprintf later on.

Definition cont I O A := (A -> I) -> O.

Notation " { I ---> B | O } " := (cont I O I -> cont B B O) (at level 50) : type_scope.

GMonad cont

The type constructor cont is a generalized monad, as shown below. The implementation is the same as the continuation monad, only the types of the operations are more general.

Program Instance cont_gmonad : GMonad cont :=
  gunit I A x := fun k => k x ;
  gbind I O' O A B m f := fun k => m (fun a => f a k).

  Next Obligation.
  Proof.
    extensionality k.
    reflexivity.
  Qed.

  Next Obligation.
  Proof.
    extensionality k.
    f_equal.
    rewrite eta_expansion.
    reflexivity.
  Qed.

shift and reset

The monad lets us implement delimited control operators shift and reset with answer-type modifications and polymorphism. The operators can realize complex control structures. The types of these combinators implement the rules of the calculus as defined in AK07 (fig. 3).

The reset combinator takes a computation of the type sigma modifying the answer-type from sigma to tau. The combinator yields a pure computation of the type tau: That computation has no control effects, hence both the input and output answer types are the same, a, and it is polymorphic over a.

Definition reset {sigma tau a} (c : cont sigma tau sigma) : cont a a tau :=
  fun k => k (c id).

The shift combinator is third-order, hence not too easy to explain... shift f captures the continuation f and gives back a continuation producing a b.

Definition shift {tau t a s b} (f : (tau -> cont t t a) -> cont s b s) : cont a b tau :=
  fun k => f (fun tau => gunit (k tau)) id.

The run function actually runs a continuation using the id continuation to start with.

Definition run {T} (k : cont T T T) : T := k id.

Examples

Using these combinators we can define all examples of the AK07 paper.

appnd

First we define appnd, given section 2.1, which returns a continuation to append a list to another list.

Require Import Coq.Lists.List.

Returns a continuation which takes a function from list A to T to produce a function from list A to cont T' T' T for any T, T'.

Fixpoint appnd {A T T'} (l : list A) : cont T (list A -> cont T' T' T) (list A) :=
  match l with
    | [] => shift (fun k => gunit k)
    | hd :: tl => appnd tl >>== fun tl' => ret (hd :: tl')
  end.

The values are usually polymorphic on the input and output types so that we can plug them to other monadic expressions. However the type of the result term is specified: it's a function from lists of A to continuations producing lists of A.

Example appnd123 {A T} : cont A A (list nat -> cont T T (list nat)) :=
  reset (appnd [1;2;3]).

We can look at the actual computation defined by this expression using compute to get the normal form of a term.

Eval compute in @appnd123.

The notation convertible x y builds a proof of reflexivity of equality of x and asserts that this is a proof of x = y. This will typecheck only if x and y are actually convertible. It is useful when used together with Check which just typechecks a term. A Check (convertible x y) command succeeds if and only if x and y are convertible.

Notation " 'convertible' x y " := ((refl_equal x) : x = y) (at level 0, x at next level, y at next level).

The compute command gave us the term on the rhs here.

Check (convertible @appnd123
  fun (A T : Type) (k : (list nat -> (list nat -> T) -> T) -> A) =>
    k (fun (tau : list nat) (k0 : list nat -> T) =>
      k0 (1 :: 2 :: 3 :: tau))).

Running the computation and applying the continuation to [4;5;6] actually finishes the computation and gives [1;2;3;4;5;6] as a final result.

Example test1 : list nat := run (appnd123 >>== (fun k => k [4;5;6])).

Check (convertible test1 ([1;2;3;4;5;6])).

The sprintf function (sec 2.3 of AK07)

The paper argues this test requires both the changing of the answer type and polymorphism (fmt is used polymorphically in stest3). We first need to define some operations to build formats.

We define show functions for nat and string using the existing Show typeclass.

Require Import Prelude.Show.

Definition pr_int : nat -> string := show.
Definition pr_str : string -> string := id.

Then we define concatenation of format specifiers, using do-notation. This is simply lifting the concatenation of strings to the cont monad. We use the infix notation ^^ for format concatenation and ## for format application, which is in fact the join of the cont monad.

Definition conc {I O O'} (e1 : cont O' O string) (e2 : cont I O' string) :
  cont I O string :=
  x <-- e1 ;; y <-- e2 ;; ret (x +++ y).

Infix "^^" := conc (right associativity, at level 70).
Infix "##" := gjoin (at level 60).

The fmt combinator takes a function from A to strings and builds a computation producing a string and changing the answer type from B to A -> cont T T B for any B, T. The computation corresponding to the output answer-type takes an A, transforms it to a string using to_str and passes the result to the current continuation k.

Definition fmt {A B} (to_str : A -> string) {T} : cont B (A -> cont T T B) string :=
  shift (fun k => ret (k ∘ to_str)).

For sprintf, the continuation c must build a string in the end and takes the result elaborated to this point (of type string) as an argument. However, the output type T of the whole sprintf computation is polymorphic, depending on how its argument modifies the answer type. It will become a function type if we're using formats. We can use the notation I -> B / O of Danvy & Filinski to describe the type of sprintf. It is simply an instance of the more general reset combinator. Intuitively, it will capture the continuations built using the shift operator inside fmt and return a function which must be applied to objects of the right type to be able to construct a runnable cont object.

Definition sprintf T : { string ---> string | T } := reset.

We can now start writing examples uses of sprintf. We open the string scope to be able to write strings using the common double quote notation. There are implicit conversion going on between the string type defined in Prelude.Show as list ascii and the isomorphic string datatype defined in String. We force the coercion using the str combinator here.

Definition str (s : string) {I} : cont I I string := ret s.

Open Scope string_scope.

This example shows the type of reset applied to a cont object at a polymorphic type A. This continuation can be plugged in other computations because of this polymorphism.

Definition polymorphic_reset A : cont A A string := reset (a:=A) (str "Hello world").

On the other hand, sprintf defines a restriction of the reset operation, and the only thing we can do with an sprintf result is run it because the answer type is now fixed.

Definition monomorphic_reset : cont string string string := sprintf (str "Hello world").

Lets run it.

Example hello_world : string := run (sprintf (str "Hello world")).

Check (convertible hello_world "Hello world").

More interestingly, we can use format strings which actually change the answer type.

Example hello_world_fmt : string :=
  run (sprintf (str"Hello " ^^ fmt pr_str ^^ str"!") ## "world").

Check (convertible hello_world_fmt "Hello world!").

Let's decompose that expression into its subexpressions, everything is first class here: the format components, the corresponding function and the arguments. First, the fmt operator applied to a conversion function gives:

Definition fmt_str {T A} := fmt pr_str (T:=T) (B:=A).

An interesting point here is that we are forced to generalize the types by ourselves here. Indeed in Haskell one would not need to add the type abstractions for T and A nor the type annotations for fmt specifying the use of these types. The unbound type variables would be implicitly generalized in put in front of the type. However, we can do without the explicit instantiation using the (x:=t) syntax if we add a type signature to the definition instead.

Definition fmt_int {T A} : cont A (nat -> cont T T A) string := fmt pr_int.

We could even have a default behavior to implicitly generalize unbound variables appearing in the type of a definition to get rid of the `T A` binders, but that's a very fragile notion when the namespace of type variables is the global namespace of defined objects as in Coq (or other dependently-typed programming languages). Compare the following:

Example aformat {A T} :=
  str "The value of " ^^ fmt (T:=T) pr_str ^^ str " is " ^^ fmt (T:=T) (B:=A) pr_int.

Example aformat_with_spec {A T} : cont A (string -> cont T T (nat -> cont T T A)) string :=
  str "The value of " ^^ fmt pr_str ^^ str " is " ^^ fmt pr_int.

Clearly, in this style of putting a typing constraint, the types will get too big and writing them will be a problem. Although the explicit types may be helpful to the type-checker, they are less and less helpful to us as they become unreadable. However there is some regularity here, which we can take advantage of. Indeed the only relevant part of a format type is the types which appear on the left of the arrows. I.e., the types of the holes of the format string. Hence, we can build a type function which given a list of types constructs the corresponding type specification. This is a regular fixpoint definition in Coq.

Fixpoint format_type A T (l : list Set) : Type :=
  match l with
    | [] => A
    | hd :: tl => hd -> cont T T (format_type A T tl)
  end.

Then we can wrap it inside a toplevel cont constructor which gives a string.

Definition format_spec l := forall A T, cont A (format_type A T l) string.

We can check that it does the right thing for the aformat example.

Check (convertible (format_spec [string;nat])
  (forall A T, cont A (string -> cont T T (nat -> cont T T A)) string)).

Hence we can define formats like this:

Example aformat_with_format_spec : format_spec [string; nat] :=
  fun _ _ =>
  str "The value of " ^^ fmt pr_str ^^ str " is " ^^ fmt pr_int.

Note that we still have to abstract by A and T in the format, but we can make these arguments implicit. This permits to apply the format more easily, A and T will get instantiated by the context of application (e.g. to string if sprintf is applied).

Implicit Arguments aformat_with_format_spec [[A] [T]].

Let's use that to define a complex sprintf expression:

Example value_of : string :=
    run ((sprintf aformat_with_format_spec ## "x" ## 3)).

Check (convertible value_of "The value of x is 3").

That's all folks!