A brief discussion on design choices implementing DTalC.
One of the problems with the original Data Types a la Carte implementation,
was that the coproduct, (:+:), allowed arbitrary grouping of signatures,
e.g. (f :+: g) :+: g. However, injection only performed a linear search from
left to right, and so the subtyping relation could not be satisfied in cases
which were not right-associative, such as f :≺: (f :+: g) :+: h.
To remedy this, a list of signatures can be used instead.
For example, f :+: g would be written as Sig [f, g], using Sig below.
Here, fs is a list of the different signatures that make up the composite
signature Sig. The a is the type given to each f in fs.
(Side note: An alternative method to solve injection into non-right-associative signatures involves using a backtracking search, but using a list provides a simpler solution)
data Sig : (fs : List (Type -> Type)) -> (a : Type) -> Type where
Here : f a -> Sig (f :: fs) a
There : Sig fs a -> Sig (f :: fs) aInstead of InL and InR constructors for the coproduct, Sig uses Here
and There to describe how to construct a value of a signature. For example,
the DSL from Data Types a la Carte paper can be described as:
data Val k = V Int
data Add k = A k kUsing the standard definition of Fix, and the Here and There constructors,
DSLs containing both values and addition can be created. Below represents the
expression 1 + 2:
ex1 : Fix (Sig [Val, Add])
ex1 = In (There (Here (A x y))) where
x = In (Here (V 1))
y = In (Here (V 2))Clearly, this is very cumbersome to write, and so, like in the original paper,
it would be better to automate injection into some super-type. This will involve
finding the correct combination of There and Here.
Attempting to use interface instances, like in the original paper, such as those below to perform this linear-search requires overlapping instances, which Idris does not allow.
Therefore, another solution to this search is required. Fortunately, this has
already been solved in the standard library (see Elem in Data.List),
with a structure similar to Elem below.
The constructor H is a proof the signature f is at the front of the list
of signatures to be composed together, fs. The constructor T is a proof the
signature is after the front of the list.
data Elem : (f : Type -> Type) -> (fs : List (Type -> Type)) -> Type where
H : Elem f (f :: fs)
T : Elem f ys -> Elem f (y :: ys)Using this, an inj function can be created to construct a sequence of Here
and There:
inj' : Elem f fs -> f a -> Sig fs a
inj' H x = Here x
inj' (T t) x = There (inj' t x)However, this does not currently provide any advantage over manually using
Here and There, since inj' requires manually specifying H and T.
However, Idris' automatic proof search can be leveraged to have it automatically
find the correct combination of H and T:
inj : {auto prf : Elem f fs} -> f a -> Sig fs a
inj {prf=H} x = Here x
inj {prf=T t} x = There (inj {prf=t} x)For convenience, inj can be adapted to work with Fix:
inject : {auto prf : Elem f fs} -> f (Fix (Sig fs)) -> Fix (Sig fs)
inject = In . injPutting this together, smart constructors can be created for the example DSL.
Instead of a type constraint, Val :<: f, such as in the original paper,
Elem Val fs is used to prove that Val exists in fs:
val : {auto prf : Elem Val fs} -> Int -> Fix (Sig fs)
val x = inject (V x)
add : {auto prf : Elem Add fs} -> Fix (Sig fs) -> Fix (Sig fs) -> Fix (Sig fs)
add x y = inject (A x y)Putting this all together, DSLs containing values and addition can be succinctly represented:
ex2 : Fix (Sig [Val, Add])
ex2 = add (val 1) (add (val 2) (val 3))In the original paper, typeclasses provide a method to specify the semantics
of different pieces of syntax separately, by providing an algebra f a -> a
to fold over a fix tree Fix f.
In order to fold over a Fix tree using cata, a f needs to be a functor:
cata : Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . map (cata alg) . inopIn this implementation f is Sig fs, therefore, Sig fs needs to be a
functor. This could be done by modifying the definition of Here to be
Here : Functor f => f a -> Sig (f :: fs) a. However, in the implementation
so-far no functions have required f to be a functor and so this solution feels
sub-optimal.
Instead, the functor instance for Sig fs will be defined inductively such that
Sig fs will be a functor when each f in fs has its own definition of functor.
Firstly, where there are no signatures it is not possible to map over anything:
Functor (Sig []) where
map _ (Here _) impossible
map _ (There _) impossibleFor the base case, Here, map func x calls map : (a -> b) -> f a -> f b.
Whereas, in the recursive case, There, map func t calls
map : (a -> b) -> Sig fs a -> Sig fs b.
(Functor f, Functor (Sig fs)) => Functor (Sig (f :: fs)) where
map func (Here x) = Here (map func x)
map func (There t) = There (map func t)Continuing the example, by defining functor instances for both values and
addition, Sig [Val, Add] is also a functor.
Functor Val where
map f (V x) = V x
Functor Add where
map f (A x y) = A (f x) (f y)In order to define the algebra f a -> a for a signature f, an interface is
used to allow the algebra for different pieces of syntax to be defined separately.
interface Alg (f : Type -> Type) (a : Type) where
alg : f a -> aUsing the same technique used to define the Functor instance for Sig, the
Alg instance will be defined for Sig fs, provided each f in fs has its
own Alg instance. Again, when there is no syntax it is not possible to
use the algebras of any signatures:
Alg (Sig []) a where
alg (Here _) impossible
alg (There _) impossibleIn the base case, alg x uses alg : f a -> a. The recursive case uses
alg : Sig fs -> a.
(Alg f a, Alg (Sig fs) a) => Alg (Sig (f :: fs)) a where
alg (Here x) = alg x
alg (There x) = alg xTo continue the expression example, Alg instances can be created that
translate into a result integer:
Alg Val Int where
alg (V x) = x
Alg Add Int where
alg (A x y) = x + yGiven signatures which have an algebra to which converts their syntax to
integers, a calc function can be created which calculates the value of an
expression:
calc : (Functor (Sig fs), Alg (Sig fs) Int) => Fix (Sig fs) -> Int
calc = cata algThis demonstrates how Data Types a la Carte can be implemented in such as
way as to solve one of the original problems. This shows how the inability to
create overlapping instances is solved by using the Elem type, and how
automatic proof search can be used to automatically inject into a type.
A problem inherent with Data Types a la Carte is it inability to allow
different semantic domains to be easily composed. To extend the DSL presented
with exceptions, for example, would require reimplementing the interface
instances Alg Val Int and Alg Add Int to be Alg Val (Either Err Int) and
Alg Add (Either Err Int). One solution to this is Effect Handlers
which handle only specific parts of the syntax composed, instead of all the
syntax at once. Fortunately, this has already been implemented in Idris.