Bound

This talk covers a novel approach to "name binding" in syntax trees for programming languages that makes it much easier to write compilers and interpreters with a higher degree of assurance.

Published in: Technology

  1. 1. Making de Bruijn Succ Less (Edward Kmett)
  2. 2. We use names in lots of contexts, but any program that deals with names has to deal with a number of issues, such as:
        - capture avoidance
        - deciding alpha equivalence
        - … and others that will come up as we go.
  3. 3. The dumbest thing that could possibly work:
            type Name = String
            data Exp = Var Name | Exp :@ Exp | Lam Name Exp

            Var "x"
            Lam "x" (Var "x")
            Lam "x" (Lam "y" (Var "x"))
  4. 4. Blindly substituting
            Lam "x" (Var "y")
        into
            Lam "y" (Var "z")
        for "z" would yield
            Lam "y" (Lam "x" (Var "y"))
        which now causes the free variable to reference the "y" bound by the outer lambda.
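
The capture on this slide can be reproduced with a deliberately naive substitution. This is a minimal sketch assuming the named `Exp` type from slide 3; `naiveSubst` and `captured` are hypothetical names introduced here, not code from the talk.

```haskell
type Name = String

infixl 9 :@
data Exp = Var Name | Exp :@ Exp | Lam Name Exp
  deriving (Eq, Show)

-- Replace free occurrences of x with s, making no attempt to avoid capture.
naiveSubst :: Name -> Exp -> Exp -> Exp
naiveSubst x s = go
  where
    go e@(Var v)
      | v == x    = s
      | otherwise = e
    go (f :@ a) = go f :@ go a
    go e@(Lam v b)
      | v == x    = e             -- x is shadowed below this binder
      | otherwise = Lam v (go b)  -- deliberate bug: v may capture a free variable of s

-- Substituting (Lam "x" (Var "y")) for "z" under a binder named "y".
captured :: Exp
captured = naiveSubst "z" (Lam "x" (Var "y")) (Lam "y" (Var "z"))
```

Here `captured` evaluates to `Lam "y" (Lam "x" (Var "y"))`: the `"y"` that was free in the substituted term is now bound by the outer lambda.
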
  5. 5. Lam "x" (Var "x") and Lam "y" (Var "y") both mean the same thing, and it'd be nice to be able to check this easily, make them hash the same way for CSE, etc.
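
To see why this check is not free with named binders, here is one possible alpha-equivalence test for the named `Exp` type of slide 3. It is a sketch under the assumption that pairing up binder names in an environment is acceptable; `alphaEq` is a hypothetical helper, not something shown in the talk.

```haskell
type Name = String

infixl 9 :@
data Exp = Var Name | Exp :@ Exp | Lam Name Exp

-- Walk both terms in lockstep, remembering which binder on the left
-- corresponds to which binder on the right (innermost first).
alphaEq :: Exp -> Exp -> Bool
alphaEq = go []
  where
    go env (Var x) (Var y) =
      case filter (\(a, b) -> a == x || b == y) env of
        (a, b) : _ -> a == x && b == y  -- bound: must refer to the same binder pair
        []         -> x == y            -- free on both sides: names must agree
    go env (Lam x b) (Lam y c) = go ((x, y) : env) b c
    go env (f :@ a) (g :@ b)   = go env f g && go env a b
    go _ _ _                   = False
```

With de Bruijn-style representations this whole function collapses to a derived `(==)`, which is the point the later slides build toward.
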
  6. 6. There is a cottage industry of solutions to the naming problem:
        - Naïve substitution
        - Barendregt Convention
        - HOAS
        - Weak HOAS / PHOAS
        - “I am not a Number: I am a Free Variable!” Locally Nameless Syntax with de Bruijn Indices
        - Unbound, mixing Barendregt and Locally Nameless
        - etc.
        I will not be addressing all of these here, just a few.
  7. 7. Just go look for names that avoid capture.
        Pros:
        - Pretty syntax trees
        - Easy to get started with
        Cons:
        - Easy even for experts to make mistakes!
        - Alpha equivalence checking is tedious.
        - REALLY SLOW
  8. 8. subst :: Name -> Exp -> Exp -> Exp
        subst x s b = sub b
          where
            sub e@(Var v)
              | v == x    = s
              | otherwise = e
            sub e@(Lam v e')
              | v == x       = e
              | v `elem` fvs = Lam v' (sub e'')
              | otherwise    = Lam v (sub e')
              where
                v'  = newId vs
                e'' = subst v (Var v') e'
            sub (f :@ a) = sub f :@ sub a
            fvs = freeVars s
            vs  = fvs `union` allVars b

        newId :: [Name] -> Name
        newId vs = head (someEnormousPoolOfNames \\ vs)  -- go find a name that isn't taken!

        (based on code by Lennart Augustsson)
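
The slide leaves `freeVars`, `allVars`, and the name pool undefined. The sketch below fills them in so the substitution can be loaded and tried. The helper definitions and the `v0, v1, ...` pool are assumptions made for illustration, not the definitions used in the talk or in Augustsson's code.

```haskell
import Data.List (union)

type Name = String

infixl 9 :@
data Exp = Var Name | Exp :@ Exp | Lam Name Exp
  deriving (Eq, Show)

freeVars :: Exp -> [Name]
freeVars (Var v)   = [v]
freeVars (f :@ a)  = freeVars f `union` freeVars a
freeVars (Lam v e) = filter (/= v) (freeVars e)

allVars :: Exp -> [Name]
allVars (Var v)   = [v]
allVars (f :@ a)  = allVars f `union` allVars a
allVars (Lam v e) = [v] `union` allVars e

-- Hypothetical name pool v0, v1, v2, ...; pick the first name not in use.
newId :: [Name] -> Name
newId vs = head (filter (`notElem` vs) pool)
  where pool = ["v" ++ show n | n <- [(0 :: Int) ..]]

-- Capture-avoiding substitution of s for x in b.
subst :: Name -> Exp -> Exp -> Exp
subst x s b = sub b
  where
    sub e@(Var v)
      | v == x    = s
      | otherwise = e
    sub e@(Lam v e')
      | v == x       = e                  -- x is shadowed
      | v `elem` fvs = Lam v' (sub e'')   -- rename the binder first
      | otherwise    = Lam v (sub e')
      where
        v'  = newId vs
        e'' = subst v (Var v') e'
    sub (f :@ a) = sub f :@ sub a
    fvs = freeVars s
    vs  = fvs `union` allVars b
```

On the example from slide 4, `subst "z" (Lam "x" (Var "y")) (Lam "y" (Var "z"))` renames the outer binder, so the substituted `"y"` stays free. Every substitution recomputes variable sets and may rename, which is where the "REALLY SLOW" complaint comes from.
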
  9. 9. Make sure that every binder binds a globally unique name.
        Pros:
        - “Secrets of the GHC Inliner” describes ‘the Rapier’, which can make this fast.
        Cons:
        - Easy even for experts to screw up
        - Alpha equivalence is tedious
        - Need a globally unique variable supply (e.g. my concurrent-supply)
        - The obvious implementation technique chews through a scarily large number of variable IDs.
  10. 10. Borrow substitution from the host language!
              data Exp a
                = Var a
                | Lam (Exp a -> Exp a)
                | Exp a :@ Exp a
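
As a rough illustration of what "borrowing substitution" means, the sketch below evaluates a HOAS term to weak head normal form using nothing but host-language function application. The `whnf`, `constK`, and `result` names are hypothetical and only serve this example.

```haskell
infixl 9 :@
data Exp a = Var a | Lam (Exp a -> Exp a) | Exp a :@ Exp a

-- Beta reduction is just applying the Haskell function stored in Lam.
whnf :: Exp a -> Exp a
whnf (f :@ a) = case whnf f of
  Lam k -> whnf (k a)
  f'    -> f' :@ a
whnf e = e

-- \x -> \_ -> x, written with host-language lambdas.
constK :: Exp a
constK = Lam (\x -> Lam (\_ -> x))

result :: Exp String
result = whnf (constK :@ Var "a" :@ Var "b")
```

Note that `Exp` cannot derive `Eq` or `Show` here because of the function inside `Lam`, which hints at why working under binders and checking alpha equivalence are awkward in this style.
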
  11. 11. Pros:
          - Provides _really_ fast substitution
          Cons:
          - Doesn't work in theorem provers (Exp occurs in negative position)
          - Hard to work under binders!
          - Exotic terms
          - Alpha equivalence checking is tedious
          Variants such as Weak HOAS/PHOAS exist to address some of these issues at the expense of other problems.
  12. 12. “M’colleague Bob Atkey once memorably described the capacity to put up with de Bruijn indices as a Cylon detector, the kind of reverse Turing Test that the humans in Battlestar Galactica invent, the better to recognize one another by their common inadequacies. He had a point.” (Conor McBride, “I am not a number, I am a classy hack”)
  13. 13. Split variables into Bound and Free.
              data Exp a
                = Free a
                | Bound !Int
                | Exp a :@ Exp a
                | Lam (Exp a)
          Bound variables reference the variable being bound by the lambda n lambdas out. Substitution has to renumber all the variables.
              abstract    :: Eq a => a -> Exp a -> Exp a
              instantiate :: Exp a -> Exp a -> Exp a
  14. 14. Split variables into Bound and Free.
              newtype Scope f a = Scope (f a)
              data Exp a
                = Free a
                | Bound !Int
                | Exp a :@ Exp a
                | Lam (Scope Exp a)
          Bound variables reference the variable being bound by the lambda n lambdas out. Substitution has to renumber all the variables.
              abstract    :: Eq a => a -> Exp a -> Scope Exp a
              instantiate :: Exp a -> Scope Exp a -> Exp a
  15. 15. abstract :: Eq a => a -> Exp a -> Scope Exp a
          abstract me expr = Scope (letmeB 0 expr)
            where
              letmeB this (Free you)
                | you == me = Bound this
                | otherwise = Free you
              letmeB this (Bound that)       = Bound that
              letmeB this (fun :@ arg)       = letmeB this fun :@ letmeB this arg
              letmeB this (Lam (Scope body)) = Lam (Scope (letmeB (succ this) body))

          (Based on code by Conor McBride from “I am not a number: I am a free variable”)
  16. 16. instantiate :: Exp a -> Scope Exp a -> Exp a
          instantiate what (Scope body) = whatsB 0 body
            where
              whatsB this (Bound that)
                | this == that = what
                | otherwise    = Bound that
              whatsB this (Free you)         = Free you
              whatsB this (fun :@ arg)       = whatsB this fun :@ whatsB this arg
              whatsB this (Lam (Scope body)) = Lam (Scope (whatsB (succ this) body))

          (Based on code by Conor McBride from “I am not a number: I am a free variable”)
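
The two functions above can be tried together in a small self-contained sketch. To keep the derived `Eq` and `Show` instances trivial, `Scope` is specialised to `Exp` here instead of being parameterised over `f`; that simplification, the shortened `go` helper names, and the `lam` smart constructor are assumptions of this example.

```haskell
infixl 9 :@
data Exp a = Free a | Bound !Int | Exp a :@ Exp a | Lam (Scope a)
  deriving (Eq, Show)

-- Specialised to Exp (the slides use Scope f a) so deriving stays simple.
newtype Scope a = Scope (Exp a)
  deriving (Eq, Show)

-- Turn the free variable `me` into a bound index, counting binders crossed.
abstract :: Eq a => a -> Exp a -> Scope a
abstract me expr = Scope (go 0 expr)
  where
    go this (Free you)
      | you == me = Bound this
      | otherwise = Free you
    go _    (Bound that)       = Bound that
    go this (f :@ a)           = go this f :@ go this a
    go this (Lam (Scope body)) = Lam (Scope (go (succ this) body))

-- Replace the variable bound by this scope with `what`.
instantiate :: Exp a -> Scope a -> Exp a
instantiate what (Scope body) = go 0 body
  where
    go this (Bound that)
      | this == that = what
      | otherwise    = Bound that
    go _    (Free you)      = Free you
    go this (f :@ a)        = go this f :@ go this a
    go this (Lam (Scope b)) = Lam (Scope (go (succ this) b))

-- Hypothetical smart constructor: bind a name, producing a lambda.
lam :: Eq a => a -> Exp a -> Exp a
lam x = Lam . abstract x
```

With this representation `lam "x" (Free "x") == lam "y" (Free "y")` holds by plain structural equality, and `instantiate` undoes `abstract`.
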
  17. 17. newtype Scope f a = Scope (f a)
          data Exp a
            = Free a
            | Bound !Int
            | Exp a :@ Exp a
            | Lam (Scope Exp a)
            deriving (Functor, Foldable, Traversable)

          We can make an instance of Monad for Exp, but it is an awkward one-off experience.
  18. 18. Pros:
          - Scope, abstract, and instantiate make it harder to screw up walking under binders.
          - Alpha equivalence is just (==)
          - We can make a Monad for Exp.
          - We can use Traversable to find free variables, close terms, etc.
          Cons:
          - This succ's a lot. (Slow)
          - Illegal terms such as Lam (Scope (Bound 2))
          - Have to define abstract/instantiate for each type.
          - The Monad for Exp is a one-off deal.
  19. 19. data Exp a
            = Var a
            | Exp a :@ Exp a
            | Lam (Exp (Maybe a))

          (based on Bird and Paterson)
  20. 20. data Incr a = Z | S a
          data Exp a
            = Var a
            | Exp a :@ Exp a
            | Lam (Exp (Incr a))

          (based on Bird and Paterson)
  21. 21. data Incr a = Z | S a
          newtype Scope f a = Scope (f (Incr a))
          data Exp a
            = Var a
            | Exp a :@ Exp a
            | Lam (Scope Exp a)

          instance MonadTrans Scope where
            lift = Scope . fmap S

          -- Scope is just MaybeT, a monad transformer in its own right, but lift is slow.
  22. 22. instance Monad Exp where
            return = Var
            Var a    >>= f = f a
            (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
            Lam b    >>= f = Lam (b >>= lift . f)

          You can derive Foldable and Traversable. Then Data.Foldable.toList can obtain the free variables in a term, and (>>=) does capture-avoiding substitution!
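
Both claims on this slide can be checked with a small sketch using the nested `Exp (Incr a)` representation from slide 20 directly, without the `Scope` newtype. The `lam` helper, the inlined `liftVar` function (standing in for `lift . f`), and the two example terms are assumptions introduced for this illustration.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
import Control.Monad (ap)
import Data.Foldable (toList)

data Incr a = Z | S a
  deriving (Eq, Show, Functor, Foldable, Traversable)

infixl 9 :@
data Exp a = Var a | Exp a :@ Exp a | Lam (Exp (Incr a))
  deriving (Eq, Show, Functor, Foldable, Traversable)

instance Applicative Exp where
  pure  = Var
  (<*>) = ap

instance Monad Exp where
  Var a    >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
  Lam b    >>= f = Lam (b >>= liftVar)
    where
      liftVar Z     = Var Z           -- the bound variable is left alone
      liftVar (S a) = fmap S (f a)    -- O(n) walk: every variable gets an S

-- Hypothetical smart constructor: bind the name x in body.
lam :: Eq a => a -> Exp a -> Exp a
lam x body = Lam (fmap (\y -> if y == x then Z else S y) body)

example :: Exp String
example = lam "x" (Var "x" :@ Var "y")

-- Substitute the free "y" with a term mentioning a free "x".
substituted :: Exp String
substituted = example >>= \v -> if v == "y" then Var "x" else Var v
```

Here `toList example` lists only the free variable `"y"`, and `substituted` is equal to `lam "z" (Var "z" :@ Var "x")`, so the incoming `"x"` is not captured by the binder. The `fmap S` in `liftVar` is the per-variable successor that the next slides try to avoid.
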
  23. 23. Pros:
          - The Monad is easy to define
          - Foldable/Traversable for free variables
          - Capture-avoiding substitution for free
          Cons:
          - It still succs a lot. lift is O(n).
  24. 24. If we could succ an entire expression instead of each individual variable, we would succ less.
          Instantiation wouldn't have to walk into that expression at all, and we could lift an Exp into Scope in O(1) instead of O(n).
          This requires polymorphic recursion, but we support that. Go Haskell!
          This is the ‘generalized de Bruijn’ as described by Bird and Paterson without the rank-2 types mucking up the description and abstracted into a monad transformer.
  25. 25. data Incr a = Z | S a
          newtype Scope f a = Scope { unscope :: f (Incr (f a)) }

          instance Monad f => Monad (Scope f) where
            return = Scope . return . S . return
            Scope e >>= f = Scope $ e >>= \v -> case v of
              Z    -> return Z
              S ea -> ea >>= unscope . f

          instance MonadTrans Scope where
            lift = Scope . return . S
  26. 26. Pros:
          - The Monad is easy to define
          - Foldable/Traversable for free variables
          - Capture-avoiding substitution for free
          Cons:
          - Alpha equivalence is slightly harder, because you have to quotient out the position of the ‘Succ’s.
  27. 27. abstract :: (Monad f, Eq a) => a -> f a -> Scope f a
          abstract x e = Scope (liftM k e)
            where
              k y | x == y    = Z
                  | otherwise = S (return y)

          instantiate :: Monad f => f a -> Scope f a -> f a
          instantiate r (Scope e) = e >>= \v -> case v of
            Z   -> r
            S a -> a

          We can define these operations once and for all, independent of our expression type!
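
A compact way to try the generalized de Bruijn `Scope` is to plug it into a three-constructor `Exp`. The sketch below is an illustration under several assumptions: `bindScope` and `liftScope` are hypothetical stand-ins for the `Monad`/`MonadTrans` instances of slide 25, the `Eq` instances are structural (they do not quotient out the position of the `S` constructors), and the standalone-deriving extensions are one possible way to get them.

```haskell
{-# LANGUAGE DeriveFunctor, FlexibleContexts, StandaloneDeriving, UndecidableInstances #-}
import Control.Monad (ap, liftM)

data Incr a = Z | S a
  deriving (Eq, Show, Functor)

newtype Scope f a = Scope { unscope :: f (Incr (f a)) }

-- Structural equality only; good enough for the checks in this sketch.
deriving instance Eq (f (Incr (f a))) => Eq (Scope f a)

abstract :: (Monad f, Eq a) => a -> f a -> Scope f a
abstract x e = Scope (liftM k e)
  where
    k y | x == y    = Z
        | otherwise = S (return y)

instantiate :: Monad f => f a -> Scope f a -> f a
instantiate r (Scope e) = e >>= \v -> case v of
  Z   -> r
  S a -> a

-- O(1) lift: wrap the whole expression in a single S.
liftScope :: Monad f => f a -> Scope f a
liftScope = Scope . return . S

-- Substitute inside a scope without touching the bound variable.
bindScope :: Monad f => Scope f a -> (a -> f b) -> Scope f b
bindScope (Scope e) f = Scope (liftM (fmap (>>= f)) e)

infixl 9 :@
data Exp a = Var a | Exp a :@ Exp a | Lam (Scope Exp a)

deriving instance Eq a => Eq (Exp a)

instance Functor Exp where
  fmap = liftM

instance Applicative Exp where
  pure  = Var
  (<*>) = ap

instance Monad Exp where
  Var a    >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
  Lam s    >>= f = Lam (bindScope s f)

lam :: Eq a => a -> Exp a -> Exp a
lam x = Lam . abstract x
```

For instance, `instantiate (Var "y") (abstract "x" (Var "x" :@ Var "z"))` gives back `Var "y" :@ Var "z"`, and neither `abstract` nor `instantiate` mentions `Exp` in its type.
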
  28. 28. Not every language is the untyped lambda calculus. Sometimes you want to bind multiple variables at the same time, say for a pattern or recursive let binding, or to represent all the variables bound by a single quantifier in a single pass.
          So let's go back and enrich our binders so they can bind multiple variables by generalizing generalized de Bruijn.
  29. 29. data Var b a = B b | F a
          data Scope b f a = Scope { unscope :: f (Var b (f a)) }

          instance Monad f => Monad (Scope b f)
          instance MonadTrans (Scope b)

          abstract    :: Monad f => (a -> Maybe b) -> f a -> Scope b f a
          instantiate :: Monad f => (b -> f a) -> Scope b f a -> f a
          fromScope   :: Monad f => Scope b f a -> f (Var b a)
          toScope     :: Monad f => f (Var b a) -> Scope b f a
          substitute  :: (Monad f, Eq a) => a -> f a -> f a -> f a

          class Bound t where
            (>>>=) :: Monad m => t m a -> (a -> m b) -> t m b

          instance Bound (Scope b)
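
Because these signatures only ask for a `Monad`, they can be exercised with any monad at all. The sketch below gives plausible bodies for four of them and, purely for illustration, uses the list monad with strings as the "terms". The function bodies are assumptions written to match the signatures on the slide, not copied from the library, and `template`/`filled` are hypothetical names.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad (liftM)
import Data.List (elemIndex)

data Var b a = B b | F a
  deriving (Eq, Show, Functor)

newtype Scope b f a = Scope { unscope :: f (Var b (f a)) }

-- Bind every variable for which the selector returns Just.
abstract :: Monad f => (a -> Maybe b) -> f a -> Scope b f a
abstract f e = Scope (liftM k e)
  where
    k y = case f y of
      Just b  -> B b
      Nothing -> F (return y)

-- Simultaneously substitute for all bound variables.
instantiate :: Monad f => (b -> f a) -> Scope b f a -> f a
instantiate k (Scope e) = e >>= \v -> case v of
  B b -> k b
  F a -> a

-- Push the F constructors down to the leaves (traditional de Bruijn form).
fromScope :: Monad f => Scope b f a -> f (Var b a)
fromScope (Scope s) = s >>= \v -> case v of
  F e -> liftM F e
  B b -> return (B b)

toScope :: Monad f => f (Var b a) -> Scope b f a
toScope e = Scope (liftM (fmap return) e)

-- Bind 'x' as variable 0 and 'y' as variable 1 in the "term" "axby".
template :: Scope Int [] Char
template = abstract (`elemIndex` "xy") "axby"

filled :: String
filled = instantiate (["<1>", "<2>"] !!) template
```

Here `filled` is `"a<1>b<2>"`: both bound variables are replaced in a single pass, which is the behaviour a `let` or pattern binder needs.
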
  30. 30. data Exp a
            = V a
            | Exp a :@ Exp a
            | Lam (Scope () Exp a)
            | Let [Scope Int Exp a] (Scope Int Exp a)
            deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)

          instance Monad Exp where
            return = V
            V a      >>= f = f a
            (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
            Lam e    >>= f = Lam (e >>>= f)
            Let bs b >>= f = Let (map (>>>= f) bs) (b >>>= f)
  31. 31. abstract1 :: (Monad f, Eq a) => a -> f a -> Scope () f a
          abstract  :: Monad f => (a -> Maybe b) -> f a -> Scope b f a

          lam :: Eq a => a -> Exp a -> Exp a
          lam v b = Lam (abstract1 v b)

          let_ :: Eq a => [(a, Exp a)] -> Exp a -> Exp a
          let_ bs b = Let (map (abstr . snd) bs) (abstr b)
            where abstr = abstract (`elemIndex` map fst bs)

          infixr 0 !
          (!) :: Eq a => a -> Exp a -> Exp a
          (!) = lam
  32. 32. instantiate  :: Monad f => (b -> f a) -> Scope b f a -> f a
          instantiate1 :: Monad f => f a -> Scope () f a -> f a

          whnf :: Exp a -> Exp a
          whnf e@V{}   = e
          whnf e@Lam{} = e
          whnf (f :@ a) = case whnf f of
            Lam b -> whnf (instantiate1 a b)
            f     -> f :@ a
          whnf (Let bs b) = whnf (inst b)
            where
              es   = map inst bs
              inst = instantiate (es !!)
  33. 33. fromScope :: Monad f => Scope b f a -> f (Var b a)
          toScope   :: Monad f => f (Var b a) -> Scope b f a

          nf :: Exp a -> Exp a
          nf e@V{}   = e
          nf (Lam b) = Lam $ toScope $ nf $ fromScope b
          nf (f :@ a) = case whnf f of
            Lam b -> nf (instantiate1 a b)
            f     -> nf f :@ nf a
          nf (Let bs b) = nf (inst b)
            where
              es   = map inst bs
              inst = instantiate (es !!)
  34. 34. closed :: Traversable f => f a -> Maybe (f b)
          closed = traverse (const Nothing)

          A closed term has no free variables, so you can treat the free variable type as anything you want.
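
As a quick illustration of `closed`, the sketch below applies it to the simple nested `Exp` of slide 20 with derived `Traversable`. The choice of that representation and the `identity`/`open` example terms are assumptions for this example only.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

data Incr a = Z | S a
  deriving (Eq, Show, Functor, Foldable, Traversable)

infixl 9 :@
data Exp a = Var a | Exp a :@ Exp a | Lam (Exp (Incr a))
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- Any free variable makes the traversal fail with Nothing.
closed :: Traversable f => f a -> Maybe (f b)
closed = traverse (const Nothing)

-- \x -> x has no free variables.
identity :: Exp String
identity = Lam (Var Z)

-- \x -> x y mentions the free variable "y".
open :: Exp String
open = Lam (Var Z :@ Var (S "y"))
```

`closed identity` succeeds and can be used at any variable type (for example `Maybe (Exp Int)`), while `closed open` is `Nothing`.
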
  35. 35. cooked :: Exp a
          cooked = fromJust $ closed $ let_
            [ ("False",  "f" ! "t" ! V"f")
            , ("True",   "f" ! "t" ! V"t")
            , ("if",     "b" ! "t" ! "f" ! V"b" :@ V"f" :@ V"t")
            , ("Zero",   "z" ! "s" ! V"z")
            , ("Succ",   "n" ! "z" ! "s" ! V"s" :@ V"n")
            , ("one",    V"Succ" :@ V"Zero")
            , ("two",    V"Succ" :@ V"one")
            , ("three",  V"Succ" :@ V"two")
            , ("isZero", "n" ! V"n" :@ V"True" :@ ("m" ! V"False"))
            , ("const",  "x" ! "y" ! V"x")
            , ("Pair",   "a" ! "b" ! "p" ! V"p" :@ V"a" :@ V"b")
            , ("fst",    "ab" ! V"ab" :@ ("a" ! "b" ! V"a"))
            , ("snd",    "ab" ! V"ab" :@ ("a" ! "b" ! V"b"))
            , ("add",    "x" ! "y" ! V"x" :@ V"y" :@ ("n" ! V"Succ" :@ (V"add" :@ V"n" :@ V"y")))
            , ("mul",    "x" ! "y" ! V"x" :@ V"Zero" :@ ("n" ! V"add" :@ V"y" :@ (V"mul" :@ V"n" :@ V"y")))
            , ("fac",    "x" ! V"x" :@ V"one" :@ ("n" ! V"mul" :@ V"x" :@ (V"fac" :@ V"n")))
            , ("eqnat",  "x" ! "y" ! V"x" :@ (V"y" :@ V"True" :@ (V"const" :@ V"False")) :@ ("x1" ! V"y" :@ V"False" :@ ("y1" ! V"eqnat" :@ V"x1" :@ V"y1")))
            , ("sumto",  "x" ! V"x" :@ V"Zero" :@ ("n" ! V"add" :@ V"x" :@ (V"sumto" :@ V"n")))
            , ("n5",     V"add" :@ V"two" :@ V"three")
            , ("n6",     V"add" :@ V"three" :@ V"three")
            , ("n17",    V"add" :@ V"n6" :@ (V"add" :@ V"n6" :@ V"n5"))
            , ("n37",    V"Succ" :@ (V"mul" :@ V"n6" :@ V"n6"))
            , ("n703",   V"sumto" :@ V"n37")
            , ("n720",   V"fac" :@ V"n6")
            ] (V"eqnat" :@ V"n720" :@ (V"add" :@ V"n703" :@ V"n17"))
  36. 36. ghci> nf cooked == ("F" ! "T" ! V"T")
          True
  37. 37. data Exp a
            = V a
            | Exp a :@ Exp a
            | Lam !Int (Pat Exp a) (Scope Int Exp a)
            | Let !Int [Scope Int Exp a] (Scope Int Exp a)
            | Case (Exp a) [Alt Exp a]
            deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)

          data Pat f a
            = VarP
            | WildP
            | AsP (Pat f a)
            | ConP String [Pat f a]
            | ViewP (Scope Int f a) (Pat f a)
            deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)

          data Alt f a = Alt !Int (Pat f a) (Scope Int f a)
            deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)
  38. 38. instance Monad Exp where
            return = V
            V a        >>= f = f a
            (x :@ y)   >>= f = (x >>= f) :@ (y >>= f)
            Lam n p e  >>= f = Lam n (p >>>= f) (e >>>= f)
            Let n bs e >>= f = Let n (map (>>>= f) bs) (e >>>= f)
            Case e as  >>= f = Case (e >>= f) (map (>>>= f) as)

          instance Bound Pat where
            VarP      >>>= _ = VarP
            WildP     >>>= _ = WildP
            AsP p     >>>= f = AsP (p >>>= f)
            ConP g ps >>>= f = ConP g (map (>>>= f) ps)
            ViewP e p >>>= f = ViewP (e >>>= f) (p >>>= f)

          instance Bound Alt where
            Alt n p b >>>= f = Alt n (p >>>= f) (b >>>= f)
  39. 39. data P a = P { pattern :: [a] -> Pat Exp a, bindings :: [a] }

          varp :: a -> P a
          varp a = P (const VarP) [a]

          wildp :: P a
          wildp = P (const WildP) []

          conp :: String -> [P a] -> P a
          conp g ps = P (ConP g . go ps) (ps >>= bindings)
            where
              go (P p as : ps) bs = p bs : go ps (bs ++ as)
              go []            _  = []

          lam :: Eq a => P a -> Exp a -> Exp a
          lam (P p as) t = Lam (length as) (p []) (abstract (`elemIndex` as) t)

          ghci> lam (varp "x") (V "x")
          Lam 1 VarP (Scope (V (B 0)))
          ghci> lam (conp "Hello" [varp "x", wildp]) (V "y")
          Lam 1 (ConP "Hello" [VarP,WildP]) (Scope (V (F (V "y"))))
  40. 40. Deriving Eq, Ord, Show and Read requires some tomfoolery. The issue is that Scope uses polymorphic recursion.
          So the most direct way of implementing Eq (Scope b f a) would require
              instance (Eq (f (Var b (f a))), Eq (Var b (f a)), Eq (f a), Eq a) => Eq (Scope b f a)
          And then Exp would require:
              instance (Eq a, Eq (Pat Exp a), Eq (Scope Int Exp a), Eq (Alt Exp a)) => Eq (Exp a)
          Plus all the things required by Alt, Pat, and Scope!
          Moreover, these would require flexible contexts, taking us out of Haskell 98/2010.
          Blech!
  41. 41. My prelude-extras package defines a number of boring typeclasses like:
              class Eq1 f where
                (==#) :: Eq a => f a -> f a -> Bool
                (/=#) :: Eq a => f a -> f a -> Bool

              class Eq1 f => Ord1 f where
                compare1 :: Ord a => f a -> f a -> Ordering

              class Show1 f where
                showsPrec1 :: Show a => Int -> f a -> ShowS

              class Read1 f where
                readsPrec1 :: Read a => Int -> ReadS (f a)
                readList1  :: Read a => ReadS [f a]
  42. 42. Bound defines:
              instance (Functor f, Show b, Show1 f, Show a) => Show (Scope b f a)
              instance (Functor f, Read b, Read1 f, Read a) => Read (Scope b f a)
              instance (Monad f, Ord b, Ord1 f, Ord a) => Ord (Scope b f a)
              instance (Monad f, Eq b, Eq1 f, Eq a) => Eq (Scope b f a)
          So you just need to define
              instance Eq1 Exp where (==#) = (==)
              instance Ord1 Exp where compare1 = compare
              instance Show1 Exp where showsPrec1 = showsPrec
              instance Read1 Exp where readsPrec1 = readsPrec
          Why do some use Monad? Ord and Eq perform a non-structural equality comparison so that (==) is alpha-equality!
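
One way to read the last sentence: if equality first normalises both scopes with `fromScope`, then two scopes that differ only in where the `F` wrappers sit compare equal. The sketch below illustrates that idea with a locally defined `Eq1` class standing in for the one from prelude-extras and the list monad standing in for an expression type. The instance body is an assumption consistent with the instance head on the slide, not a quotation of the library source.

```haskell
import Control.Monad (liftM)

data Var b a = B b | F a
  deriving (Eq, Show)

newtype Scope b f a = Scope { unscope :: f (Var b (f a)) }

-- Local stand-in for prelude-extras' Eq1 (only the method needed here).
class Eq1 f where
  (==#) :: Eq a => f a -> f a -> Bool

instance Eq1 [] where
  (==#) = (==)

-- Push every F down to the leaves so the placement of F no longer matters.
fromScope :: Monad f => Scope b f a -> f (Var b a)
fromScope (Scope s) = s >>= \v -> case v of
  F e -> liftM F e
  B b -> return (B b)

-- The Monad constraint is what lets (==) normalise before comparing.
instance (Monad f, Eq b, Eq1 f, Eq a) => Eq (Scope b f a) where
  a == b = fromScope a ==# fromScope b

-- Same scope, with one F around "ab" versus one F per leaf.
coarse, fine :: Scope () [] Char
coarse = Scope [F "ab", B ()]
fine   = Scope [F "a", F "b", B ()]
```

`coarse == fine` holds even though the two values are structurally different, which is the quotienting mentioned on slide 26.
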
  43. 43. We can define languages that have strongly typed variables by moving to much scarier types. =)
              type Nat f g = forall x. f x -> g x

              class HFunctor t where
                hmap :: Nat f g -> Nat (t f) (t g)

              class HFunctor t => HTraversable t where
                htraverse :: Applicative m => (forall x. f x -> m (g x)) -> t f a -> m (t g a)

              class HFunctor t => HMonad t where
                hreturn :: Nat f (t f)
                (>>-)   :: t f a -> Nat f (t g) -> t g a
  44. 44. data Equal a b where
            Refl :: Equal a a

          class EqF f where
            (==?) :: f a -> f b -> Maybe (Equal a b)

          data Var b f a where
            B :: b a -> Var b f a
            F :: f a -> Var b f a

          newtype Scope b t f a = Scope { unscope :: t (Var b (t f)) a }

          abstract    :: HMonad t => (forall x. f x -> Maybe (b x)) -> Nat (t f) (Scope b t f)
          instantiate :: HMonad t => Nat b (t f) -> Nat (Scope b t f) (t f)

          class HBound s where
            (>>>-) :: HMonad t => s t f a -> Nat f (t g) -> s t g a
  45. 45. Dependently typed languages build up a lot of crap in memory. It'd be nice to share memory for it, since most of it is very repetitive.
  46. 46. Bound provides a small API for dealing with abstraction/instantiation for complex binders that combines the nice parts of “I am not a number: I am a free variable” with the “de Bruijn notation as a nested data type” while avoiding the complexities of either.
          - You just supply it a Monad and Traversable
          - No variable supply is needed, no pool of names
          - Substitution is very efficient
          - Introduces no exotic or illegal terms
          - Simultaneous substitution for complex binders
          - Your code never sees a de Bruijn index
  47. 47. data Ix :: [*] -> * -> * where
            Z :: Ix (a : as) a
            S :: Ix as b -> Ix (a : as) b

          data Vec :: (* -> *) -> [*] -> * where
            HNil  :: Vec f []
            (:::) :: f b -> Vec f bs -> Vec f (b : bs)

          data Lit t where
            Integer :: Integer -> Lit Integer
            Double  :: Double -> Lit Double
            String  :: String -> Lit String

          data Remote :: (* -> *) -> * -> * where
            Var :: f a -> Remote f a
            Lit :: Lit a -> Remote f a
            Lam :: Scope (Equal b) Remote f a -> Remote f (b -> a)
            Let :: Vec (Scope (Ix bs) Remote f) bs -> Scope (Ix bs) Remote f a -> Remote f a
            Ap  :: Remote f (a -> b) -> Remote f a -> Remote f b
  48. 48. lam_ :: EqF f => f a -> Remote f b -> Remote f (a -> b)
          lam_ v f = Lam (abstract (v ==?) f)

          -- let_ actually winds up becoming much trickier to define,
          -- requiring a MonadFix and a helper monad.
          two12121212 = let_ $ mdo
            x <- def (cons 1 z)
            z <- def (cons 2 x)
            return z