Compare commits

..

2 commits

Author SHA1 Message Date
e0b357450c support directly binding functions in let 2024-11-23 10:35:58 -08:00
780ab52809 added let expressions 2024-11-23 09:16:32 -08:00
7 changed files with 187 additions and 31 deletions

View file

@ -15,7 +15,29 @@ To be perfectly clear, `λ` abstractions can be written with either “λ&rd
`Π` types can be written with either “Π”, “”, or “forall”, and are separated from their bodies with a “,”. Arrow types can be written “->” or “→”. Like with `λ` abstractions, binders with the same type can be grouped, and multiple binders can occur between the “Π” and the “,”.
Definitions work similarly, having abstract syntax as shown below.
`Let` expressions have syntax shown below.
let ( (<ident> := <expr>) )+ in <expr> end
Below is a more concrete example.
let (a := (and_elim_l A (and B C) h))
(bc := (and_elim_r A (and B C) h))
(b := (and_elim_l B C bc))
(c := (and_elim_r B C bc))
in
and_intro (and A B) C (and_intro A B a b) c
end
You can also directly bind functions. Here&rsquo;s an example.
let (f (A : *) (x : A) := x) in
f x
end
The syntax for binding functions is just like with definitions.
Definitions have abstract syntax as shown below.
<ident> (<ident> : <type>)* : <type>? := <term> | axiom;
@ -129,14 +151,26 @@ This means our proofs were accepted. Furthermore, as a sanity check, we can see
## Substantive
### TODO Let-expressions
### DONE Let-expressions
I vaguely remember from reading Coq&rsquo;art that Coq does something special with `let` expressions, so I&rsquo;ll maybe want to check that out. I tried implementing `let` as syntax sugar for an immediately called function, but that proved to be a massive mess with how I&rsquo;m handling things. `let` expressions would definitely be handy for factoring out complex sub expressions.
I decided to bake `let` expressions into the formal language, rather than being a layer on top of the syntax. This means we can treat typing `let` expressions different from functions. The only substantial difference between
let (x := value) in body end
and
(fun (x : type) => body) value
is that the latter must have a proper function type while the former doesn&rsquo;t need to. So, for instance,
let (x := *) in ...
is possible, while you could not write a function whose argument is type `□`. We are justified in doing this because we always know what the argument is, and can just immediately substitute it, both into the value of the body, as well as the type of the body.
### TODO Sections
Coq-style sections would be very handy, and probably relatively easy to implement. Upon parsing a definition inside a section, will somehow need to look ahead to see what variables are used to see how I need to modify `binders`, or just make every definition require every section variable as an argument.
Coq-style sections would be very handy, and probably *relatively* easy to implement (compared to everything else on this todo list). Upon parsing a definition inside a section, will somehow need to look ahead to see what variables are used to see how I need to modify `binders`, or just make every definition require every section variable as an argument.
### TODO Inference
@ -146,12 +180,12 @@ Not decidable, but I might be able to implement some basic unification algorithm
### TODO Implicits
Much, much more useful than [inference](#orgc6eb0b1), implicit arguments would be amazing. It also seems a lot more complicated, but any system for dealing with implicit arguments is far better than none.
Much, much more useful than [inference](#orgb7d6c6f), implicit arguments would be amazing. It also seems a lot more complicated, but any system for dealing with implicit arguments is far better than none.
### TODO Module System
A proper module system would be wonderful. To me, ML style modules with structures, signatures, and functors seems like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of `and`&rsquo;s like I currently have; especially painful without [implicits](#orged32318)) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity.
A proper module system would be wonderful. To me, ML style modules with structures, signatures, and functors seems like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of `and`&rsquo;s like I currently have; especially painful without [implicits](#orgc3b40df)) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity. The [F-ing modules](https://www.cambridge.org/core/services/aop-cambridge-core/content/view/B573FA00832D55D4878863DE1725D90B/S0956796814000264a.pdf/f-ing-modules.pdf) paper is probably a good reference.
### TODO Universes?
@ -197,6 +231,11 @@ Probably a smart idea.
Right now, if there&rsquo;s a failure, everything just stops immediately. More incremental parsing/typechecking could pave the way for more interactivity, e.g. development with holes, an LSP server, etc., not to mention better error messages.
### TODO Multiple levels of AST
Right now, the parsing and typechecking kind of happens all at once on a single syntax representation. As I add fancier and fancier elaboration, it might be a good idea to have multiple syntax representations. So we&rsquo;d have one level of syntax representing what is actually in the file (modulo some easy elaboration like with function definitions), and through a series of transformations transform it into something like the current `Expr` type with all the information needed for typechecking and all the complex language features removed.
### TODO Alternate syntax
I&rsquo;ve had a bunch of ideas for a more mathematician-friendly syntax bouncing around my head for a while. Implementing one of them would be awesome, but probably quite tricky.

View file

@ -16,7 +16,29 @@ To be perfectly clear, =λ= abstractions can be written with either "λ" or "fun
=Π= types can be written with either "Π", "∀", or "forall", and are separated from their bodies with a ",". Arrow types can be written "->" or "\to". Like with =λ= abstractions, binders with the same type can be grouped, and multiple binders can occur between the "Π" and the ",".
Definitions work similarly, having abstract syntax as shown below.
=Let= expressions have syntax shown below.
#+begin_src
let ( (<ident> := <expr>) )+ in <expr> end
#+end_src
Below is a more concrete example.
#+begin_src
let (a := (and_elim_l A (and B C) h))
(bc := (and_elim_r A (and B C) h))
(b := (and_elim_l B C bc))
(c := (and_elim_r B C bc))
in
and_intro (and A B) C (and_intro A B a b) c
end
#+end_src
You can also directly bind functions. Here's an example.
#+begin_src
let (f (A : *) (x : A) := x) in
f x
end
#+end_src
The syntax for binding functions is just like with definitions.
Definitions have abstract syntax as shown below.
#+begin_src
<ident> (<ident> : <type>)* : <type>? := <term> | axiom;
#+end_src
@ -122,11 +144,23 @@ This means our proofs were accepted. Furthermore, as a sanity check, we can see
** Substantive
*** TODO Let-expressions
I vaguely remember from reading Coq'art that Coq does something special with =let= expressions, so I'll maybe want to check that out. I tried implementing =let= as syntax sugar for an immediately called function, but that proved to be a massive mess with how I'm handling things. =let= expressions would definitely be handy for factoring out complex sub expressions.
*** DONE Let-expressions
I decided to bake =let= expressions into the formal language, rather than being a layer on top of the syntax. This means we can treat typing =let= expressions different from functions. The only substantial difference between
#+begin_src
let (x := value) in body end
#+end_src
and
#+begin_src
(fun (x : type) => body) value
#+end_src
is that the latter must have a proper function type while the former doesn't need to. So, for instance,
#+begin_src
let (x := *) in ...
#+end_src
is possible, while you could not write a function whose argument is type =□=. We are justified in doing this because we always know what the argument is, and can just immediately substitute it, both into the value of the body, as well as the type of the body.
*** TODO Sections
Coq-style sections would be very handy, and probably relatively easy to implement. Upon parsing a definition inside a section, will somehow need to look ahead to see what variables are used to see how I need to modify =binders=, or just make every definition require every section variable as an argument.
Coq-style sections would be very handy, and probably /relatively/ easy to implement (compared to everything else on this todo list). Upon parsing a definition inside a section, will somehow need to look ahead to see what variables are used to see how I need to modify =binders=, or just make every definition require every section variable as an argument.
*** TODO Inference
Not decidable, but I might be able to implement some basic unification algorithm, or switch to bidirectional type checking. This isn't super necessary though, I find leaving off the types of arguments to generally be a bad idea, but in some cases it can be handy, especially not at the top level.
@ -135,7 +169,7 @@ Not decidable, but I might be able to implement some basic unification algorithm
Much, much more useful than [[Inference][inference]], implicit arguments would be amazing. It also seems a lot more complicated, but any system for dealing with implicit arguments is far better than none.
*** TODO Module System
A proper module system would be wonderful. To me, ML style modules with structures, signatures, and functors seems like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of =and='s like I currently have; especially painful without [[Implicits][implicits]]) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity.
A proper module system would be wonderful. To me, ML style modules with structures, signatures, and functors seems like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of =and='s like I currently have; especially painful without [[Implicits][implicits]]) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity. The [[https://www.cambridge.org/core/services/aop-cambridge-core/content/view/B573FA00832D55D4878863DE1725D90B/S0956796814000264a.pdf/f-ing-modules.pdf][F-ing modules]] paper is probably a good reference.
*** TODO Universes?
Not super necessary, but occasionally extremely useful. Could be fun, idk.
@ -163,6 +197,9 @@ Probably a smart idea.
*** TODO More incremental parsing/typechecking
Right now, if there's a failure, everything just stops immediately. More incremental parsing/typechecking could pave the way for more interactivity, e.g. development with holes, an LSP server, etc., not to mention better error messages.
*** TODO Multiple levels of AST
Right now, the parsing and typechecking kind of happens all at once on a single syntax representation. As I add fancier and fancier elaboration, it might be a good idea to have multiple syntax representations. So we'd have one level of syntax representing what is actually in the file (modulo some easy elaboration like with function definitions), and through a series of transformations transform it into something like the current =Expr= type with all the information needed for typechecking and all the complex language features removed.
*** TODO Alternate syntax
I've had a bunch of ideas for a more mathematician-friendly syntax bouncing around my head for a while. Implementing one of them would be awesome, but probably quite tricky.

View file

@ -151,11 +151,13 @@ or_comm (A B : *) (h : or A B) : or B A :=
-- A ∧ (B ∧ C) => (A ∧ B) ∧ C
and_assoc_l (A B C : *) (h : and A (and B C)) : and (and A B) C :=
and_intro (and A B) C
(and_intro A B
(and_elim_l A (and B C) h)
(and_elim_l B C (and_elim_r A (and B C) h)))
(and_elim_r B C (and_elim_r A (and B C) h));
let (a := (and_elim_l A (and B C) h))
(bc := (and_elim_r A (and B C) h))
(b := (and_elim_l B C bc))
(c := (and_elim_r B C bc))
in
and_intro (and A B) C (and_intro A B a b) c
end;
-- (A ∧ B) ∧ C => A ∧ (B ∧ C)
and_assoc_r (A B C : *) (h : and (and A B) C) : and A (and B C) :=

View file

@ -49,6 +49,16 @@ findType g (Abs x a m) = do
findType g (Pi _ a b) = do
validateType_ g a
validateType (incIndices a : map incIndices g) b
findType g (Let _ v b) = do
a <- findType g v
validateType_ g a
res <- findType (incIndices a : map incIndices g) b
pure $ subst 0 a res
-- this is kinda goofy, it's just like a function, except the resulting type
-- of the body doesn't need to result in a valid function type
-- this means things like `let x := * in ...` would be allowed, even though
-- you couldn't write a function that takes something of type `□` as an argument
checkType :: Env -> Expr -> Result Expr
checkType env t = runReaderT (findType [] t) env

View file

@ -47,6 +47,7 @@ subst _ _ Square = Square
subst k s (App m n) = App (subst k s m) (subst k s n)
subst k s (Abs x m n) = Abs x (subst k s m) (subst (k + 1) (incIndices s) n)
subst k s (Pi x m n) = Pi x (subst k s m) (subst (k + 1) (incIndices s) n)
subst k s (Let x v b) = Let x (subst k s v) (subst (k + 1) (incIndices s) b)
envLookupVal :: Text -> ReaderT Env Result Expr
envLookupVal n = asks ((_val <$>) . M.lookup n) >>= maybe (throwError $ UnboundVariable n) pure
@ -54,7 +55,7 @@ envLookupVal n = asks ((_val <$>) . M.lookup n) >>= maybe (throwError $ UnboundV
envLookupTy :: Text -> ReaderT Env Result Expr
envLookupTy n = asks ((_ty <$>) . M.lookup n) >>= maybe (throwError $ UnboundVariable n) pure
-- reduce until β reducts are impossible in head position
-- reduce until β reducts or let simplifications are impossible in head position
whnf :: Expr -> ReaderT Env Result Expr
whnf (App (Abs _ _ v) n) = whnf $ subst 0 n v
whnf (App m n) = do
@ -63,6 +64,7 @@ whnf (App m n) = do
then pure $ App m n
else whnf $ App m' n
whnf (Free n) = envLookupVal n >>= whnf
whnf (Let _ v b) = whnf $ subst 0 v b
whnf e = pure e
reduce :: Expr -> ReaderT Env Result Expr
@ -71,6 +73,7 @@ reduce (App m n) = App <$> reduce m <*> reduce n
reduce (Abs x t v) = Abs x <$> reduce t <*> reduce v
reduce (Pi x t v) = Pi x <$> reduce t <*> reduce v
reduce (Free n) = envLookupVal n
reduce (Let _ v b) = pure $ subst 0 v b
reduce e = pure e
normalize :: Expr -> ReaderT Env Result Expr

View file

@ -9,6 +9,7 @@ data Expr where
App :: Expr -> Expr -> Expr
Abs :: Text -> Expr -> Expr -> Expr
Pi :: Text -> Expr -> Expr -> Expr
Let :: Text -> Expr -> Expr -> Expr
deriving (Show, Ord)
instance Eq Expr where
@ -20,6 +21,7 @@ instance Eq Expr where
(App e1 e2) == (App f1 f2) = e1 == f1 && e2 == f2
(Abs _ t1 b1) == (Abs _ t2 b2) = t1 == t2 && b1 == b2
(Pi _ t1 b1) == (Pi _ t2 b2) = t1 == t2 && b1 == b2
(Let _ v1 b1) == (Let _ v2 b2) = v1 == v2 && b1 == b2
_ == _ = False
occursFree :: Integer -> Expr -> Bool
@ -31,6 +33,7 @@ occursFree _ Square = False
occursFree n (App a b) = on (||) (occursFree n) a b
occursFree n (Abs _ a b) = occursFree n a || occursFree (n + 1) b
occursFree n (Pi _ a b) = occursFree n a || occursFree (n + 1) b
occursFree n (Let _ v b) = occursFree n v || occursFree (n + 1) b
shiftIndices :: Integer -> Integer -> Expr -> Expr
shiftIndices d c (Var x k)
@ -43,6 +46,7 @@ shiftIndices _ _ Square = Square
shiftIndices d c (App m n) = App (shiftIndices d c m) (shiftIndices d c n)
shiftIndices d c (Abs x m n) = Abs x (shiftIndices d c m) (shiftIndices d (c + 1) n)
shiftIndices d c (Pi x m n) = Pi x (shiftIndices d c m) (shiftIndices d (c + 1) n)
shiftIndices d c (Let x v b) = Let x (shiftIndices d c v) (shiftIndices d (c + 1) b)
incIndices :: Expr -> Expr
incIndices = shiftIndices 1 0
@ -52,12 +56,31 @@ incIndices = shiftIndices 1 0
parenthesize :: Text -> Text
parenthesize s = "(" <> s <> ")"
collectLambdas :: Expr -> ([(Text, Expr)], Expr)
type Param = (Text, Expr)
type ParamGroup = ([Text], Expr)
type Binding = (Text, [ParamGroup], Expr)
collectLambdas :: Expr -> ([Param], Expr)
collectLambdas (Abs x ty body) = ((x, ty) : params, final)
where
(params, final) = collectLambdas body
collectLambdas e = ([], e)
collectLets :: Expr -> ([Binding], Expr)
collectLets (Let x val body) = ((x, params', val') : bindings, final)
where
(bindings, final) = collectLets body
(params, val') = collectLambdas val
params' = groupParams params
collectLets e = ([], e)
collectPis :: Expr -> ([Param], Expr)
collectPis p@(Pi "" _ _) = ([], p)
collectPis (Pi x ty body) = ((x, ty) : params, final)
where
(params, final) = collectPis body
collectPis e = ([], e)
cleanNames :: Expr -> Expr
cleanNames (App m n) = App (cleanNames m) (cleanNames n)
cleanNames (Abs x ty body) = Abs x (cleanNames ty) (cleanNames body)
@ -66,14 +89,7 @@ cleanNames (Pi x ty body)
| otherwise = Pi "" ty (cleanNames body)
cleanNames e = e
collectPis :: Expr -> ([(Text, Expr)], Expr)
collectPis p@(Pi "" _ _) = ([], p)
collectPis (Pi x ty body) = ((x, ty) : params, final)
where
(params, final) = collectPis body
collectPis e = ([], e)
groupParams :: [(Text, Expr)] -> [([Text], Expr)]
groupParams :: [Param] -> [ParamGroup]
groupParams = foldr addParam []
where
addParam :: (Text, Expr) -> [([Text], Expr)] -> [([Text], Expr)]
@ -82,9 +98,18 @@ groupParams = foldr addParam []
| incIndices t == s = (x : xs, t) : rest
| otherwise = ([x], t) : l
showParamGroup :: ([Text], Expr) -> Text
showParamGroup :: ParamGroup -> Text
showParamGroup (ids, ty) = parenthesize $ unwords ids <> " : " <> pretty ty
showBinding :: Binding -> Text
showBinding (ident, params, val) =
parenthesize $
ident
<> " "
<> unwords (map showParamGroup params)
<> " := "
<> pretty val
helper :: Integer -> Expr -> Text
helper _ (Var s _) = s
helper _ (Free s) = s
@ -107,6 +132,10 @@ helper k e@(Abs{}) = if k >= 1 then parenthesize res else res
(params, body) = collectLambdas e
grouped = showParamGroup <$> groupParams params
res = "λ " <> unwords grouped <> " . " <> pretty body
helper _ e@(Let{}) = res
where
(binds, body) = collectLets e
res = "let " <> unwords (map showBinding binds) <> " in " <> pretty body <> " end"
pretty :: Expr -> Text
pretty = helper 0 . cleanNames

View file

@ -49,11 +49,18 @@ lexeme = L.lexeme skipSpace
eat :: Text -> Parser ()
eat = void . lexeme . chunk
keywords :: [Text]
keywords = ["forall", "let", "in", "end", "fun"]
pIdentifier :: Parser Text
pIdentifier = label "identifier" $ lexeme $ do
pIdentifier = try $ label "identifier" $ lexeme $ do
firstChar <- letterChar <|> char '_'
rest <- many $ alphaNumChar <|> char '_'
return $ T.pack (firstChar : rest)
let ident = T.pack (firstChar : rest)
when (ident `elem` keywords) $
fail $
"Reserved word: " ++ T.unpack ident
pure ident
pVar :: Parser Expr
pVar = label "variable" $ lexeme $ do
@ -97,10 +104,32 @@ pPAbs :: Parser Expr
pPAbs = lexeme $ label "Π-abstraction" $ withBinders $ do
_ <- defChoice $ "" :| ["forall", ""]
params <- pSomeParams
_ <- defChoice $ pure ","
eat ","
body <- pExpr
pure $ foldr (uncurry Pi) body params
pBinding :: Parser (Text, Expr)
pBinding = lexeme $ label "binding" $ do
env <- get
eat "("
ident <- pIdentifier
params <- pManyParams
eat ":="
value <- pExpr
eat ")"
put env
modify $ over bindsL (ident :)
pure (ident, foldr (uncurry Abs) value params)
pLet :: Parser Expr
pLet = lexeme $ label "let expression" $ withBinders $ do
eat "let"
bindings <- some pBinding
eat "in"
body <- try pExpr
eat "end"
pure $ foldr (uncurry Let) body bindings
pArrow :: Parser Expr
pArrow = lexeme $ label "->" $ do
a <- pAppTerm
@ -167,7 +196,14 @@ pTerm =
]
pAppTerm :: Parser Expr
pAppTerm = lexeme $ pLAbs <|> pPAbs <|> pApp
pAppTerm =
lexeme $
choice
[ pLAbs
, pPAbs
, pLet
, pApp
]
pExpr :: Parser Expr
pExpr = lexeme $ try pArrow <|> pAppTerm