diff --git a/README.md b/README.md
index 253db1f..a10b6e2 100644
--- a/README.md
+++ b/README.md
@@ -15,6 +15,20 @@ To be perfectly clear, `λ` abstractions can be written with either “λ”
 `Π` types can be written with either “Π”, “∀”, or “forall”, and are separated from their bodies with a “,”. Arrow types can be written “->” or “→”. Like with `λ` abstractions, binders with the same type can be grouped, and multiple binders can occur between the “Π” and the “,”.
 
+Finally, `let` expressions have the syntax shown below.
+
+    let ( (<ident> := <expr>) )+ in <expr> end
+
+Below is a more concrete example.
+
+    let (a := (and_elim_l A (and B C) h))
+        (bc := (and_elim_r A (and B C) h))
+        (b := (and_elim_l B C bc))
+        (c := (and_elim_r B C bc))
+    in
+    and_intro (and A B) C (and_intro A B a b) c
+    end
+
 Definitions work similarly, having abstract syntax as shown below.
 
     ( : )* : ? := | axiom;
 
@@ -129,14 +143,26 @@ This means our proofs were accepted. Furthermore, as a sanity check, we can see
 
 ## Substantive
 
-### TODO Let-expressions
+### DONE Let-expressions
 
-I vaguely remember from reading Coq’art that Coq does something special with `let` expressions, so I’ll maybe want to check that out. I tried implementing `let` as syntax sugar for an immediately called function, but that proved to be a massive mess with how I’m handling things. `let` expressions would definitely be handy for factoring out complex sub expressions.
+I decided to bake `let` expressions into the formal language, rather than treating them as a layer on top of the syntax. This means we can type `let` expressions differently from functions. The only substantial difference between
+
+    let (x := value) in body end
+
+and
+
+    (fun (x : type) => body) value
+
+is that the latter must have a proper function type while the former doesn’t need to. So, for instance,
+
+    let (x := *) in ...
+
+is possible, while you could not write a function whose argument has type `□`. We are justified in doing this because we always know what the argument is, and can just immediately substitute it, both into the value of the body and into the type of the body.
 
 ### TODO Sections
 
-Coq-style sections would be very handy, and probably relatively easy to implement. Upon parsing a definition inside a section, will somehow need to look ahead to see what variables are used to see how I need to modify `binders`, or just make every definition require every section variable as an argument.
+Coq-style sections would be very handy, and probably *relatively* easy to implement (compared to everything else on this todo list). Upon parsing a definition inside a section, I will somehow need to look ahead to see which section variables are used so I know how to modify `binders`, or just make every definition require every section variable as an argument.
 
 ### TODO Inference
 
@@ -146,12 +172,12 @@ Not decidable, but I might be able to implement some basic unification algorithm
 
 ### TODO Implicits
 
-Much, much more useful than [inference](#orgc6eb0b1), implicit arguments would be amazing. It also seems a lot more complicated, but any system for dealing with implicit arguments is far better than none.
+Much, much more useful than [inference](#org31a76ab), implicit arguments would be amazing. It also seems a lot more complicated, but any system for dealing with implicit arguments is far better than none.
 
 ### TODO Module System
 
-A proper module system would be wonderful. To me, ML style modules with structures, signatures, and functors seems like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of `and`’s like I currently have; especially painful without [implicits](#orged32318)) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity.
+A proper module system would be wonderful. To me, ML-style modules with structures, signatures, and functors seem like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of `and`’s like I currently have; especially painful without [implicits](#orge9a1bd4)) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity. The [F-ing modules](https://www.cambridge.org/core/services/aop-cambridge-core/content/view/B573FA00832D55D4878863DE1725D90B/S0956796814000264a.pdf/f-ing-modules.pdf) paper is probably a good reference.
 
 ### TODO Universes?
 
@@ -197,6 +223,11 @@ Probably a smart idea.
 Right now, if there’s a failure, everything just stops immediately. More incremental parsing/typechecking could pave the way for more interactivity, e.g. development with holes, an LSP server, etc., not to mention better error messages.
 
+### TODO Multiple levels of AST
+
+Right now, the parsing and typechecking kind of happen all at once on a single syntax representation. As I add fancier and fancier elaboration, it might be a good idea to have multiple syntax representations. So we’d have one level of syntax representing what is actually in the file (modulo some easy elaboration, like with function definitions), and then transform it, through a series of passes, into something like the current `Expr` type, with all the information needed for typechecking and all the complex language features removed.
+
+
 ### TODO Alternate syntax
 
 I’ve had a bunch of ideas for a more mathematician-friendly syntax bouncing around my head for a while. Implementing one of them would be awesome, but probably quite tricky.
diff --git a/README.org b/README.org
index 57fc873..b33858e 100644
--- a/README.org
+++ b/README.org
@@ -16,6 +16,21 @@ To be perfectly clear, =λ= abstractions can be written with either "λ" or "fun
 =Π= types can be written with either "Π", "∀", or "forall", and are separated from their bodies with a ",". Arrow types can be written "->" or "\to". Like with =λ= abstractions, binders with the same type can be grouped, and multiple binders can occur between the "Π" and the ",".
 
+Finally, =let= expressions have the syntax shown below.
+#+begin_src
+  let ( (<ident> := <expr>) )+ in <expr> end
+#+end_src
+Below is a more concrete example.
+#+begin_src
+  let (a := (and_elim_l A (and B C) h))
+      (bc := (and_elim_r A (and B C) h))
+      (b := (and_elim_l B C bc))
+      (c := (and_elim_r B C bc))
+  in
+  and_intro (and A B) C (and_intro A B a b) c
+  end
+#+end_src
+
 Definitions work similarly, having abstract syntax as shown below.
 #+begin_src
   ( : )* : ? := | axiom;
 #+end_src
@@ -122,11 +137,23 @@ This means our proofs were accepted. Furthermore, as a sanity check, we can see
 
 ** Substantive
 
-*** TODO Let-expressions
-I vaguely remember from reading Coq'art that Coq does something special with =let= expressions, so I'll maybe want to check that out. I tried implementing =let= as syntax sugar for an immediately called function, but that proved to be a massive mess with how I'm handling things. =let= expressions would definitely be handy for factoring out complex sub expressions.
+*** DONE Let-expressions
+I decided to bake =let= expressions into the formal language, rather than treating them as a layer on top of the syntax. This means we can type =let= expressions differently from functions. The only substantial difference between
+#+begin_src
+  let (x := value) in body end
+#+end_src
+and
+#+begin_src
+  (fun (x : type) => body) value
+#+end_src
+is that the latter must have a proper function type while the former doesn't need to. So, for instance,
+#+begin_src
+  let (x := *) in ...
+#+end_src
+is possible, while you could not write a function whose argument has type =□=. We are justified in doing this because we always know what the argument is, and can just immediately substitute it, both into the value of the body and into the type of the body.
 
 *** TODO Sections
-Coq-style sections would be very handy, and probably relatively easy to implement. Upon parsing a definition inside a section, will somehow need to look ahead to see what variables are used to see how I need to modify =binders=, or just make every definition require every section variable as an argument.
+Coq-style sections would be very handy, and probably /relatively/ easy to implement (compared to everything else on this todo list). Upon parsing a definition inside a section, I will somehow need to look ahead to see which section variables are used so I know how to modify =binders=, or just make every definition require every section variable as an argument.
 
 *** TODO Inference
 Not decidable, but I might be able to implement some basic unification algorithm, or switch to bidirectional type checking. This isn't super necessary though, I find leaving off the types of arguments to generally be a bad idea, but in some cases it can be handy, especially not at the top level.
 
@@ -135,7 +162,7 @@ Not decidable, but I might be able to implement some basic unification algorithm
 *** TODO Implicits
 Much, much more useful than [[Inference][inference]], implicit arguments would be amazing. It also seems a lot more complicated, but any system for dealing with implicit arguments is far better than none.
 
 *** TODO Module System
-A proper module system would be wonderful. To me, ML style modules with structures, signatures, and functors seems like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of =and='s like I currently have; especially painful without [[Implicits][implicits]]) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity.
+A proper module system would be wonderful. To me, ML-style modules with structures, signatures, and functors seem like the right way to handle algebraic structures for a relatively simple language, rather than records (or, worse, a bunch of =and='s like I currently have; especially painful without [[Implicits][implicits]]) or type classes (probably much harder, but could be nicer), but any way of managing scope, importing files, etc. is a necessity. The [[https://www.cambridge.org/core/services/aop-cambridge-core/content/view/B573FA00832D55D4878863DE1725D90B/S0956796814000264a.pdf/f-ing-modules.pdf][F-ing modules]] paper is probably a good reference.
 
 *** TODO Universes?
 Not super necessary, but occasionally extremely useful. Could be fun, idk.
 
@@ -163,6 +190,9 @@ Probably a smart idea.
 *** TODO More incremental parsing/typechecking
 Right now, if there's a failure, everything just stops immediately. More incremental parsing/typechecking could pave the way for more interactivity, e.g. development with holes, an LSP server, etc., not to mention better error messages.
 
+*** TODO Multiple levels of AST
+Right now, the parsing and typechecking kind of happen all at once on a single syntax representation. As I add fancier and fancier elaboration, it might be a good idea to have multiple syntax representations. So we'd have one level of syntax representing what is actually in the file (modulo some easy elaboration, like with function definitions), and then transform it, through a series of passes, into something like the current =Expr= type, with all the information needed for typechecking and all the complex language features removed.
+
 *** TODO Alternate syntax
 I've had a bunch of ideas for a more mathematician-friendly syntax bouncing around my head for a while. Implementing one of them would be awesome, but probably quite tricky.
 
diff --git a/examples/logic.pg b/examples/logic.pg
index df715ce..bb15f35 100644
--- a/examples/logic.pg
+++ b/examples/logic.pg
@@ -151,11 +151,13 @@ or_comm (A B : *) (h : or A B) : or B A :=
 
 -- A ∧ (B ∧ C) => (A ∧ B) ∧ C
 and_assoc_l (A B C : *) (h : and A (and B C)) : and (and A B) C :=
-  and_intro (and A B) C
-    (and_intro A B
-      (and_elim_l A (and B C) h)
-      (and_elim_l B C (and_elim_r A (and B C) h)))
-    (and_elim_r B C (and_elim_r A (and B C) h));
+  let (a := (and_elim_l A (and B C) h))
+      (bc := (and_elim_r A (and B C) h))
+      (b := (and_elim_l B C bc))
+      (c := (and_elim_r B C bc))
+  in
+  and_intro (and A B) C (and_intro A B a b) c
+  end;
 
 -- (A ∧ B) ∧ C => A ∧ (B ∧ C)
 and_assoc_r (A B C : *) (h : and (and A B) C) : and A (and B C) :=
diff --git a/lib/Check.hs b/lib/Check.hs
index f451af8..184817f 100644
--- a/lib/Check.hs
+++ b/lib/Check.hs
@@ -49,6 +49,16 @@ findType g (Abs x a m) = do
 findType g (Pi _ a b) = do
   validateType_ g a
   validateType (incIndices a : map incIndices g) b
+-- this is kinda goofy, it's just like a function, except the resulting type
+-- of the body doesn't need to result in a valid function type
+-- this means things like `let x := * in ...` would be allowed, even though
+-- you couldn't write a function that takes something of type `□` as an argument
+findType g (Let _ v b) = do
+  a <- findType g v
+  validateType_ g a
+  res <- findType (incIndices a : map incIndices g) b
+  -- substitute the bound value (not its type) into the body's type
+  pure $ subst 0 v res
 
 checkType :: Env -> Expr -> Result Expr
 checkType env t = runReaderT (findType [] t) env
diff --git a/lib/Eval.hs b/lib/Eval.hs
index de6935e..babf2a9 100644
--- a/lib/Eval.hs
+++ b/lib/Eval.hs
@@ -47,6 +47,7 @@ subst _ _ Square = Square
 subst k s (App m n) = App (subst k s m) (subst k s n)
 subst k s (Abs x m n) = Abs x (subst k s m) (subst (k + 1) (incIndices s) n)
 subst k s (Pi x m n) = Pi x (subst k s m) (subst (k + 1) (incIndices s) n)
+subst k s (Let x v b) = Let x (subst k s v) (subst (k + 1) (incIndices s) b)
 
 envLookupVal :: Text -> ReaderT Env Result Expr
 envLookupVal n = asks ((_val <$>) . M.lookup n) >>= maybe (throwError $ UnboundVariable n) pure
 
 envLookupTy :: Text -> ReaderT Env Result Expr
 envLookupTy n = asks ((_ty <$>) .
M.lookup n) >>= maybe (throwError $ UnboundVariable n) pure --- reduce until β reducts are impossible in head position +-- reduce until β reducts or let simplifications are impossible in head position whnf :: Expr -> ReaderT Env Result Expr whnf (App (Abs _ _ v) n) = whnf $ subst 0 n v whnf (App m n) = do @@ -63,6 +64,7 @@ whnf (App m n) = do then pure $ App m n else whnf $ App m' n whnf (Free n) = envLookupVal n >>= whnf +whnf (Let _ v b) = whnf $ subst 0 v b whnf e = pure e reduce :: Expr -> ReaderT Env Result Expr @@ -71,6 +73,7 @@ reduce (App m n) = App <$> reduce m <*> reduce n reduce (Abs x t v) = Abs x <$> reduce t <*> reduce v reduce (Pi x t v) = Pi x <$> reduce t <*> reduce v reduce (Free n) = envLookupVal n +reduce (Let _ v b) = pure $ subst 0 v b reduce e = pure e normalize :: Expr -> ReaderT Env Result Expr diff --git a/lib/Expr.hs b/lib/Expr.hs index c006b27..924751c 100644 --- a/lib/Expr.hs +++ b/lib/Expr.hs @@ -9,6 +9,7 @@ data Expr where App :: Expr -> Expr -> Expr Abs :: Text -> Expr -> Expr -> Expr Pi :: Text -> Expr -> Expr -> Expr + Let :: Text -> Expr -> Expr -> Expr deriving (Show, Ord) instance Eq Expr where @@ -20,6 +21,7 @@ instance Eq Expr where (App e1 e2) == (App f1 f2) = e1 == f1 && e2 == f2 (Abs _ t1 b1) == (Abs _ t2 b2) = t1 == t2 && b1 == b2 (Pi _ t1 b1) == (Pi _ t2 b2) = t1 == t2 && b1 == b2 + (Let _ v1 b1) == (Let _ v2 b2) = v1 == v2 && b1 == b2 _ == _ = False occursFree :: Integer -> Expr -> Bool @@ -31,6 +33,7 @@ occursFree _ Square = False occursFree n (App a b) = on (||) (occursFree n) a b occursFree n (Abs _ a b) = occursFree n a || occursFree (n + 1) b occursFree n (Pi _ a b) = occursFree n a || occursFree (n + 1) b +occursFree n (Let _ v b) = occursFree n v || occursFree (n + 1) b shiftIndices :: Integer -> Integer -> Expr -> Expr shiftIndices d c (Var x k) @@ -43,6 +46,7 @@ shiftIndices _ _ Square = Square shiftIndices d c (App m n) = App (shiftIndices d c m) (shiftIndices d c n) shiftIndices d c (Abs x m n) = Abs x (shiftIndices d c m) (shiftIndices d (c + 1) n) shiftIndices d c (Pi x m n) = Pi x (shiftIndices d c m) (shiftIndices d (c + 1) n) +shiftIndices d c (Let x v b) = Let x (shiftIndices d c v) (shiftIndices d (c + 1) b) incIndices :: Expr -> Expr incIndices = shiftIndices 1 0 @@ -58,13 +62,11 @@ collectLambdas (Abs x ty body) = ((x, ty) : params, final) (params, final) = collectLambdas body collectLambdas e = ([], e) -cleanNames :: Expr -> Expr -cleanNames (App m n) = App (cleanNames m) (cleanNames n) -cleanNames (Abs x ty body) = Abs x (cleanNames ty) (cleanNames body) -cleanNames (Pi x ty body) - | occursFree 0 body = Pi x (cleanNames ty) (cleanNames body) - | otherwise = Pi "" ty (cleanNames body) -cleanNames e = e +collectLets :: Expr -> ([(Text, Expr)], Expr) +collectLets (Let x val body) = ((x, val) : params, final) + where + (params, final) = collectLets body +collectLets e = ([], e) collectPis :: Expr -> ([(Text, Expr)], Expr) collectPis p@(Pi "" _ _) = ([], p) @@ -73,6 +75,14 @@ collectPis (Pi x ty body) = ((x, ty) : params, final) (params, final) = collectPis body collectPis e = ([], e) +cleanNames :: Expr -> Expr +cleanNames (App m n) = App (cleanNames m) (cleanNames n) +cleanNames (Abs x ty body) = Abs x (cleanNames ty) (cleanNames body) +cleanNames (Pi x ty body) + | occursFree 0 body = Pi x (cleanNames ty) (cleanNames body) + | otherwise = Pi "" ty (cleanNames body) +cleanNames e = e + groupParams :: [(Text, Expr)] -> [([Text], Expr)] groupParams = foldr addParam [] where @@ -85,6 +95,9 @@ groupParams = foldr addParam [] 
showParamGroup :: ([Text], Expr) -> Text showParamGroup (ids, ty) = parenthesize $ unwords ids <> " : " <> pretty ty +showBinding :: (Text, Expr) -> Text +showBinding (ident, val) = parenthesize $ ident <> " := " <> pretty val + helper :: Integer -> Expr -> Text helper _ (Var s _) = s helper _ (Free s) = s @@ -107,6 +120,10 @@ helper k e@(Abs{}) = if k >= 1 then parenthesize res else res (params, body) = collectLambdas e grouped = showParamGroup <$> groupParams params res = "λ " <> unwords grouped <> " . " <> pretty body +helper _ e@(Let{}) = res + where + (binds, body) = collectLets e + res = "let " <> unwords (map showBinding binds) <> " in " <> pretty body <> " end" pretty :: Expr -> Text pretty = helper 0 . cleanNames diff --git a/lib/Parser.hs b/lib/Parser.hs index 04709bf..31f6fed 100644 --- a/lib/Parser.hs +++ b/lib/Parser.hs @@ -49,11 +49,18 @@ lexeme = L.lexeme skipSpace eat :: Text -> Parser () eat = void . lexeme . chunk +keywords :: [Text] +keywords = ["forall", "let", "in", "end", "fun"] + pIdentifier :: Parser Text -pIdentifier = label "identifier" $ lexeme $ do +pIdentifier = try $ label "identifier" $ lexeme $ do firstChar <- letterChar <|> char '_' rest <- many $ alphaNumChar <|> char '_' - return $ T.pack (firstChar : rest) + let ident = T.pack (firstChar : rest) + when (ident `elem` keywords) $ + fail $ + "Reserved word: " ++ T.unpack ident + pure ident pVar :: Parser Expr pVar = label "variable" $ lexeme $ do @@ -101,6 +108,25 @@ pPAbs = lexeme $ label "Π-abstraction" $ withBinders $ do body <- pExpr pure $ foldr (uncurry Pi) body params +pBinding :: Parser (Text, Expr) +pBinding = lexeme $ label "binding" $ do + eat "(" + ident <- pIdentifier + eat ":=" + value <- pExpr + eat ")" + modify $ over bindsL (ident :) + pure (ident, value) + +pLet :: Parser Expr +pLet = lexeme $ label "let expression" $ withBinders $ do + eat "let" + bindings <- some pBinding + eat "in" + body <- try pExpr + eat "end" + pure $ foldr (uncurry Let) body bindings + pArrow :: Parser Expr pArrow = lexeme $ label "->" $ do a <- pAppTerm @@ -167,7 +193,14 @@ pTerm = ] pAppTerm :: Parser Expr -pAppTerm = lexeme $ pLAbs <|> pPAbs <|> pApp +pAppTerm = + lexeme $ + choice + [ pLAbs + , pPAbs + , pLet + , pApp + ] pExpr :: Parser Expr pExpr = lexeme $ try pArrow <|> pAppTerm
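
As a small illustration of the `let` typing rule described in the README changes above, here is a hedged sketch in the style of examples/logic.pg; the definition name is hypothetical and is not part of this diff. The binding `s := *` is accepted because the bound value is substituted directly (as the README describes) rather than passed as a function argument, whereas the desugared form `(fun (s : □) => x) *` would need a binder annotated with `□`, which cannot be written.

    -- hypothetical sketch, not taken from examples/logic.pg
    -- `s` is bound to `*` itself, which no fun-binder could accept, since the
    -- binder's annotation would have to be the sort `□`
    let_star_example (A : *) (x : A) : A :=
      let (s := *) in x end;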