Add Syntax

2024-12-12 20:21:20 -08:00 · 2024-12-12 20:21:20 -08:00 · 26a0b16371
commit 26a0b16371
parent 0bbab14f17
1 changed files with 114 additions and 0 deletions
--- a/Syntax.md
+++ b/Syntax.md
@ -0,0 +1,114 @@
 The syntax of perga is fairly straightforward, and should be familiar and intuitive to people familiar with the syntax of proof assistants and/or functional programming languages. On this wiki page, we will discuss the syntax of perga terms and top-level constructions. First, however, because it doesn't fit anywhere else, line comments are done via `--` like Haskell, and block comments are `[*` and `*]`, sort of like ML.
 # Terms
 There are six basic constructs in perga terms: variables, sorts, function application, lambda abstractions, pi abstractions, and let bindings.
 ## Variables
 Like in most programming languages, perga variables are indicated via identifiers. Perga identifiers are alphanumeric, and should feel familiar to anyone familiar with programming languages. Under the hood, they are locally nameless, and implemented with [de Bruijn indices](https://en.wikipedia.org/w/index.php?title=De_Bruijn_index), but this is immaterial to actually using perga. It is possible during a substitution for internally different variables with externally identical names to appear. When printing such a term, perga will rename one of the variables, to make it clear which is which.
 ## Sorts
 There are very few constructs built-in to perga. The most important one is `★`, the type of types (propositions). Most terms begin by abstracting over a type of type `★`. (Sidenote: this unicode character can be entered in vim using `<C-k>*2`or in emacs by entering the TeX or Agda input mode and typing `\star`). It is *not* the case that `★ : ★`, as that leads to paradoxes like [Girard's paradox](https://en.wikipedia.org/w/index.php?title=System_U#Girard's_paradox). Instead, `★ : □` (`<C-k>OS` in vim and `\square` in emacs), `□ : □1`, `□1 : □2`, etc. The type `★` differs from the others in that it is *impredicative*, while the others are *predicative*. `★` suffices the vast majority of the time.
 ## Function application
 Function application is indicated via juxtaposition as is standard in functional programming languages. So if `e1` and `e2` are perga terms, so is `e1 e2`. However, things can be complicated a bit by infix operators. Perga has a system for declaring custom infix operators with custom precedence and associativity that is very similar to Haskell's (more on that when we get to definitions). So if `+` is declared to be a binary infix operator with precedence lower than `*`, `1 + 2 * 3` will correctly parse as `1 + (2 * 3)`.  This is completely equivalent to `(+) 1 ((*) 2 3)`, as wrapping a binary operator in parentheses treats it like a normal function. Operators are syntactically distinguished from regular functions in that, like Haskell, they must consist entirely of symbols, while regular functions must have alphanumeric names.
 ## Lambda abstraction
 Lambda abstractions have a few different syntaxes in perga. The most common syntax is shown below.
 ```
 l_abs ::= fun <param_block>+ => <term>
 param_block ::= ( <ident>+ : <type> )
 ```
 Functions are indicated with the `fun` keyword (alternatively you can use `λ`), and followed by parameters which are named and given a type in a construct I call a *parameter block*. Parameter blocks consist of identifiers followed by a colon and a type all wrapped in parentheses. So some examples are `(A : ★)`, or `(x y : A)`, etc. This should feel familiar if you are familiar with Coq. The parameter blocks are separated from the body of the function by the `=>` keyword (alternatively given by the Unicode character `⇒`). Putting all this together, we can define the identity function by
 ```
 fun (A : ★) (x : A) => x
 ```
 In some context where `nat : ★` and `one : nat`, we could use this like shown below
 ```
 (fun (A : ★) (x : A) => x) nat one
 ```
 which would, of course, normalize to `one`. Unfortunately, implicit arguments are not supported yet, though that is absolutely planned (see issue #3).
 This syntax is very clear and easy to read, however it is also quite heavy. Functions appear very frequently in perga, so a lightweight alternative function syntax is also supported. Rather than `fun (A : ★) (x : A) => x`, you can write `[A : ★][x : A] x`. This is inspired by the Automath function syntax, but should also be familiar to those familiar with Rust, where anonymous functions, called closures by Rustaceans, are written `|x : i32| x` for example. Like the heavier function syntax, but unlike Automath or Rust, parameters of the same type can be grouped together, so `[A : ★][x y : A] y` is allowed. I recommend defaulting to the heavier syntax, as it is typically easier to read and follow, and only switching to the lighter syntax if the heavier syntax gets too verbose (e.g. when dealing with many awkwardly nested functions).
 ## Pi abstractions
 Perga is dependently typed. This means that the return type of a function can depend on the specific value of the input. As such, we need a syntax for indicating the type of a function whose output type depends on the input value. If the output type does *not* depend on the input value, it can be indicated with an arrow in the usual way, as, for example, `A -> B` or `A → B`.
 The type of the function `fun (x : A) => x` is simply `A -> A`. However, the type of  the full identity function `fun (A : ★) (x : A) => x` cannot be represented with simply an arrow, as the type of the inner function depends on the value of `A`, not just on its type `★`. The type of the identity function can be represented in perga syntax in a few ways, shown below.
 ```
 ∏ (A : ★) (x : A), A
 ∏ (A : ★), A -> A
 forall (A : ★), A -> A
 ∀ (A : ★), A -> A
 ```
 The type of a dependent function is introduced with the `forall` keyword (or `∏` or `∀` if you prefer the Unicode alternative), followed by a parameter block, a comma, and the body. You can think of this as being like a lambda abstraction, but rather than creating a term that depends on some input, it creates a type that depends on some input.
 ## Let bindings
 Finally, to wrap up the syntax of terms, we have let-bindings. The syntax for these is rather simple, and shown below.
 ```
 let ( <ident> <param_block>* := <term> )+ in <term> end
 ```
 A more concrete example is shown below.
 ```
 let (x := three) (y := four) in x + y end
 ```
 The purpose of the parameter blocks in the syntax is to allow easy definition of functions in let blocks. Compare the definition of `foo` with the definition of `bar` below.
 ```
 let (foo (x : A) := x)
      (bar := fun (x : A) => x)
 in ...
 end
 ```
 Furthermore, the type of the binding can optionally be given, whereupon perga will check that the type it infers matches the given type, so, provided `four : nat` and `nat` and `A` are different types,
 ```
 let (x : A := four) in ... end
 ```
 will not type check. Providing these types in let-bindings is never necessary, and usually just clutter, as let-bindings are primarily used for abbreviations for otherwise verbose repeated terms. Substantial definitions should be done through the definition system.
 # Top-level Constructs
 Perga allows different constructs at the top level and at the term level (somewhat analogous to the difference between Gallina and the Vernacular in Coq). There are currently five top level constructs in perga: definitions, axioms, fixity declarations, sections, and variables. It is not technically a top-level construct, but include statements are similar and will be discussed in this section.
 ## Definitions
 The syntax for definitions should be understandable by now.
 ```
 def <ident> <param_block>+ := <term> ;
 ```
 Like let-bindings, you can optionally indicate the "return" type after the parameter blocks. Unlike with let-bindings, I strongly encourage indicating types on top-level definitions, as it is good documentation, lets perga check your work, and leads to better error messages. Here are a few example definitions.
 ```
 def identity (A : ★) (x : A) : A := x;
 def const (A B : ★) (x : A) (y : B) : A := x;
 def church_nat : ★ := forall (A : ★), (A -> A) -> A -> A;
 def one : church_nat := fun (A : ★) (f : A -> A) (x : A) => f x;
 ```
 Declaring `one` to be of type `church_nat` is particularly good here, as it leads to perga remembering the type of `one` to be `church_nat`, rather than `forall (A : ★), (A -> A) -> A -> A`, which is the type it would automatically infer it to be, leading to much better error messages down the line.
 ## Axioms
 There are a lot of wonderful things you can construct in perga, but not everything. As such, like in mathematics more generally, it can be very useful to take certain propositions as axioms. In the case of perga, this amounts to a definition missing a body (and requiring a type annotation) that perga simply trusts exists. The syntax is just like definitions, but with the keyword `axiom` instead of `def`, the type annotation is mandatory, and there is no body. Some examples are shown below.
 ```
 axiom nat : ★;
 axiom zero : nat;
 axiom suc (n : nat) : nat;
 ```
 Note that that last example is equivalent to `axiom suc : nat -> nat;` or even `axiom suc : forall (n : nat), nat;`.
 ## Fixity Declarations
 You can define custom infix operators. Here's an example.
 ```
 def ∧ (A B : ★) : ★ := forall (C : ★), (A -> B -> C) -> C;
 infixl 10 ∧;
 ```
 You can make definitions a symbol as the definiendum as long as it is followed by a fixity declaration before its first use. A fixity declaration is given with `infixl` for left-associative operators and `infixr` for right-associative operators, followed by a natural number indicating the precedence, and the operator whose fixity is being defined. After a fixity declaration, the operator may be used freely as an infix operator in expressions.
 ## Sections and Variables
 Perga has a section mechanism comparable to Coq. It is easiest to see with an example.
 ```
 section Test
    variable (A B : ★);
    def identity (x : A) : A := x;
    def const (x : A) (y : B) : A := identity x;
 end Test
 ```
 Inside a section, which is delimited by `section <ident>` and `end <ident>` (the identifiers must match), variables can be declared with `variable <param_block>+ ;` or `hypothesis <param_block>+ ;`. Then, for any definition made within this section, if a variable is used, it is automatically added as a parameter to the definition. As such, the type of `identity` is really `forall (A : ★), A -> A`. Note that since `B` was not used in the definition of `identity`, it is not added as a parameter. Furthermore, notice that `A` was not given to `identity` in the body of `const`. Within a section, section variables are automatically added as arguments to terms where necessary. Additionally, sections can nest, and behave as you would expect them to, which can be useful for managing the scope of section variables if used frequently. Finally, the entire top-level is wrapped in an anonymous, implicit section, but I would not recommend using `variable` declarations at the top-level, instead preferring to use them within a section, to make the scope clearer.
 ## Include statements
 An ML-like module system is planned for perga, and with it proper scope management and a better thought-out system for multi-file projects, but for now a hacky, C-style preprocessor include directive is present to facilitate basic multi-file projects. Include statements function like in C, where they simply paste the contents of the referenced file. They are a touch smarter than that, detecting if a file will  be included multiple times and only including it once preventing the need for include guards, but not by much. The syntax is simple, just `@include <filepath>`. An example would be `@include logic.pg`. No need for quotes or angle brackets. The file path is relative to the current file.