Skip to content

Latest commit

 

History

History
169 lines (117 loc) · 5.89 KB

README.md

File metadata and controls

169 lines (117 loc) · 5.89 KB

parents

a bytecode vm and compiler for a minimal lisp

Why

to practice typescript: "paren" + "ts" === "parents"

Features

The minimal set needed for implementing cons lists (see examples/list.lisp) and the Church encoding (see examples/church.lisp) in user space using lambda caclulus. Plus console output and assertions.

  • simple types: integers, booleans (#t and #f), and nil
  • first-class functions with lexical scope and closures
  • special forms: if, lambda, define, seq, let
  • built-ins: +, -, *, <, =, nil, isnil, display, assert

For example, here's fibonacci:

(define (fib n)
    (if (< n 2)
        n
        (+ (rec (- n 1)) (rec (- n 2)))))

See the examples/ directory for more.

Status

Essentially complete. All the example programs compile and run correctly. There are closures and recursion and cons lists in user space and garbage collection.

Future work:

  • more unit tests
  • string support
  • first-class cons list support
  • better command-line interface
  • better input/output support
  • implement some advent of code solutions as examples
  • repl
  • remote attach
  • additional refactoring as I learn more TypeScript

Usage

To install globally:

npm install -g @dhconnelly/parents

For development, you can check out the code, build, run the tests, and link your local build:

git clone [email protected]:dhconnelly/parents.git
cd parents
npm install
npm test
npm link

Or, to install nothing, you can use npx @dhconnelly/parents in place of the parents command.

To compile some programs:

parents compile [file ...]

This will write file.bytecode in the same directory as each file passed as an argument. To run the generated bytecode:

parents vm [bytecode_file ...]

For example, to build and run all the examples from the examples/ directory:

parents compile examples/*.lisp
parents vm examples/*.lisp.bytecode

To disassemble the bytecode and see the VM instructions:

parents disasm [bytecode_file ...]

There's also a tree-walking interpreter, in case you don't want to compile:

parents run [file ...]

Implementation Details

There are two implementations:

  • a tree-walking interpreter that relies on the JavaScript VM to handle garbage collection (in src/interpreter)

  • a compiler (in src/bytecode/compiler) that targets a stack-based virtual machine (see src/bytecode/vm) which uses a naive mark-and-sweep garbage collection scheme (in src/bytecode/vm/heap.ts).

Both implementations use a hand-written recursive-descent parser (see src/lexer.ts and src/parser.ts).

Both seq and let are implemented by desugaring to immediately-invoked lambda expressions. For simplicity, cons lists are not provided and can be implemented in user space using lambdas; see examples/list.lisp for an example.

Virtual Machine

Contains a stack, heap, and globals table. For simplicity, garbage collection (mark and sweep, with the stack and globals forming the root set) is potentially triggered only at instruction step time.

The supported instruction set is as follows:

  • Push <type> <value>: Push a literal value onto the stack

  • Pop: Pop the topmost value off the stack and discard it

  • Get <index>: Get a value from the stack and push it on top

  • DefGlobal: Pop the topmost stack element and add it as a global

  • GetGlobal <index>: Get the specified global and push it on the stack

  • Jmp <pc>: Jump to the given byte offset in the program and continue

  • JmpIf <pc>: Pop the topmost stack value and jump if it is true

  • MakeLambda <pc> <arity> <captures>: Allocate a lambda object representing a function that begins at byte offset <pc> with <arity> parameters onto the heap, with <captures> values popped off the stack and copied into the lambda. Pushes a pointer to the lambda on top of the stack.

  • Call <arity>: Pops a lambda pointer (as pushed by MakeLambda) from the stack and invokes it with <arity> values. Assumes the argument values are on the stack just below the lambda pointer. Pushes a pointer to the invoked lambda (to support recursive calls) and all the captures, then jumps to the byte offset <pc> in the lambda. The stack will therefore contain, once execution continues, [...args, fn_ptr, ...captures] and the top of the stack will point just after the captures. Saves the return address (one instr past the offset at which Call was executed).

  • Return: Pops and saves the top of the stack (the return value), pops all of the captures and recursive lambda pointer and arguments from the stack, pushes the return value back on top, updates the <pc> to the return address saved by Call, and continues execution.

For more details, see src/instr.ts (for the instruction definitions and serialization/deserialization to/from bytecode) and src/vm.ts for more implementation details. Runtime types and values support is found in src/values.ts (for types common to both the vm and the interpreter) and src/vm/values.ts.

Questions

Is this a good idea?

No. In particular, since the VM and garbage collector are implemented in TypeScript and run on Node.js, the V8 garbage collector can interrupt our own garbage collector to collect garbage. Also, since this is essentially JavaScript, a stray reference to a heap object can prevent cleanup.

Why is the code style all over the place?

This is my first TypeScript project. I was playing with the features and patterns. Exceptions and monadic error handling are intermingled, and in some places types vs. interfaces vs. classes are not used consistently or coherently. Sometimes there's more imperative looping, sometimes more forEach/map/etc. Not quite sure how to consistently structure discriminated unions. And so on.

Who

Daniel Connelly [email protected] (https://dhconnelly.com)

License

MIT License. Copyright (c) 2021 Daniel Connelly