Significantly refactor and simplify flattening transformation, unify handling of builtins and functional predicates (#23)

## Background

### Functional predicates

I originally introduced **functional predicates** with the observation
that it's pretty nice to be able to write

```
size is 3.
color 1 is red.
color 2 is blue.
color 3 is green.
color 4 is orange.

a (color N) :- b N, N < size.
```

where the last rule is a shorthand for 

```
a Color :- b N, size is Size, N < Size, color N is Color.
```

Expanding out this shorthand is done in a step called the _flattening
transformation_.

### Built-in operations

I also originally introduced **built-in operations** in a way that had a
complicated interaction with the term-matching infrastructure, so that,
for example, matching the term `3` against the pattern `succ X` with
`succ` registered as the `NAT_SUCC` operator would result in X being
bound to 2.

This introduced complexity into a pretty core part of the system,
**and** it wasn't clear what should happen if you try to write things
like `succ "Hello world"`.

One correct, well-understood, and uniform way to handle this sort of
operation is by treating the successor operator as an infinite relation
of the form `succ X is Y`, with the following entries:

```
succ 0 is 1.
succ 1 is 2.
succ 2 is 3.
...
```
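Read this way, the built-in only has to answer queries where at least one side is already known. Here is a minimal sketch of that reading. The names `solveSucc`, `Row`, and `isNat` are hypothetical and not taken from the Dusa codebase, and it assumes (this is my assumption, not something the PR states) that a non-numeric argument like `"Hello world"` simply has no entries in the relation:

```typescript
// Illustration only: `succ X is Y` as a relation that is queried with at
// least one side bound. `undefined` stands for an unbound variable.
type Value = number | string;
type Row = { x: number; y: number };

const isNat = (v: Value): v is number =>
  typeof v === 'number' && Number.isInteger(v) && v >= 0;

function solveSucc(x: Value | undefined, y: Value | undefined): Row[] {
  if (x === undefined && y === undefined) {
    // Every natural number has a successor, so there is nothing finite to enumerate.
    throw new Error('succ X is Y needs at least one bound argument');
  }
  if (x !== undefined) {
    // Assumption: non-naturals (such as strings) match no entry at all.
    if (!isNat(x)) return [];
    const row = { x, y: x + 1 };
    return y === undefined || y === row.y ? [row] : [];
  }
  // Only y is bound: zero and non-naturals are nobody's successor.
  if (y === undefined || !isNat(y) || y === 0) return [];
  return [{ x: y - 1, y }];
}

console.log(solveSucc(2, undefined)); // forward: X known
console.log(solveSucc(undefined, 3)); // backward: Y known
console.log(solveSucc('Hello world', undefined)); // no matching entries
```

Both the "forward" and "backward" uses of `succ` fall out of the same table lookup, which is the uniformity the relational view buys.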

This PR unifies the way both functional predicates and built-in
relations get treated by the static checker and the flattening
transformation.

## Built-ins are now uniformly treated as relations

Without this PR, Dusa doesn't allow you to write built-in operations as
relations. The following isn't allowed, for example:

```
#builtin NAT_SUCC succ
a 3.
b Y :- a X, succ X is Y.
```

In Dusa 0.0.14 and below, built-ins can only be referenced in their
functional form, not their relational form, even though the program was
compiling everything into the relational form. This is fixed, and under
this PR the program above is accepted.

## Functional predicates and built-ins are now uniformly translated to relations in the flattening transformation

Functional predicates and built-ins are now turned into additional
predicates by the flattening transformation, a simple recursive process
that mimics instruction selection in a compiler:

```
#demand f X Y, h (plus X Y).
       -->
#demand f X Y, plus X Y is Z, h Z.
```

These effects stack:

```
#demand f X Y, g (plus (plus X X) (plus X Y)).
       -->
#demand f X Y, plus X X is X2, plus X Y is Z, plus X2 Z is G, g G.
```

This transformation is exactly the same whether `plus` is a declared
relation or the builtin `INT_PLUS`.
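To make the recursion concrete, here is a rough sketch of this kind of flattening. It is not the code in this PR: the `Term` and `Premise` shapes, `flattenTerm`, `flattenPremise`, and the `V1`, `V2`, ... fresh-variable naming are all hypothetical, and the sketch only flattens premise arguments.

```typescript
// Hypothetical term and premise shapes for illustration; not Dusa's actual AST.
type Term =
  | { type: 'var'; name: string }
  | { type: 'int'; value: number }
  | { type: 'app'; name: string; args: Term[] };

interface Premise {
  name: string;
  args: Term[];
  value?: Term; // set for premises of the form `name args is value`
}

// Walk a term bottom-up. Each application of a functional name is replaced by
// a fresh variable, and a premise `name args is Fresh` is appended to `out`.
function flattenTerm(term: Term, functional: Set<string>, out: Premise[], fresh: () => string): Term {
  if (term.type !== 'app') return term;
  const args = term.args.map((arg) => flattenTerm(arg, functional, out, fresh));
  if (!functional.has(term.name)) return { type: 'app', name: term.name, args };
  const value: Term = { type: 'var', name: fresh() };
  out.push({ name: term.name, args, value });
  return value;
}

// Flatten the arguments of one premise; the extracted premises come first.
function flattenPremise(premise: Premise, functional: Set<string>, fresh: () => string): Premise[] {
  const out: Premise[] = [];
  const args = premise.args.map((arg) => flattenTerm(arg, functional, out, fresh));
  return out.concat([{ ...premise, args }]);
}

function showTerm(t: Term, nested = false): string {
  if (t.type === 'var') return t.name;
  if (t.type === 'int') return `${t.value}`;
  const text = [t.name].concat(t.args.map((arg) => showTerm(arg, true))).join(' ');
  return nested && t.args.length > 0 ? `(${text})` : text;
}

function showPremise(p: Premise): string {
  const head = [p.name].concat(p.args.map((arg) => showTerm(arg, true))).join(' ');
  return p.value === undefined ? head : `${head} is ${showTerm(p.value)}`;
}

const v = (name: string): Term => ({ type: 'var', name });
const plus = (a: Term, b: Term): Term => ({ type: 'app', name: 'plus', args: [a, b] });

let counter = 0;
const fresh = () => `V${++counter}`;

// The premise `g (plus (plus X X) (plus X Y))` from the example above.
const nested: Premise = { name: 'g', args: [plus(plus(v('X'), v('X')), plus(v('X'), v('Y')))] };
const flattened = flattenPremise(nested, new Set(['plus']), fresh).map(showPremise);
console.log(flattened.join(', '));
// prints: plus X X is V1, plus X Y is V2, plus V1 V2 is V3, g V3
```

Nothing in the sketch asks whether `plus` is user-declared or a built-in; the only input is the set of names that should be treated functionally.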

## Modes for functional predicates are now much more generic, but still have a special case for inequality

Modes for all built-ins, including the syntactically-supported ones like
equality and inequality, are now uniformly handled with a mode system.
This mode system has one quirk beyond the usual "input"/"output" thing,
which is that "wildcard" is allowed as an input mode to handle built-ins
that work like inequality.

Consider the following rule:

```
a Ys :- b Xs, cons 4 Ys != Xs.
```

This rule will give the following error:

> The built-in relation NOT_EQUAL was given 2 arguments, and the
argument in position 1 contains variables not bound by previous
premises. This builtin does not support that mode of operation.

Intuitively, Dusa can't handle this, because there are always going to
be countably infinitely many `Ys` such that `cons 4 Ys != Xs`.

**However** for practical logic programming it's _really really useful_
to be able to say

```
a :- b Xs, cons 4 _ != Xs.
```

which, logically, translates to "if `b Xs`, and not (exists `Ys`. `cons 4 Ys
== Xs`), then `a`."
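One way to see why the wildcard version is computable: checking "no `Ys` exists" is just a failed pattern match against a ground term. The sketch below is illustrative only; `Data`, `WildPattern`, `matches`, and `notEqual` are hypothetical names, not Dusa's internals.

```typescript
// Ground data: integers or constructor applications such as `cons 4 nil`.
type Data = number | { name: string; args: Data[] };

// Patterns whose only non-ground parts are wildcards `_`.
type WildPattern =
  | { type: 'wildcard' }
  | { type: 'int'; value: number }
  | { type: 'app'; name: string; args: WildPattern[] };

// True when some way of filling in the wildcards makes the pattern equal `d`.
function matches(p: WildPattern, d: Data): boolean {
  switch (p.type) {
    case 'wildcard':
      return true;
    case 'int':
      return typeof d === 'number' && d === p.value;
    case 'app': {
      if (typeof d === 'number' || d.name !== p.name || d.args.length !== p.args.length) {
        return false;
      }
      const dataArgs = d.args;
      return p.args.every((arg, i) => matches(arg, dataArgs[i]));
    }
  }
}

// `pattern != data` holds exactly when no filling of the wildcards matches.
const notEqual = (p: WildPattern, d: Data): boolean => !matches(p, d);

const nil: Data = { name: 'nil', args: [] };
const cons = (head: Data, tail: Data): Data => ({ name: 'cons', args: [head, tail] });

// The pattern `cons 4 _`
const cons4: WildPattern = {
  type: 'app',
  name: 'cons',
  args: [{ type: 'int', value: 4 }, { type: 'wildcard' }],
};

console.log(notEqual(cons4, cons(4, nil))); // false: Xs starts with 4
console.log(notEqual(cons4, cons(5, nil))); // true
console.log(notEqual(cons4, nil)); // true
```

The check terminates because it only walks the finite ground term once, which is the sense in which this is computationally reasonable.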

This is admittedly irregular with respect to everything else in the
language! But computationally it's quite reasonable, implementation-wise
it's not particularly complex, the logical meaning is entirely
decidable, and there are many cases where you'd be stuck with a lot of sad
verboseness if this didn't exist.

The "wildcard" mode exists to support this specific use-case: a
"wildcard" moded argument is one that isn't grounded by its premises,
but any non-ground portions are limited to wildcard variables `_`.
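As a sketch of what that classification could look like (hypothetical names throughout, and not the checker added in this PR), an argument can be sorted into "ground", "wildcard", or "unbound" given the set of variables bound by earlier premises:

```typescript
// Hypothetical pattern shape for illustration.
type Pattern =
  | { type: 'var'; name: string }
  | { type: 'wildcard' }
  | { type: 'int'; value: number }
  | { type: 'app'; name: string; args: Pattern[] };

type ArgStatus = 'ground' | 'wildcard' | 'unbound';

// 'ground':   every variable is bound by a previous premise
// 'wildcard': not ground, but the only non-ground parts are `_`
// 'unbound':  mentions a named variable that nothing has bound yet
function classify(p: Pattern, bound: Set<string>): ArgStatus {
  switch (p.type) {
    case 'int':
      return 'ground';
    case 'wildcard':
      return 'wildcard';
    case 'var':
      return bound.has(p.name) ? 'ground' : 'unbound';
    case 'app': {
      let status: ArgStatus = 'ground';
      for (const arg of p.args) {
        const argStatus = classify(arg, bound);
        if (argStatus === 'unbound') return 'unbound';
        if (argStatus === 'wildcard') status = 'wildcard';
      }
      return status;
    }
  }
}

const four: Pattern = { type: 'int', value: 4 };
const consOf = (tail: Pattern): Pattern => ({ type: 'app', name: 'cons', args: [four, tail] });

// After the premise `b Xs`, only Xs is bound.
const boundVars = new Set(['Xs']);

console.log(classify(consOf({ type: 'var', name: 'Ys' }), boundVars)); // 'unbound'
console.log(classify(consOf({ type: 'wildcard' }), boundVars)); // 'wildcard'
console.log(classify({ type: 'var', name: 'Xs' }, boundVars)); // 'ground'
```

Under this reading, `cons 4 Ys != Xs` is rejected because its first argument is "unbound", while `cons 4 _ != Xs` is accepted because that argument is merely "wildcard".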

## Backwards incompatibility

This PR simplifies the existing flattening transformation, but this
means that some previously-accepted programs now must be rejected. In
particular, this PR disables a common idiom _that was used in one of the
default sample programs_ (the graph generation example).

```
#builtin NAT_SUCC s
vertex 6.
vertex N :- vertex (s N).
```

This is no longer supported, and now returns this error:

> Because s is the built-in relation NAT_SUCC, for it to be used like a
function symbol, all the arguments must be grounded by a previous
premise. If you want to use s with a different mode, write it out as a
separate premise, like 's * is *'.

This makes `NAT_SUCC` a lot less useful in general - you can still write
the program, following the advice of the error message, like this:

```
#builtin NAT_SUCC s
vertex 6.
vertex N :- vertex M, s N is M.
```

A better idiom in Rob's opinion, now that this functionality doesn't
exist, is this one:

```
#builtin INT_MINUS minus
vertex 6.
vertex (minus N 1) :- vertex N, N > 0.
```

(Chris disagrees, fwiw, and still thinks the program using NAT_SUCC is
better. In any case we're not losing expressivity, both are possible.)

### Rationale for backwards incompatibility

I believe this backwards incompatibility is acceptable, at least in the
medium term, because it lets us uniformly turn all relations and
premises into *separate premises* with a very uniform translation. The
uniform translation, on the now-rejected program, looks like this:

```
#builtin NAT_SUCC s
vertex 6.
vertex N :- s N is M, vertex M.
```

...and that clearly doesn't work, as successor with no ground inputs has
an infinite number of possible ways of matching (N = 122, M = 123, or N
= 123, M = 124, or N = 124, M = 125, or N = 125, M = 126, and so on).
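The ordering problem can be checked mechanically by walking the premises left to right and tracking which variables are bound. The following is a simplified sketch, not the static checker from this PR; the `Prem` shape and `firstUnsolvable` are hypothetical, and it assumes a successor premise is fine as long as at least one of its two variables is already bound.

```typescript
// Hypothetical, simplified premises: ordinary facts bind all of their
// variables, while `s input is output` needs one side bound beforehand.
type Prem =
  | { kind: 'fact'; name: string; vars: string[] }
  | { kind: 'succ'; input: string; output: string };

// Returns the index of the first successor premise that would have to be
// enumerated with nothing bound, or -1 if the ordering is fine.
function firstUnsolvable(premises: Prem[]): number {
  const bound = new Set<string>();
  for (let i = 0; i < premises.length; i++) {
    const prem = premises[i];
    if (prem.kind === 'fact') {
      prem.vars.forEach((name) => bound.add(name));
    } else {
      if (!bound.has(prem.input) && !bound.has(prem.output)) return i;
      bound.add(prem.input);
      bound.add(prem.output);
    }
  }
  return -1;
}

// vertex N :- s N is M, vertex M.   (the uniform translation)
const uniform: Prem[] = [
  { kind: 'succ', input: 'N', output: 'M' },
  { kind: 'fact', name: 'vertex', vars: ['M'] },
];

// vertex N :- vertex M, s N is M.   (the hand-written reordering)
const reordered: Prem[] = [
  { kind: 'fact', name: 'vertex', vars: ['M'] },
  { kind: 'succ', input: 'N', output: 'M' },
];

console.log(firstUnsolvable(uniform)); // 0
console.log(firstUnsolvable(reordered)); // -1
```

The two rules contain the same premises; only the order decides whether the successor relation is ever asked an unanswerable question.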

### Alternatives to introducing backwards incompatibility

This was previously supported by a significantly more complex and
harder-to-motivate flattening transformation that "notices" that `(s N)`
is not ground in the first argument and instead translates the
problematic program above like this:

```
#builtin NAT_SUCC s
vertex 6.
vertex N :- vertex M, s N is M.
```

I put a fair amount of work into that kind of before-and-after
translation... it was very complex. I think disabling this functionality
is worthwhile in the name of simplicity.
robsimmons authored Jun 13, 2024
1 parent 14ac885 commit 85f7369
Showing 35 changed files with 1,548 additions and 1,583 deletions.
16 changes: 10 additions & 6 deletions docs/src/content/docs/docs/language/builtin.md
@@ -1,19 +1,19 @@
---
title: Built-in functions
title: Built-in relations
---

On their own, built-in numbers and strings act no different than other uninterpreted
constants, but they can be manipualted with special constructors added by `#builtin`
constants, but they can be manipulated with special relations added by `#builtin`
declarations.

A `#builtin` declarations change the lexer to represent certain identifiers as
operations on built-in types. If you write
A `#builtin` declarations connects a certain identifiers to a certain built-in
relation. If you write

#builtin INT_PLUS plus
#builtin NAT_SUCC s

then the identifiers `plus` and `s` will be parsed as a built-in definition instead
of as a regular identifiers until those built-ins are redefined.
then the identifiers `plus` and `s` will be treated, throughout the program, as a
built-in definition instead of as a regular identifier.

- The `NAT_ZERO` builtin takes no arguments and represents the natural number zero.
- The `NAT_SUCC` builtin takes one natural number argument, and adds one to it. If
@@ -23,3 +23,7 @@ of as a regular identifiers until those built-ins are redefined.
- The `INT_MINUS` builtin takes two integer arguments and returns an integer,
subtracting the second from the first.
- The `STRING_CONCAT` builtin takes two or more string arguments and concatenates them.

## How built-in relations work

All built-in relations
39 changes: 27 additions & 12 deletions proto/busa.proto
@@ -40,7 +40,7 @@ message Rule {
}

/*
conclusion <Args> is <Values> :- prefix X0 ... XN
conclusion <Args> :- prefix X0 ... XN
- Premise has arguments X0 ... XN in order
- Conclusion has arbitrary patterns with vars X0...XN
@@ -54,27 +54,39 @@ message Rule {
}

/*
conclusion <Args> is { <Values> } :- prefix X0 ... XN (exhaustive)
conclusion <Args> is { <Values>? } :- prefix X0 ... XN (non-exhaustive)
conclusion <Args> is? { <Values> } :- prefix X0 ... XN
- Premise has arguments X0 ... XN in order
- Conclusion has arbitrary patterns with vars X0...XN
- Conclusion is a fact with one value
*/
message ChoiceConclusion {
message OpenConclusion {
string conclusion = 1;
repeated Pattern args = 2;
repeated Pattern choices = 3;
bool exhaustive = 4;
string prefix = 5;
}

/*
conclusion Y3 X1 Z4 :- prefix X0 X1 Y2 Y3, fact X0 X1 Z2 is Z3 Z4.
conclusion <Args> is { <Values> } :- prefix X0 ... XN
- Prefix and fact premise share first N arguments X0...XN
- prefix can have additional arguments ...YM, M >= N
- fact can have additional arguments in args and values ...ZP, P >= N
- Premise has arguments X0 ... XN in order
- Conclusion has arbitrary patterns with vars X0...XN
- Conclusion is a fact with one value
*/
message ClosedConclusion {
string conclusion = 1;
repeated Pattern args = 2;
repeated Pattern choices = 3;
string prefix = 5;
}

/*
conclusion Fact1 Shared1 Prefix1 FactValue :- prefix Shared0 Shared1 Prefix0 Prefix1, fact Shared0 Shared1 Fact0 Fact1 is FactValue.
- Prefix and fact premise share first N arguments Shared0...SharedN
- prefix can have additional arguments Prefix0...PrefixM
- fact can have additional arguments in args and values Fact0...FactP
- conclusion is another prefix and has no repeat variables
*/
message Join {
@@ -114,9 +126,12 @@ message Rule {
INT_MINUS = 5;
INT_TIMES = 6;
STRING_CONCAT = 7;
EQUAL = 8;
GT = 9;
GEQ = 10;
CHECK_GT = 8;
CHECK_GEQ = 9;
CHECK_LT = 10;
CHECK_LEQ = 11;
EQUAL = 12;
NOT_EQUAL = 13;
}

string conclusion = 1;
31 changes: 20 additions & 11 deletions src/client.ts
@@ -1,4 +1,4 @@
import { Data, TRIV_DATA, getRef, hide } from './datastructures/data.js';
import { Data, TRIVIAL, getRef, hide } from './datastructures/data.js';
import {
ChoiceTree,
ChoiceTreeNode,
@@ -17,6 +17,7 @@ import {
} from './engine/forwardengine.js';
import { check } from './language/check.js';
import { compile } from './language/compile.js';
import { builtinModes } from './language/dusa-builtins.js';
import { parse } from './language/dusa-parser.js';
import { IndexedProgram } from './language/indexize.js';
import { Issue } from './parsing/parser.js';
@@ -136,7 +137,7 @@ function* solutionGenerator(
export class Dusa {
private program: IndexedProgram;
private debug: boolean;
private arities: Map<string, number>;
private arities: Map<string, { args: number; value: boolean }>;
private db: Database;
private stats: Stats;
private cachedSolution: DusaSolution | null = null;
@@ -154,19 +155,19 @@ export class Dusa {
throw new DusaError(parsed.errors);
}

const { errors, arities } = check(parsed.document);
const { errors, arities, builtins } = check(builtinModes, parsed.document);
if (errors.length !== 0) {
throw new DusaError(errors);
}

this.debug = debug;
this.arities = arities;
this.program = compile(parsed.document, debug);
this.program = compile(builtins, arities, parsed.document, debug);
this.db = makeInitialDb(this.program);
this.stats = { cycles: 0, deadEnds: 0 };
}

private checkPredicateForm(pred: string, arity: number) {
private checkPredicateForm(pred: string, arity: { args: number; value: boolean }) {
const expected = this.arities.get(pred);
if (!pred.match(/^[a-z][A-Za-z0-9]*$/)) {
throw new DusaError([
@@ -179,16 +180,24 @@ export class Dusa {
}
if (expected === undefined) {
this.arities.set(pred, arity);
} else if (arity !== expected) {
} else if (arity.args !== expected.args) {
throw new DusaError([
{
type: 'Issue',
msg: `Predicate ${pred} should have ${expected} argument${
expected === 1 ? '' : 's'
expected.args === 1 ? '' : 's'
}, but the asserted fact has ${arity}`,
severity: 'error',
},
]);
} else if (arity.value !== expected.value) {
throw new DusaError([
{
type: 'Issue',
msg: `Predicate ${pred} should ${expected.value ? '' : 'not '}have a value, but the asserted fact ${arity.value ? 'has' : 'does not have'} one.`,
severity: 'error',
},
]);
}
}

@@ -200,11 +209,11 @@ export class Dusa {
this.cachedSolution = null;
this.db = { ...this.db };
for (const { name, args, value } of facts) {
this.checkPredicateForm(name, args.length);
this.checkPredicateForm(name, { args: args?.length ?? 0, value: value !== undefined });
insertFact(
name,
args.map(termToData),
value === undefined ? TRIV_DATA : termToData(value),
args?.map(termToData) ?? [],
value === undefined ? TRIVIAL : termToData(value),
this.db,
);
}
@@ -221,7 +230,7 @@ export class Dusa {
this.db = { ...this.db };

if (pred !== undefined) {
this.checkPredicateForm(pred, 2);
this.checkPredicateForm(pred, { args: 2, value: true });
}
const usedPred = pred ?? '->';
const triples: [Data, Data, Data][] = [];
4 changes: 2 additions & 2 deletions src/datastructures/data.test.ts
@@ -3,7 +3,7 @@ import { DataView, dataToString, expose, hide } from './data.js';

test('Internalizing basic types', () => {
const testData: DataView[] = [
{ type: 'triv' },
{ type: 'trivial' },
{ type: 'int', value: 123n },
{ type: 'int', value: 0n },
{ type: 'string', value: 'abc' },
@@ -23,7 +23,7 @@ test('Internalizing basic types', () => {
}
}

expect(hide({ type: 'triv' })).toEqual(hide({ type: 'triv' }));
expect(hide({ type: 'trivial' })).toEqual(hide({ type: 'trivial' }));
});

test('Internalizing fibonacci-shaped structured types', () => {
25 changes: 13 additions & 12 deletions src/datastructures/data.ts
@@ -1,7 +1,7 @@
export type Data = ViewsIndex | bigint;

export type DataView =
| { type: 'triv' }
| { type: 'trivial' }
| { type: 'int'; value: bigint }
| { type: 'bool'; value: boolean }
| { type: 'string'; value: string }
@@ -11,7 +11,7 @@ export type DataView =
type ViewsIndex = number;
let nextRef: number = -1;
let views: DataView[] = [
{ type: 'triv' },
{ type: 'trivial' },
{ type: 'bool', value: true },
{ type: 'bool', value: false },
];
@@ -20,12 +20,12 @@ let structures: { [name: string]: DataTrie } = {};

export function DANGER_RESET_DATA() {
nextRef = -1;
views = [{ type: 'triv' }, { type: 'bool', value: true }, { type: 'bool', value: false }];
views = [{ type: 'trivial' }, { type: 'bool', value: true }, { type: 'bool', value: false }];
strings = {};
structures = {};
}

export const TRIV_DATA = 0;
export const TRIVIAL = 0;
export const BOOL_TRUE = 1;
export const BOOL_FALSE = 2;

@@ -79,7 +79,7 @@ function setStructureIndex(name: string, args: Data[], value: ViewsIndex) {

export function hide(d: DataView): Data {
switch (d.type) {
case 'triv':
case 'trivial':
return 0;
case 'int':
return d.value;
@@ -113,26 +113,27 @@ export function compareData(a: Data, b: Data): number {
const x = expose(a);
const y = expose(b);
switch (x.type) {
case 'triv':
if (y.type === 'triv') return 0;
case 'trivial':
if (y.type === 'trivial') return 0;
return -1;
case 'int':
if (y.type === 'triv') return 1;
if (y.type === 'trivial') return 1;
if (y.type === 'int') {
const c = x.value - y.value;
return c > 0n ? 1 : c < 0n ? -1 : 0;
}
return -1;
case 'bool':
if (y.type === 'triv' || y.type === 'int') return 1;
if (y.type === 'trivial' || y.type === 'int') return 1;
if (y.type === 'bool') return (x.value ? 1 : 0) - (y.value ? 1 : 0);
return -1;
case 'ref':
if (y.type === 'triv' || y.type === 'int' || y.type === 'bool') return 1;
if (y.type === 'trivial' || y.type === 'int' || y.type === 'bool') return 1;
if (y.type === 'ref') return x.value - y.value;
return -1;
case 'string':
if (y.type === 'triv' || y.type === 'int' || y.type === 'bool' || y.type === 'ref') return 1;
if (y.type === 'trivial' || y.type === 'int' || y.type === 'bool' || y.type === 'ref')
return 1;
if (y.type === 'string') return x.value > y.value ? 1 : x.value < y.value ? -1 : 0;
return -1;
case 'const':
@@ -181,7 +182,7 @@ export function escapeString(input: string): string {
export function dataToString(d: Data, needsParens = true): string {
const view = expose(d);
switch (view.type) {
case 'triv':
case 'trivial':
return `()`;
case 'int':
return `${view.value}`;