Defining mnemonics: #ruledef, #subruledef
You should use a #ruledef block to define mnemonics and their binary encodings.
You can have as many #ruledef blocks as you need, so you can easily split up your declarations.
You can combine any number of letters, words, and punctuation for a given mnemonic. Mnemonics are also case-insensitive.
Following the mnemonic, separated by a heavy arrow =>, you must indicate the instruction's binary encoding.
For example, these are all valid patterns:
#ruledef
{
    nop => 0xff
    mov a, #b => 0x35
    sub x, [hl] => 0b11010001
    add.gt r0, r3, r4, LSL #6 => 0x46
}
For the binary encoding, the way you express numeric values matters. Their size is derived from the number of digits given.
So, for example, 0x0 is four bits long, since it's a single hexadecimal digit, and 0x001 is 12 bits long, which is to say: leading zeroes do matter!
If you want to use decimal values or explicitly express the size of a value, you can use the slice `X operator, as in 255`8.
The slice operator accepts any size, such as `3 or `19.
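As a rough illustration of the sizing rules above, the following Python sketch models how a literal's size follows from its digit count and what a slice keeps. It is only a model, not customasm's implementation: the helper names literal_bits and slice_low are hypothetical, and only the 0x and 0b prefixes mentioned here are handled.

```python
# Hypothetical model of the sizing rules; not customasm's own code.
BITS_PER_DIGIT = {"0x": 4, "0b": 1}  # only the prefixes discussed here


def literal_bits(text):
    """Return the size in bits implied by a hex or binary literal's digits."""
    prefix, digits = text[:2], text[2:]
    return BITS_PER_DIGIT[prefix] * len(digits)


def slice_low(value, size):
    """Model of the slice operator: keep only the lowest `size` bits."""
    return value & ((1 << size) - 1)


print(literal_bits("0x0"))    # 4
print(literal_bits("0x001"))  # 12
print(slice_low(255, 8))      # 255, now explicitly 8 bits wide
```

Counting digits rather than inspecting the value is what makes 0x001 wider than 0x1, even though both denote the same number.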
You can also use underscores _ to help with readability.
#ruledef
{
    ; this instruction outputs 8 bits
    mov a, b => 0x35
    ; these instructions output 16 bits each
    add a, b => 0x68_34
    sub a, b => 0x00_02
}
You can also split up these values by using the concatenation operator @:
#ruledef
{
    ; this instruction outputs 8 bits
    mov a, b => 0b101 @ 0b11 @ 0b001
    ; this instruction outputs 16 bits
    add a, b => 0x08 @ 0x3 @ 0b1001
}
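To check the sizes claimed in the comments above, concatenation can be modeled in Python as shifting each field in behind the previous one while adding up the widths. This is a sketch under the assumption that @ places its left operand in the more significant bits; the concat helper is hypothetical and is not part of customasm.

```python
# Hypothetical model of the @ operator; not customasm's own code.
def concat(*fields):
    """Join (value, size_in_bits) pairs, most significant field first.

    Returns the combined (value, size_in_bits).
    """
    value, size = 0, 0
    for field_value, field_size in fields:
        value = (value << field_size) | field_value
        size += field_size
    return value, size


# mov a, b => 0b101 @ 0b11 @ 0b001
print(concat((0b101, 3), (0b11, 2), (0b001, 3)))  # (185, 8), i.e. 0xb9

# add a, b => 0x08 @ 0x3 @ 0b1001
print(concat((0x08, 8), (0x3, 4), (0b1001, 4)))   # (2105, 16), i.e. 0x0839
```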
So far, we've only defined fixed mnemonics. If you want to receive numerical arguments (e.g. for a "load immediate" instruction),
you can add as many parameters as you want with {}:
#ruledef
{
    load a, {value} => 0x55 @ value`8
}
This will allow the instruction to receive any kind of expression at the spot marked with braces {}.
The received value can then be referenced by name on the binary encoding side.
It's recommended that the parameters of an instruction be separated by an unambiguous token, like
a comma, especially when there are multiple {} expression parameters in sequence. The expression parser
is greedy, and you might run into problems if it can't distinguish between [two different arguments] and
[one argument with two expression terms]. For example, in load {a} {b}, the parser will recognize
load 4 -7 as an invocation with a single expression argument 4 - 7 and will fail to match.
Note that, when the argument was used on the binary encoding side, it had to be given a slice `8
to truncate and explicitly indicate its size.
This is because we aren't constraining what type of arguments we can receive, so the assembler
will accept any size of value that's passed in, but it still needs the binary encoding to
have an explicit size indicated. To avoid this, you can use typed parameters, as seen below.
The slice operator takes the lowest N bits of the value and discards the rest. So here, the argument is truncated to 8 bits, and the instruction outputs 16 bits as a whole.
You can invoke this instruction with simple numerical values, or more complex calculations, like so:
; using the above #ruledef
load a, 0x33
load a, 2 + 3 * 4
load a, (0x100 - 5) * 8
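To see what the slice does to each of these invocations, here is a small Python model of the rule load a, {value} => 0x55 @ value`8. It is a sketch, not customasm's implementation, and the function name encode_load_a is hypothetical.

```python
# Hypothetical model of: load a, {value} => 0x55 @ value`8
def encode_load_a(value):
    """Return the 16-bit encoding: opcode 0x55, then the low 8 bits of value."""
    return (0x55 << 8) | (value & 0xFF)


print(hex(encode_load_a(0x33)))             # 0x5533
print(hex(encode_load_a(2 + 3 * 4)))        # 0x550e
print(hex(encode_load_a((0x100 - 5) * 8)))  # 0x55d8
```

The last argument evaluates to 0x7d8, which does not fit in 8 bits, so only its low byte 0xd8 survives the slice.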
You may also "glue" your parameter slot to a fixed token on the left, by not placing
any whitespace between them, which allows you to easily accept things like the
ARM registers (r1, r2, r3, and so on).
#ruledef
{
    load r{reg_num}, {value} => 0x5 @ reg_num`4 @ value`8
}
; then you can use instructions like:
load r1, 0x12
load r2, 0x40 * 2
; you may even use more complex expressions,
; although the syntax might start to get confusing:
load r0xc, 0x40 * 2 ; same as if it used `r12`
load r3 + 3, 0x40 * 2 ; same as if it used `r6`
load r(4 + 4), 0x40 * 2 ; same as if it used `r8`
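The encodings these invocations produce can be worked out with a short Python model of the rule load r{reg_num}, {value} => 0x5 @ reg_num`4 @ value`8. This is an illustrative sketch only; encode_load_reg is a hypothetical name, not something customasm provides.

```python
# Hypothetical model of: load r{reg_num}, {value} => 0x5 @ reg_num`4 @ value`8
def encode_load_reg(reg_num, value):
    """Return the 16-bit encoding: 4-bit opcode, 4-bit register, 8-bit value."""
    return (0x5 << 12) | ((reg_num & 0xF) << 8) | (value & 0xFF)


print(hex(encode_load_reg(1, 0x12)))          # 0x5112  (load r1, 0x12)
print(hex(encode_load_reg(2, 0x40 * 2)))      # 0x5280  (load r2, 0x40 * 2)
print(hex(encode_load_reg(3 + 3, 0x40 * 2)))  # 0x5680  (load r3 + 3, 0x40 * 2)
```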
You can give types to parameters, in order to automatically constrain their sizes.
#ruledef
{
    load.b a, {value: u8} => 0x55 @ value ; outputs 16 bits
    load.w a, {value: s16} => 0x66 @ value ; outputs 24 bits
    load.d a, {value: i32} => 0x77 @ value ; outputs 40 bits
}
A typed parameter automatically slices the received argument, truncating
the values to the given sizes, so you don't have to do it yourself.
It will also throw an error if you supply a value that's outside of its valid range.
The following are the valid types, and you can replace XX with any number:

Type | Description | Example with 8 bits |
---|---|---|
uXX | Unsigned values | u8 will accept values from 0x00 to 0xff |
sXX | Signed values | s8 will accept values from -0x80 to 0x7f |
iXX | Signed or unsigned values | i8 will accept values from -0x80 to 0xff |
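The ranges in the table generalize to any width. The Python sketch below models the range check and the automatic slice for the three type families. It assumes the ranges scale with XX in the same way as the 8-bit examples; the names type_range and encode_typed are hypothetical and this is not customasm's implementation.

```python
# Hypothetical model of typed parameters (uXX, sXX, iXX); not customasm's code.
def type_range(kind, bits):
    """Return the inclusive (lowest, highest) value accepted by a type."""
    if kind == "u":
        return 0, (1 << bits) - 1
    if kind == "s":
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if kind == "i":
        return -(1 << (bits - 1)), (1 << bits) - 1
    raise ValueError(f"unknown type kind: {kind!r}")


def encode_typed(kind, bits, value):
    """Reject out-of-range values, then keep the lowest `bits` bits."""
    lowest, highest = type_range(kind, bits)
    if not lowest <= value <= highest:
        raise ValueError(f"{value} is out of range for {kind}{bits}")
    return value & ((1 << bits) - 1)


print(type_range("i", 8))              # (-128, 255)
print(hex(encode_typed("s", 8, -1)))   # 0xff
```

Passing 0x100 as a u8 raises an error in this model, mirroring the out-of-range error described above.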
You can also use the name of another #ruledef block as the type for an instruction parameter.
This can be useful for creating named arguments, complex operands and addressing modes,
or simply to cut back on repeating yourself when the same pattern appears multiple times
across different mnemonics.
#ruledef register
{
    a => 0x0
    b => 0x1
    c => 0x2
}
#ruledef
{
    ; here `r` is the parameter name, which is used on the binary encoding side,
    ; and `register` is the parameter type, referring to the #ruledef block above
    load {r: register}, {value: i8} => 0x5 @ r @ value
}
; then you can use instructions like:
load a, 0x12
load b, 100
load c, -1
Note that, with the previous example, you're also able to use a, b, or c directly as
instructions themselves. This is usually undesirable, so you can declare the block as a #subruledef instead:
#subruledef register
{
    a => 0x0
    b => 0x1
    c => 0x2
}
#ruledef
{
    load {r: register}, {value: i8} => 0x5 @ r @ value
}
#subruledef has exactly the same syntax and semantics as the regular #ruledef, except that
its mnemonics cannot be used as freestanding instructions.
You can create complex and deep mnemonics with nested rule parameters:
#subruledef register
{
    a => 0x0
    b => 0x1
    c => 0x2
}
#subruledef source
{
    {immediate: i16} => 0xd @ immediate
    mem[{address: i16}] => 0xe @ address
    ptr[{r: register}] => 0xf @ r`16
}
#ruledef
{
    load {r: register}, {src: source} => 0x55 @ r @ src
    add {r: register}, {src: source} => 0x66 @ r @ src
}
; then you can use instructions like:
load a, 0x12
load b, mem[0xff00]
add c, ptr[b]
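Because each nested rule contributes its own bits, the total size is easy to lose track of. The Python sketch below models the three blocks above: register gives 4 bits, each source form gives 4 + 16 = 20 bits, and a full instruction is 8 + 4 + 20 = 32 bits. This is an illustration only, not customasm's implementation; all function names are hypothetical.

```python
# Hypothetical model of the nested `register` and `source` blocks.
REGISTER = {"a": 0x0, "b": 0x1, "c": 0x2}  # 4 bits each


def source_immediate(value):
    """{immediate: i16} => 0xd @ immediate  (20 bits)."""
    return (0xD << 16) | (value & 0xFFFF)


def source_mem(address):
    """mem[{address: i16}] => 0xe @ address  (20 bits)."""
    return (0xE << 16) | (address & 0xFFFF)


def source_ptr(reg):
    """ptr[{r: register}] => 0xf @ r`16  (register widened to 16 bits)."""
    return (0xF << 16) | REGISTER[reg]


def encode(opcode, reg, src):
    """8-bit opcode @ 4-bit register @ 20-bit source = 32 bits."""
    return (opcode << 24) | (REGISTER[reg] << 20) | src


print(hex(encode(0x55, "a", source_immediate(0x12))))  # 0x550d0012
print(hex(encode(0x55, "b", source_mem(0xFF00))))      # 0x551eff00
print(hex(encode(0x66, "c", source_ptr("b"))))         # 0x662f0001
```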