An Idol schema is written as UTF-8 text. The preferred file extension is `.idol`.
# A simple example Idol schema.
namespace "acme.example/hello-world"
const FRIENDLY_GREETING: text = "Hello, world!"
message Hello {
greeting@1: text
}
The order of productions within a schema is: namespace, imports, exports, options, and declarations. The namespace is required; other productions are optional.
All Unicode scalar values are permitted except for non-space ASCII control characters. Forbidden characters are:
- `%x00-08` (ASCII `NUL` to `BS`)
- `%x0B` (ASCII `VT`)
- `%x0C` (ASCII `FF`)
- `%x0E-1F` (ASCII `SO` to `US`)
- `%x7F` (ASCII `DEL`)
Space characters `%x09` (horizontal tab), `%x20` (space), and `%xA0` (non-breaking space) are permitted between most tokens (the exceptions are noted).
Newlines `%x0A` (line feed) and `%x0D.0A` (`\r\n`, carriage return + line feed) are permitted between some tokens. The scalar value `%x0D` (carriage return) is only permitted as part of a `%x0D.0A` sequence.
SP = %x09 / %x20 / %xA0
NL = %x0A / %x0D.0A
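The character rules above can be checked mechanically. The following Python sketch is illustrative only: `find_forbidden` is a hypothetical helper, not part of any Idol tooling, and it assumes the source has already been decoded from UTF-8 into a string.

```python
def find_forbidden(source: str) -> list:
    """Return the offsets of characters that the rules above forbid."""
    bad = []
    for i, ch in enumerate(source):
        cp = ord(ch)
        if cp == 0x0D:
            # A carriage return is only allowed when a line feed follows it.
            if i + 1 >= len(source) or source[i + 1] != "\n":
                bad.append(i)
        elif cp <= 0x08 or cp in (0x0B, 0x0C) or 0x0E <= cp <= 0x1F or cp == 0x7F:
            # Non-space ASCII control characters.
            bad.append(i)
    return bad
```

Tab, space, non-breaking space, and line feed pass through untouched, so a file with either `\n` or `\r\n` line endings yields an empty list.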
Comments start with `%x23` (`#`), and continue until the next newline or end-of-file. Comments starting with `%x23.23` are documentation comments ("doc comments"), which may be extracted by documentation generators or included in generated files.
# This is a comment.
## This is a doc comment.
The Idol schema parser reports comment text as-is, without removing the `#` prefix or trimming spaces.
COMMENT = %x23 *(%x09 / %x20-7E / %x80-D7FF / %xE000-10FFFF)
Identifiers must start with `%x61-7A / %x41-5A` (`[a-zA-Z]`), contain `%x61-7A / %x41-5A / %x30-39 / %x5F` (`[a-zA-Z0-9_]`), and end with `%x61-7A / %x41-5A / %x30-39` (`[a-zA-Z0-9]`).
In other words, they must start with an ASCII letter, contain ASCII alphanumeric characters plus underscores, and must not end with an underscore.
IDENT = ALPHA *( ALPHA-NUM / %x5F ALPHA-NUM )
ALPHA = %x61-7A / %x41-5A
ALPHA-NUM = ALPHA / %x30-39
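As an informal cross-check, the `IDENT` production can be transcribed into a regular expression. This Python sketch uses hypothetical names (`IDENT_RE`, `is_ident`) and follows the grammar literally; note that, as written, the production also rejects two underscores in a row, because every underscore must be followed by an alphanumeric character.

```python
import re

# Mirrors IDENT = ALPHA *( ALPHA-NUM / %x5F ALPHA-NUM )
IDENT_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9]|_[A-Za-z0-9])*")


def is_ident(name: str) -> bool:
    """Return True if `name` matches the IDENT production in full."""
    return IDENT_RE.fullmatch(name) is not None
```

Names such as `FRIENDLY_GREETING` and `struct` are accepted, while `_x`, `x_`, and `9x` are not.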
There are no keywords or reserved identifiers in the Idol schema syntax. The following schema is permitted (though not recommended):
# Define a structure type named `struct`.
struct struct {
const: bool
}
# Define a message type named `message`, containing a field named `struct`,
# which is of the type `struct` defined above.
message message {
struct@1: struct
}
Source code generators may transform or reject identifiers that don't conform to the rules of their target languages, for example by capitalizing type names or adding underscores to names that are language keywords.
The Idol schema compiler may reject schemas in which shadowing of a built-in type renders the meaning ambiguous to a human reader:
# Permitted
enum bool: u8 {
true = 0
false = 1
}
# Error: Built-in type `bool` shadowed by user-defined type.
const A: bool = .true
Type name | Description |
---|---|
`bool` | Booleans (`.true` or `.false`) |
`u8`, `u16`, `u32`, `u64` | Unsigned integers |
`i8`, `i16`, `i32`, `i64` | Signed integers |
`f32`, `f64` | IEEE-754 floating-point |
`text` | UTF-8 encoded text without `NUL` |
`asciz` | Arbitrary bytes without `NUL` |
`handle` | Handle to ancillary data passed via an IPC/RPC protocol. |
Array types are defined by appending `[]` (for a dynamically-sized array) or `[<int literal>]` (for a fixed-size array). For example, a field containing arbitrary bytes might be typed as `u8[]`, and a field containing an XYZ coordinate might be `f32[3]`.
Handles are unsigned 32-bit integers that represent an index into an array of values propagated separately from Idol messages, within the context of an IPC/RPC protocol.
- A Unix process would use a `handle` to identify the index of a file descriptor passed via `SCM_RIGHTS`.
- A distributed system performing network calls would use a `handle` to identify a cryptographic credential passed via an HTTP header.
Handles are not directly written to encoded Idol data structures, but are obtained from a list passed to the message decoder.
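As a rough illustration of the lookup described above, a decoder might resolve a handle against the separately supplied list as follows. This is a Python sketch under assumptions: `resolve_handle` and the `handles` list are hypothetical names, and the behaviour for an out-of-range handle is not specified by the text above, so raising an error here is only one possible choice.

```python
def resolve_handle(index: int, handles: list):
    """Look up the ancillary value that a handle refers to."""
    if not 0 <= index <= 0xFFFFFFFF:
        raise ValueError("handles are unsigned 32-bit integers")
    if index >= len(handles):
        # Assumption: a handle with no matching entry is treated as an error.
        raise LookupError(f"no ancillary value at index {index}")
    return handles[index]
```

For example, if two file descriptors `7` and `9` were received alongside a message, `resolve_handle(1, [7, 9])` would select the second of them.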
Integer literals have an optional base prefix, and may be negative:
- `0`, `42`, and `-42` are integer literals. The digit `0` is only permitted by itself, not as a leading zero (`00` and `01` are invalid).
- `0b101010` is an integer literal in base 2 (binary), and may contain leading zeroes.
- `0o52` is an integer literal in base 8 (octal), and may contain leading zeroes.
- `0d42` is an integer literal in base 10 (decimal), and may contain leading zeroes.
- `0x2a` or `0x2A` are integer literals in base 16 (hex), and may contain leading zeroes.
INT-LIT = %x30
INT-LIT =/ [%x2D] %x31-39 *(%x30-39)
BIN-INT-LIT = %x30.62 1*(%x30-31)
BIN-INT-LIT =/ %x2D %x30.62 *(%x30) %x31 *(%x30-31)
OCT-INT-LIT = %x30.6F 1*(%x30-37)
OCT-INT-LIT =/ %x2D %x30.6F *(%x30) %x31-37 *(%x30-37)
DEC-INT-LIT = %x30.64 1*(%x30-39)
DEC-INT-LIT =/ %x2D %x30.64 *(%x30) %x31-39 *(%x30-39)
HEX-INT-LIT = %x30.78 1*(%x30-39 / %x41-46 / %x61-66)
HEX-INT-LIT =/ %x2D %x30.78 *(%x30) (%x31-39 / %x41-46 / %x61-66) *(%x30-39 / %x41-46 / %x61-66)
ANY-INT-LIT = INT-LIT / BIN-INT-LIT / OCT-INT-LIT / DEC-INT-LIT / HEX-INT-LIT
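The integer literal productions can be exercised with a small parser. The Python sketch below is an illustration, not the compiler's implementation: `parse_int_literal` and `_PATTERNS` are hypothetical names, and each regular expression is a direct transcription of one production above. As in the grammar, a negative zero such as `-0` or `-0x0` is rejected.

```python
import re

# (base, prefix, pattern) for each integer literal production.
_PATTERNS = [
    (10, "", re.compile(r"0|-?[1-9][0-9]*")),
    (2, "0b", re.compile(r"0b[01]+|-0b0*1[01]*")),
    (8, "0o", re.compile(r"0o[0-7]+|-0o0*[1-7][0-7]*")),
    (10, "0d", re.compile(r"0d[0-9]+|-0d0*[1-9][0-9]*")),
    (16, "0x", re.compile(r"0x[0-9A-Fa-f]+|-0x0*[1-9A-Fa-f][0-9A-Fa-f]*")),
]


def parse_int_literal(text: str) -> int:
    """Parse any integer literal form and return its value."""
    for base, prefix, pattern in _PATTERNS:
        if pattern.fullmatch(text):
            negative = text.startswith("-")
            # Drop the sign and the base prefix, leaving only the digits.
            digits = text.lstrip("-")[len(prefix):]
            value = int(digits, base)
            return -value if negative else value
    raise ValueError(f"not an integer literal: {text!r}")
```

All of `42`, `0b101010`, `0o52`, `0d42`, `0x2a`, and `0x2A` parse to the same value, while `00` and `01` raise `ValueError`.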
Some locations in the syntax that accept integer literals forbid base prefixes, typically when the use of a non-decimal base would be unexpected by the reader.
Text literals are quoted with `%x22` (double quote), and may contain Unicode scalar values. The following escape sequences are allowed:
- `\\` decodes to `%x5C` (backslash)
- `\"` decodes to `%x22`
- `\n` decodes to `%x0A`
- `\xNN` decodes to the Unicode scalar value `U+00NN`, with exactly two hex digits allowed.
- `\u{NNNN}` decodes to the Unicode scalar value `U+NNNN`, with 1 to 6 hex digits allowed.
TEXT-LIT = %x22 *(
%x09 / %x20 / %x21 /
%x23-5B /
%x5D-7E /
%x80-D7FF / %xE000-10FFFF /
(%x5C (
%x22 /
%x5C /
%x6E /
%x78 2(%x30-39 / %x41-46 / %x61-66) /
%x75.7B 1*6(%x30-39 / %x41-46 / %x61-66) %x7D
))
) %x22
As the source is in UTF-8, non-ASCII characters may be used freely in text literals. Note that the `text` primitive type forbids the scalar value `U+0000`.
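To make the text escape rules concrete, here is a minimal decoder written in Python. It is a sketch under assumptions: `decode_text_literal` is a hypothetical name, the function accepts `0` as a hex digit in escapes (on the reading that `\xNN` means any two hex digits), and it does not re-check the raw characters against the forbidden control character list.

```python
import re

# One escape sequence: \\, \", \n, \xNN, or \u{N...} with 1 to 6 hex digits.
_ESCAPE = re.compile(r'\\(?:([\\"n])|x([0-9A-Fa-f]{2})|u\{([0-9A-Fa-f]{1,6})\})')


def decode_text_literal(literal: str) -> str:
    """Decode a double-quoted literal into the text it denotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("literal must be double-quoted")
    body = literal[1:-1]
    out = []
    pos = 0
    while pos < len(body):
        ch = body[pos]
        if ch == '"':
            raise ValueError("unescaped double quote inside literal")
        if ch != "\\":
            out.append(ch)
            pos += 1
            continue
        m = _ESCAPE.match(body, pos)
        if m is None:
            raise ValueError(f"invalid escape at offset {pos}")
        simple, hex2, uni = m.groups()
        if simple is not None:
            out.append("\n" if simple == "n" else simple)
        else:
            # Both \xNN and \u{...} name a Unicode scalar value in a text literal.
            cp = int(hex2 or uni, 16)
            if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
                raise ValueError("escape is not a Unicode scalar value")
            out.append(chr(cp))
        pos = m.end()
    return "".join(out)
```

Here `"\x41"` and `"\u{41}"` both decode to `A`. Rejecting `U+0000` for the `text` type would be a separate, later check.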
Bytes literals are quoted with `%x22` (double quote), and may contain scalar values in the range `%x01-FF`. The following escape sequences are allowed:
- `\\` decodes to `%x5C` (backslash)
- `\"` decodes to `%x22`
- `\n` decodes to `%x0A`
- `\xNN` decodes to the byte `%xNN`, with exactly two hex digits allowed.
- `\u{NNNN}` decodes to the UTF-8 encoding of the Unicode scalar value `U+NNNN`, with 1 to 6 hex digits allowed.
As the source is in UTF-8, non-ASCII characters in bytes literals will be represented as their UTF-8 encoding. Note that the `asciz` primitive type forbids the byte `0x00`.
BYTES-LIT = %x22 *(
%x09 / %x20 / %x21 /
%x23-5B /
%x5D-7E /
%x80-D7FF / %xE000-10FFFF /
(%x5C (
%x22 /
%x5C /
%x6E /
%x78 2(%x30-39 / %x41-46 / %x61-66) /
%x75.7B 1*6(%x30-39 / %x41-46 / %x61-66) %x7D
))
) %x22
Syntactically, text and bytes literals are the same. They are distinguished within the schema compiler's type checker at the location of use.
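The practical difference shows up when the same source characters are decoded as bytes. The Python sketch below is hypothetical (`decode_bytes_literal` is not a name from the Idol tooling) and assumes `0` is a valid hex digit in escapes. A `\xNN` escape yields one raw byte, while `\u{...}` escapes and raw non-ASCII characters yield their UTF-8 encoding.

```python
import re

# One escape sequence: \\, \", \n, \xNN, or \u{N...} with 1 to 6 hex digits.
_ESCAPE = re.compile(r'\\(?:([\\"n])|x([0-9A-Fa-f]{2})|u\{([0-9A-Fa-f]{1,6})\})')


def decode_bytes_literal(literal: str) -> bytes:
    """Decode a double-quoted literal into the bytes it denotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("literal must be double-quoted")
    body = literal[1:-1]
    out = bytearray()
    pos = 0
    while pos < len(body):
        ch = body[pos]
        if ch == '"':
            raise ValueError("unescaped double quote inside literal")
        if ch != "\\":
            # Raw characters contribute their UTF-8 encoding.
            out += ch.encode("utf-8")
            pos += 1
            continue
        m = _ESCAPE.match(body, pos)
        if m is None:
            raise ValueError(f"invalid escape at offset {pos}")
        simple, hex2, uni = m.groups()
        if simple is not None:
            out += b"\n" if simple == "n" else simple.encode("ascii")
        elif hex2 is not None:
            # \xNN is a single raw byte, not a scalar value.
            out.append(int(hex2, 16))
        else:
            cp = int(uni, 16)
            if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
                raise ValueError("escape is not a Unicode scalar value")
            out += chr(cp).encode("utf-8")
        pos = m.end()
    return bytes(out)
```

So `"\xFF"` used as a bytes literal is the single byte `FF`, whereas `"\u{FF}"` is the two bytes `C3 BF`.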
There is no dedicated literal syntax for boolean values. Instead, they use the same syntax as enum item references.
const A: bool = .false
const B: bool = .true
The productions that can only occur before declarations are collectively called the "header".
Each schema must have a namespace, which provides scoping within the set of schemas being compiled together. The namespace is declared with the `namespace` production.
namespace "acme.example/hello-world"
The namespace is a text literal, which may contain any Unicode scalar value except for non-space ASCII control characters. The set of excluded characters is the same as for Idol schema source code.
It is recommended, but not required, that namespaces use some sort of hierarchical convention rooted under a DNS domain name controlled by the authors of the schema.
The prefix `"idol/"` is reserved for use by the Idol schema compiler.
namespace = %s"namespace" *SP TEXT-LIT
Declarations from other schema files may be imported with the `import` production. Imported declarations may be used like any other declaration.
namespace "acme.example/hello-world"
import "acme.example/i10n" { Language }
message Hello {
greeting@1: text
@{optional}
language@2: Language
}
Instead of importing specific declarations, the entire namespace can be given a local name and used to qualify type names.
A type name such as `i10n.Language` may not contain spaces before or after the dot.
namespace "acme.example/hello-world"
import "acme.example/i10n" as i10n
message Hello {
greeting@1: text
@{optional}
language@2: i10n.Language
}
Importing declarations from the same namespace is permitted, and is useful for splitting up a pre-existing schema file into separate smaller files.
namespace "acme.example/hello-world"
import "acme.example/hello-world" { Greeting }
message SayHello {
greeting@1: Greeting
}
import = %s"import" *SP TEXT-LIT *SP (imports / import-as)
imports = %x7B
*(SP / NL / COMMENT NL)
*(IDENT *(SP / NL / COMMENT NL))
%x7D
import-as = %s"as" 1*SP IDENT
Declarations defined in the current file are always exported. Use the `export` production to re-export declarations imported from other namespaces.
namespace "acme.example/i10n/v2"
import "acme.example/i10n" { Language }
export { Language }
Dotted type names can also be used:
namespace "acme.example/i10n/v2"
import "acme.example/i10n" as i10n
export { i10n.Language }
Exported declarations can be renamed:
namespace "acme.example/i10n/v2"
import "acme.example/i10n" as i10n
export i10n.Language as LanguageCode
export = %s"export" (*SP exports / 1*SP export-as)
exports = %x7B
*(SP / NL / COMMENT NL)
*(type-name *(SP / NL / COMMENT NL))
%x7D
export-as = type-name 1*SP %s"as" 1*SP IDENT
type-name = [ IDENT %x2E ] IDENT
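For illustration, the `type-name` production can be split into its optional qualifier and its name with one regular expression. This Python sketch uses hypothetical names (`TYPE_NAME_RE`, `split_type_name`); because the pattern has no room for spaces, `i10n . Language` is rejected, in line with the rule that a dotted type name may not contain spaces around the dot.

```python
import re

# IDENT, transcribed from the identifier grammar.
_IDENT = r"[A-Za-z](?:[A-Za-z0-9]|_[A-Za-z0-9])*"
# type-name = [ IDENT %x2E ] IDENT
TYPE_NAME_RE = re.compile(rf"(?:({_IDENT})\.)?({_IDENT})")


def split_type_name(text: str):
    """Return (qualifier, name); qualifier is None for an unqualified name."""
    m = TYPE_NAME_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a type name: {text!r}")
    return m.group(1), m.group(2)
```

A qualified name such as `i10n.Language` splits into the local import name and the declaration name, and a bare `Language` has no qualifier.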
The `options` production can be used to configure how the Idol schema compiler and other tooling processes the current schema file.
namespace "acme.example/hello-world"
options {
some_idolc_option = "a special value"
}
By default, options are validated against the schema used by the Idol schema compiler itself. Options intended for other tools should specify an options schema:
namespace "acme.example/hello-world"
import "acme.example/idol-codegen-cxx" {
CxxCodegenOptions
}
options: CxxCodegenOptions {
namespace = "acme::hello_idl"
}
At present the Idol schema compiler has no options defined, so the only use for top-level `options` is to configure external tools via an imported schema.
options = %s"options" [ *SP %x3A *SP type-name ] *SP %x7B
*(SP / NL / COMMENT NL)
*(options-option *(SP / NL / COMMENT NL))
%x7D
options-option = option-name *SP %x3D *SP option-value
option-name = IDENT *(%x2E IDENT)
option-value = ANY-INT-LIT / TEXT-LIT / %x2E IDENT
Types and values defined within an Idol schema file are known collectively as "declarations". They may be declared in any order, so long as they come after the productions found in the header.
Use `const` to declare a constant value of some primitive type.
# 1 second in milliseconds
const TIMEOUT_MSEC: u32 = 1000
# default TLS server authority
const TLS_AUTHORITY: text = "api.example.com"
The following types are supported for constant declarations:
- `bool`, `u8`, `u16`, `u32`, `u64`, `i8`, `i16`, `i32`, `i64`
- `f32` and `f64`, which accept integer literals.
- `text`, which accepts text literals.
- `asciz`, which accepts bytes literals. The trailing `NUL` byte is automatically appended.
- `u8[]`, which accepts bytes literals.
const = %s"const" *SP IDENT *SP %x3A *SP type-name *SP %x3D *SP const-value
const-value = ANY-INT-LIT / TEXT-LIT / %x2E IDENT / const-name
const-name = [ IDENT %x2E ] IDENT
Constants may have options -- see the section on struct fields for details on the syntax.
Supported constant options are:
- `deprecated: bool`: whether the constant is deprecated.
Use `enum` to declare an enumeration of named integers. The enum must be derived from an integer type (`u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, or `i64`).
enum HttpStatus: u16 {
OK = 200
ERR_NOT_FOUND = 404
ERR_FORBIDDEN = 403
ERR_INTERNAL_ERROR = 500
}
Enums derived from a signed integer may have items assigned negative values.
enum errno: i8 {
EPERM = -1
ENOENT = -2
EINTR = -4
}
Integer literals assigned to enum items may have a base prefix.
enum FcntlFlags: u32 {
O_CREAT = 0o100
O_EXCL = 0o200
O_NOCTTY = 0o400
O_TRUNC = 0o1000
}
enum = %s"enum" *SP IDENT *SP %x3A *SP IDENT *SP %x7B
*(SP / NL / COMMENT NL)
*(enum-item *(SP / NL / COMMENT NL))
%x7D
enum-item = IDENT *SP %x3D *SP ( ANY-INT-LIT / const-name )
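Since every enum is derived from an integer type, a tool could check that each item fits the range of that type. The Python sketch below is an assumption about how such a check might look, not a description of the Idol compiler: `out_of_range_items` and `_INT_RANGES` are hypothetical names, and whether out-of-range items are actually rejected is not stated above.

```python
# Inclusive value ranges of the integer types an enum may derive from.
_INT_RANGES = {f"u{bits}": (0, 2**bits - 1) for bits in (8, 16, 32, 64)}
_INT_RANGES.update(
    {f"i{bits}": (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) for bits in (8, 16, 32, 64)}
)


def out_of_range_items(base: str, items: dict) -> list:
    """Return the names of enum items whose value does not fit `base`."""
    lo, hi = _INT_RANGES[base]
    return [name for name, value in items.items() if not lo <= value <= hi]
```

With this check, the `HttpStatus` items fit `u16`, and a negative item is only acceptable when the base type is signed.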
Enums and enum items may have options -- see the section on struct fields for details on the syntax.
Supported enum and enum item options are:
- `deprecated: bool`: whether the enum / enum item is deprecated.
Use `struct` to declare a fixed-size record type. Struct fields may be of any type that has a known size -- note that this includes other structs, but does not include `text`, `asciz`, or messages.
struct = %s"struct" *SP IDENT *SP %x7B
*(SP / NL / COMMENT NL)
*(struct-field *(SP / NL / COMMENT NL))
%x7D
struct-field = IDENT *SP %x3A *SP type-name
Structs are typically used when the set of fields is small and stable, because they are more efficient than messages but can be difficult or impossible to change later without breaking binary compatibility.
# More convenient than `coordinate: f32[3]`
struct Coordinate {
x: f32
y: f32
z: f32
}
Structs can also be used to provide names and type-safety to types that represent some other form of serialized data.
struct Sha256Checksum {
bytes: u8[32]
}
Struct fields follow the same alignment and padding system as C:
- Numeric types `u*`, `i*`, `f*`, and `handle` are aligned to their size.
- Arrays have the same alignment as their value type.
- Structures are aligned to their most-aligned field.
- Padding is implicitly added between structure fields to ensure alignment.
In general adding new fields to a struct will break binary compatibility, because all encoders and decoders must agree on the struct's size and alignment. The exceptions are:
- A new field may replace padding.
  - The declaration `struct { a: u8 b: u16 }` may be modified to `struct { a: u8 c: u8 b: u16 }`.
- A new field may replace existing fields if doing so would not change the struct's alignment.
  - The declaration `struct { a: u32 reserved: u8[4] }` may be modified to `struct { a: u32 b: u16 c: u16 }`.
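The two compatibility cases above can be verified by computing field offsets. The Python sketch below is illustrative and makes assumptions: `struct_layout` and `size_and_align` are hypothetical helpers, only numeric types, `handle`, and fixed-size arrays of them are handled (no `bool` or nested structs), and the struct's total size is rounded up to its alignment as a C compiler would do.

```python
import re

# Sizes in bytes; each scalar is aligned to its own size.
_SCALAR_SIZES = {
    "u8": 1, "i8": 1, "u16": 2, "i16": 2,
    "u32": 4, "i32": 4, "f32": 4, "handle": 4,
    "u64": 8, "i64": 8, "f64": 8,
}
_TYPE_RE = re.compile(r"([a-z0-9]+)(?:\[([0-9]+)\])?")


def size_and_align(type_name: str):
    """Return (size, alignment) for a scalar or fixed-size array type."""
    m = _TYPE_RE.fullmatch(type_name)
    if m is None or m.group(1) not in _SCALAR_SIZES:
        raise ValueError(f"unsupported type: {type_name!r}")
    elem = _SCALAR_SIZES[m.group(1)]
    count = int(m.group(2)) if m.group(2) is not None else 1
    # An array has the alignment of its element type.
    return elem * count, elem


def struct_layout(fields):
    """Return (offsets by field name, total size, alignment)."""
    offsets = {}
    offset = 0
    max_align = 1
    for name, type_name in fields:
        size, align = size_and_align(type_name)
        offset = (offset + align - 1) // align * align  # insert padding
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    # Assumption: trailing padding rounds the size up to the alignment.
    total = (offset + max_align - 1) // max_align * max_align
    return offsets, total, max_align
```

Under these assumptions, `struct { a: u8 b: u16 }` and `struct { a: u8 c: u8 b: u16 }` both occupy 4 bytes with `b` at offset 2, so the new field `c` lands in what was padding.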
If a struct is expected to have new fields added in the future, it is recommended to reserve some empty space within the struct for that purpose.
Structs and struct fields may have options, specified with one of (1) `@options { }`, (2) the single-option shorthand syntax `@{ option_name = "value" }`, or (3) the even shorter `bool`-specific syntax `@{option_name}`:
@{deprecated = .true}
struct OldStruct {
dont_use_me: u32
}
struct SomeStruct {
a: u32
@{deprecated}
b: u32
}
Supported struct and struct field options are:
- `deprecated: bool`: whether the struct / struct field is deprecated.
Struct and struct field options may also have imported schemas:
import "acme.example/idol-codegen-go" {
GoCodegenFieldOptions
}
struct SomeStruct {
@options: GoCodegenFieldOptions { accessor_name = "IsNew" }
new: bool
}
Use `message` to declare a dynamically-sized record type. A message may contain fields of any type, including itself. Fields may be optional, in which case generated code will allow detecting whether the field value was present in the message data.
message Hello {
greeting@1: text
language_id@2: u32
}
Each field has a tag such as `@1`, which is an integer in the range `[1, 65535]`. The message size depends on the largest tag that is present within the message data: a message containing a `u32` with tag `N` will be `8 + (N * 8)` bytes.
message = %s"message" *SP IDENT *SP %x7B
*(SP / NL / COMMENT NL)
*(message-field *(SP / NL / COMMENT NL))
%x7D
message-field = IDENT *SP tag *SP %x3A *SP type-name
tag = %x40 %x31-39 *(%x30-39)
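The tag syntax and the size formula stated above can be combined in a few lines. This Python sketch is only a worked example: `parse_tag` and `message_size_for_u32` are hypothetical names, and the size function covers just the case the text describes, a message whose largest present field is a `u32`.

```python
import re

# tag = %x40 %x31-39 *(%x30-39)
_TAG_RE = re.compile(r"@([1-9][0-9]*)")


def parse_tag(text: str) -> int:
    """Parse a field tag such as '@1' and check the allowed range."""
    m = _TAG_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a field tag: {text!r}")
    tag = int(m.group(1))
    if tag > 65535:
        raise ValueError("field tags must be in the range [1, 65535]")
    return tag


def message_size_for_u32(tag: int) -> int:
    """Size in bytes of a message whose largest present field is a u32 at `tag`."""
    if not 1 <= tag <= 65535:
        raise ValueError("field tags must be in the range [1, 65535]")
    return 8 + tag * 8
```

For instance, a `u32` at tag `@2` gives `8 + (2 * 8) = 24` bytes, and the largest permitted tag gives `8 + (65535 * 8) = 524288` bytes.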
A single dynamically-sized field may contain up to `0x7FF00000 - 16` bytes of data if it is the only field in the message and has a tag of `1`.
Messages and message fields may have options -- see the section on struct options for details on the syntax.
Supported message options are:
- `deprecated: bool`: whether the message is deprecated.
Supported message field options are:
- `deprecated: bool`: whether the field is deprecated.
- `optional: bool`: whether generated code should provide an API for testing field presence.
Use `union` to declare a dynamically-sized tagged union. A union is similar to a message, but it contains at most a single value (identified by field tag).
union DivisionResult {
result@1: f32
error@2: ErrorCode
}
union = %s"union" *SP IDENT *SP %x7B
*(SP / NL / COMMENT NL)
*(union-field *(SP / NL / COMMENT NL))
%x7D
union-field = IDENT *SP tag *SP %x3A *SP type-name
Union fields may also be optional, in which case the union header specifies which field might be set.
# Code review might succeed without any comments from the reviewer.
union CodeReviewResult {
@{optional}
comments@1: text
error@2: ErrorCode
}
Unions and union fields may have options -- see the section on struct options for details on the syntax.
Supported union options are:
- `deprecated: bool`: whether the union is deprecated.
Supported union field options are:
- `deprecated: bool`: whether the field is deprecated.
- `optional: bool`: whether generated code should provide an API for testing field presence, in addition to testing the union's field tag.
Use `protocol` to declare the protocol of an RPC-style service.
protocol Greeter {
rpc Greet(GreetRequest): GreetResponse # can also `: (GreetResponse)`
}
protocol = %s"protocol" *SP IDENT *SP %x7B
*(SP / NL / COMMENT NL)
*(protocol-method *(SP / NL / COMMENT NL))
%x7D
protocol-method = protocol-rpc / protocol-event
protocol-rpc = %s"rpc" 1*SP IDENT *SP %x28 *SP type-name [ 1*SP %s"stream" ] *SP %x29 *SP %x3A *SP rpc-response
rpc-response = type-name / %x28 *SP [ type-name [ 1*SP %s"stream" ] *SP ] %x29
protocol-event = %s"event" 1*SP IDENT *SP %x28 *SP type-name [ 1*SP %s"stream" ] *SP %x29
protocol Greeter {
rpc Greet(GreetRequest stream): GreetResponse
}
protocol Greeter {
rpc Greet(GreetRequest): (GreetResponse stream)
}
protocol Greeter {
rpc Greet(GreetRequest stream): (GreetResponse stream)
}
protocol Greeter {
rpc Greet(GreetRequest): ()
}
protocol Greeter {
rpc Greet(GreetRequest stream): ()
}
protocol FileWatcher {
event SomethingHappened(SomeEventPayload)
}
Protocols, protocol RPCs, and protocol events may have options -- see the section on struct options for details on the syntax.
Supported options for all three productions are:
- `deprecated: bool`: whether the protocol / rpc / event is deprecated.