Typesafe XML for Scala
Xylophone is an XML library for Scala that takes advantage of many features of the language to provide intuitive syntax for manipulating XML, as well as better typesafety and static checks.
- parse and represent XML in Scala
- statically check XML in x"" interpolators
- substitute standard and custom types (provided by typeclasses) into XML
- automatically derive typeclasses to convert case classes and product types to and from XML
- safe dynamic interface for accessing nested fields
A Text value containing XML may be parsed with,
Xml.parse(text)
which will return an instance of Xml, or throw an XmlParseError if the XML is not well-formed.
Xml values may also be constructed using the x"" interpolator. These will be checked for well-formedness at compile time: all syntax must be valid, special characters escaped, and all tags must be closed and nested correctly.
val book = x"<book><author>H. G. Wells</author><title>The War of the Worlds</title></book>"
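To make the "closed and nested correctly" rule concrete, here is a minimal, self-contained sketch of a tag-nesting check using a stack. It is not how Xylophone implements its compile-time check; the function name tagsBalanced is hypothetical, and the sketch ignores comments, CDATA sections, and attribute values containing angle brackets.

```scala
// Simplified illustration only: verifies that element tags are closed and nested correctly.
def tagsBalanced(xml: String): Boolean = {
  // Group 1: "/" for a closing tag; group 2: the tag name; group 3: "/" for a self-closing tag
  val tag = """<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>""".r
  val start: Option[List[String]] = Some(Nil)

  // Walk the tags from left to right, keeping a stack of currently-open element names.
  // The stack becomes None as soon as a closing tag does not match the innermost open tag.
  val finalStack = tag.findAllMatchIn(xml).foldLeft(start) { (stack, m) =>
    val name = m.group(2)
    val closing = m.group(1) == "/"
    val selfClosing = m.group(3) == "/"
    stack.flatMap { open =>
      (closing, selfClosing, open) match
        case (false, true, _)                      => Some(open)         // <br/> opens nothing
        case (false, false, _)                     => Some(name :: open) // push opening tag
        case (true, _, top :: rest) if top == name => Some(rest)         // pop matching tag
        case _                                     => None               // mismatched close
    }
  }

  // Well-nested input leaves an empty stack behind
  finalStack == Some(Nil)
}
```

With this sketch, the book example above passes, while input such as `<book><author></book></author>` or an unclosed `<book>` is rejected.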
An Xml value is a general type representing XML in three different forms, subtypes of Xml:
- XmlDoc is a complete XML document, and includes the <?xml... header
- XmlNode is a single XML node
- XmlFragment is a fragment of XML, which may include zero, one or many XML nodes
While all three subtypes represent XML, and there is some overlap in what each may represent, there are important differences in their behavior, and most Xylophone methods are careful to return precisely-typed values.
Furthermore, the Xml subtypes serve as wrappers around several other AST node types, subtypes of the Ast enumeration:
Element
Comment
ProcessingInstruction
Textual
CData
Root
Of these types, Element and Root include fields which may include sequences of other Ast nodes, thereby forming a tree structure.
Unlike some other data definition languages such as JSON, XML requires a distinction to be made between a single node (XmlNode) and a sequence of nodes (XmlFragment). Multiple nodes with the same tag name may exist as children of another node, while it is common for some variants of XML to use unique tag names for every child node.
Both approaches are supported in Xylophone with very simple syntax.
For example, given the XML,
<library>
<book>
<author>H. G. Wells</author>
<title>The War of the Worlds</title>
</book>
<book>
<author>Virginia Woolf</author>
<title>Mrs. Dalloway</title>
</book>
</library>
as an instance of XmlNode, library, we can access the two book nodes with library.book, as an XmlFragment.
Subsequently calling library.book.title would return a new XmlFragment consisting of the titles of both books, specifically,
<title>The War of the Worlds</title>
<title>Mrs. Dalloway</title>
Given an XmlFragment, the nth node in the sequence may be accessed by applying the node index, for example,
library.book.title(1)
would return,
<title>Mrs. Dalloway</title>
as an XmlNode.
The same node could alternatively be accessed with library.book(1).title(0). Note how 0 is applied to the title now (instead of 1) since we want the first (0) title element of the second (1) book element.
The application of (0) returns a single XmlNode instance rather than an XmlFragment instance.
The index parameter of the apply method of XmlFragment has a default value of 0 for convenience in the very common case where the first node is required, e.g. library.book().title().
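The navigation rules above can be modelled in a few lines of plain Scala. The following is a hypothetical miniature model, not Xylophone's real classes: Node and Fragment stand in for XmlNode and XmlFragment, and the members shown are assumptions chosen to mirror the behavior described here.

```scala
import scala.language.dynamics

// Hypothetical stand-in for XmlNode: a labelled element with child elements or text
case class Node(label: String, children: List[Node] = Nil, text: String = "") extends Dynamic:
  // node.foo selects every child element labelled "foo"
  def selectDynamic(name: String): Fragment = Fragment(children.filter(_.label == name))
  // node.foo(n) selects the nth such child; node.foo() selects the first
  def applyDynamic(name: String)(index: Int = 0): Node = selectDynamic(name).apply(index)
  // node.* returns all child elements
  def * : Fragment = Fragment(children)

// Hypothetical stand-in for XmlFragment: a sequence of zero, one or many nodes
case class Fragment(nodes: List[Node]) extends Dynamic:
  // fragment.foo selects the matching children of every node in the sequence
  def selectDynamic(name: String): Fragment =
    Fragment(nodes.flatMap(_.children.filter(_.label == name)))
  def applyDynamic(name: String)(index: Int = 0): Node = selectDynamic(name).apply(index)
  // fragment(n) picks a single node; the index defaults to 0
  def apply(index: Int = 0): Node = nodes(index)

def book(author: String, title: String): Node =
  Node("book", List(Node("author", text = author), Node("title", text = title)))

val library = Node("library", List(
  book("H. G. Wells", "The War of the Worlds"),
  book("Virginia Woolf", "Mrs. Dalloway")
))

val titles: Fragment = library.book.title   // both title nodes
val second: Node = library.book.title(1)    // the second title
val same: Node = library.book(1).title(0)   // the same node, reached differently
val first: Node = library.book().title()    // both indices default to 0
```

In this model, selecting a name always yields a sequence, and only applying an index (explicitly or by default) narrows it to a single node.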
An XmlFragment of all the elements inside an XmlNode can always be obtained by calling * on the XmlNode value. For example, library.book().* would return an XmlFragment of,
<author>H. G. Wells</author>
<title>The War of the Worlds</title>
An Xml value is dynamic in the sense that it could represent a single string value or deeply-nested structured data. Usually we want to convert Xml values to other Scala types in order to use them. This can be achieved by calling as[T] on the value, for an appropriate choice of T.
For example, the XmlNode,
val name: XmlNode = x"<author>Virginia Woolf</author>"
could be converted to a Text with name.as[Text]. Or,
val age: XmlNode = x"<age>18</age>"
can be read as a Long with age.as[Long].
This works for Text and primitive types, but will also work for case classes composed of these types (or of other nested case classes). For example, given the definition,
case class Book(title: Text, author: Text)
a book from the library example above could be read with:
library.book().as[Book]
The as method can also extract collection types (e.g. Set, List or Vector) from an Xml value, so all the books in the library could be accessed with,
library.book.as[Set[Book]]
In general, extraction requires a contextual XmlReader typeclass instance for the type to be extracted. These exist on the XmlReader companion object for the basic types, collection types, product types (e.g. case classes) and coproduct types (e.g. enumerations), but other instances may be provided.
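As an illustration of the typeclass pattern described above, here is a self-contained sketch. Elem, Reader and the given instances are hypothetical stand-ins and do not reproduce XmlReader's actual definition; the Book instance is written by hand here, whereas Xylophone derives such instances automatically.

```scala
// Hypothetical miniature element type: a label with text or child elements
case class Elem(label: String, text: String = "", children: List[Elem] = Nil)

// Hypothetical reader typeclass: converts a sequence of nodes into a T
trait Reader[T]:
  def read(nodes: List[Elem]): T

object Reader:
  // Instances on the companion object are found without any import
  given stringReader: Reader[String] = nodes => nodes.head.text
  given longReader: Reader[Long] = nodes => nodes.head.text.trim.toLong
  // A collection is read by reading each node in the sequence individually
  given listReader[T](using reader: Reader[T]): Reader[List[T]] =
    nodes => nodes.map(node => reader.read(List(node)))

extension (nodes: List[Elem])
  def as[T](using reader: Reader[T]): T = reader.read(nodes)

case class Book(title: String, author: String)

// A user-provided instance: each field is read from the child with the same name
given bookReader: Reader[Book] = nodes =>
  val fields = nodes.head.children
  Book(
    fields.filter(_.label == "title").as[String],
    fields.filter(_.label == "author").as[String]
  )

val books = List(
  Elem("book", children = List(Elem("author", "H. G. Wells"), Elem("title", "The War of the Worlds"))),
  Elem("book", children = List(Elem("author", "Virginia Woolf"), Elem("title", "Mrs. Dalloway")))
)

val age: Long = List(Elem("age", "18")).as[Long]
val allBooks: List[Book] = books.as[List[Book]]
```

The compiler assembles the reader for List[Book] from the list instance and the Book instance, which is the same composition that makes as[Set[Book]] possible in the example above.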
Likewise, these same types may be converted to Xml by calling the xml extension method on them. For example, given,
case class Book(title: Text, author: Text)
val book = Book(t"Mrs. Dalloway", t"Virginia Woolf")
we could create an XmlNode value of,
<Book>
<title>Mrs. Dalloway</title>
<author>Virginia Woolf</author>
</Book>
just by calling book.xml.
Note that the element labels will be taken from the case class's type name and field names. However, for nested case classes, a type name will only appear in the XML output for the outermost tag name, since the field name will be used in these cases.
The type name will also appear in the repeated child nodes of XML produced from a collection type, for example, writing the List[Int], List(1, 2, 3) would produce the XML,
<List>
<Int>1</Int>
<Int>2</Int>
<Int>3</Int>
</List>
In general, the type name will be used for a node if the context does not suggest a more specific name.
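The naming rule can be sketched with a small writer typeclass in which the caller supplies the label and the type name is only a fallback. Everything below (Elem, Writer, the given instances, and the nested Author class) is a hypothetical model for illustration, not Xylophone's implementation.

```scala
// Hypothetical miniature element type: a label with text or child elements
case class Elem(label: String, text: String = "", children: List[Elem] = Nil)

// Hypothetical writer typeclass: the caller chooses the label
trait Writer[T]:
  def typeName: String                      // fallback label when the context gives none
  def write(value: T, label: String): Elem

object Writer:
  given intWriter: Writer[Int] = new Writer[Int]:
    def typeName = "Int"
    def write(value: Int, label: String) = Elem(label, value.toString)

  given stringWriter: Writer[String] = new Writer[String]:
    def typeName = "String"
    def write(value: String, label: String) = Elem(label, value)

  given listWriter[T](using item: Writer[T]): Writer[List[T]] = new Writer[List[T]]:
    def typeName = "List"
    // Items have no field name, so each one falls back to its own type name
    def write(values: List[T], label: String) =
      Elem(label, children = values.map(item.write(_, item.typeName)))

extension [T](value: T)
  // At the top level there is no surrounding context, so the type name is used
  def xml(using writer: Writer[T]): Elem = writer.write(value, writer.typeName)

case class Author(name: String)
case class Book(title: String, author: Author)

given authorWriter: Writer[Author] = new Writer[Author]:
  def typeName = "Author"
  def write(value: Author, label: String) =
    Elem(label, children = List(summon[Writer[String]].write(value.name, "name")))

given bookWriter: Writer[Book] = new Writer[Book]:
  def typeName = "Book"
  // Nested values are labelled with the field name, not their type name
  def write(value: Book, label: String) =
    Elem(label, children = List(
      summon[Writer[String]].write(value.title, "title"),
      summon[Writer[Author]].write(value.author, "author")
    ))

val numbers: Elem = List(1, 2, 3).xml
val dalloway: Elem = Book("Mrs. Dalloway", Author("Virginia Woolf")).xml
```

Here the nested Author value appears under the field name author, while the list items, which have no field name, appear under the type name Int.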
The node name may be controlled, however, using annotations. The @xmlLabel annotation may be applied to a case class or a case class field to change its name when written or read.
For example, the definition,
@xmlLabel(t"book")
case class Book(title: Text, author: Text, @xmlLabel(t"type") kind: Text)
would ensure that a Book instance is written using the lower-case tag name, book, and the kind field would be serialized with the name type (which cannot be used so easily in Scala, as it's a keyword).
XML usually needs to be serialized to a string. Xylophone provides a show method that will serialize an Xml value to a Text value using a contextual XmlPrinter, of which two are available by default: one which omits all unnecessary whitespace, and one which "pretty prints" the XML with indentation for nesting.
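The difference between the two printing styles can be sketched as follows. Elem, Printer, Printers and this show extension are hypothetical names for illustration and do not reproduce XmlPrinter; the sketch also performs no escaping of special characters and assumes each element holds either text or child elements, not both.

```scala
// Hypothetical miniature element type: a label with text or child elements
case class Elem(label: String, text: String = "", children: List[Elem] = Nil)

// Hypothetical contextual printer: chooses how an element is serialized
trait Printer:
  def print(elem: Elem): String

// Omits all unnecessary whitespace
def renderCompact(elem: Elem): String =
  s"<${elem.label}>${elem.text}${elem.children.map(renderCompact).mkString}</${elem.label}>"

// Places each element on its own line, indented by two spaces per nesting level
def renderPretty(elem: Elem, depth: Int): String =
  val pad = "  " * depth
  if elem.children.isEmpty then s"$pad<${elem.label}>${elem.text}</${elem.label}>"
  else
    val inner = elem.children.map(child => renderPretty(child, depth + 1)).mkString("\n")
    s"$pad<${elem.label}>\n$inner\n$pad</${elem.label}>"

object Printers:
  given compact: Printer = elem => renderCompact(elem)
  given pretty: Printer = elem => renderPretty(elem, 0)

extension (elem: Elem)
  // The printer is supplied from context, or passed explicitly with `using`
  def show(using printer: Printer): String = printer.print(elem)

val wells = Elem("book", children = List(
  Elem("author", "H. G. Wells"),
  Elem("title", "The War of the Worlds")
))

val oneLine: String = wells.show(using Printers.compact)
val indented: String = wells.show(using Printers.pretty)
```

Making the printer a contextual value means the choice of output style can be made once, at the point where it is brought into scope, rather than at every call to show.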
Xylophone is classified as fledgling. For reference, Soundness projects are categorized into one of the following five stability levels:
- embryonic: for experimental or demonstrative purposes only, without any guarantees of longevity
- fledgling: of proven utility, seeking contributions, but liable to significant redesigns
- maturescent: major design decisions broadly settled, seeking probatory adoption and refinement
- dependable: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version 1.0.0 or later
- adamantine: proven, reliable and production-ready, with no further breaking changes ever anticipated
Projects at any stability level, even embryonic projects, can still be used, as long as caution is taken to avoid a mismatch between the project's stability level and the required stability and maintainability of your own project.
Xylophone is designed to be small. Its entire source code currently consists of 658 lines of code.
Xylophone will ultimately be built by Fury, when it is published. In the meantime, two possibilities are offered, however they are acknowledged to be fragile, inadequately tested, and unsuitable for anything more than experimentation. They are provided only for the necessity of providing some answer to the question, "how can I try Xylophone?".
- Copy the sources into your own project
  Read the fury file in the repository root to understand Xylophone's build structure, dependencies and source location; the file format should be short and quite intuitive. Copy the sources into a source directory in your own project, then repeat (recursively) for each of the dependencies.
  The sources are compiled against the latest nightly release of Scala 3. There should be no problem compiling the project together with all of its dependencies in a single compilation.
- Build with Wrath
  Wrath is a bootstrapping script for building Xylophone and other projects in the absence of a fully-featured build tool. It is designed to read the fury file in the project directory, and produce a collection of JAR files which can be added to a classpath, by compiling the project and all of its dependencies, including the Scala compiler itself.
  Download the latest version of wrath, make it executable, and add it to your path, for example by copying it to /usr/local/bin/.
  Clone this repository inside an empty directory, so that the build can safely make clones of repositories it depends on as peers of xylophone. Run wrath -F in the repository root. This will download and compile the latest version of Scala, as well as all of Xylophone's dependencies.
  If the build was successful, the compiled JAR files can be found in the .wrath/dist directory.
Contributors to Xylophone are welcome and encouraged. New contributors may like to look for issues marked beginner.
We suggest that all contributors read the Contributing Guide to make the process of contributing to Xylophone easier.
Please do not contact project maintainers privately with questions unless there is a good reason to keep them private. While it can be tempting to respond to such questions, private answers cannot be shared with a wider audience, and it can result in duplication of effort.
Xylophone was designed and developed by Jon Pretty, and commercial support and training on all aspects of Scala 3 is available from Propensive OÜ.
A xylophone is a musical instrument made from wood ("xylo-") or trees, and it provides a representation of XML trees. "Xylophone" and "XML" begin with the same infrequently-used letter.
In general, Soundness project names are always chosen with some rationale, however it is usually frivolous. Each name is chosen more for its uniqueness and intrigue than its concision or catchiness, and there is no bias towards names with positive or "nice" meanings, since many of the libraries perform some quite unpleasant tasks.
Names should be English words, though many are obscure or archaic, and it should be noted how willingly English adopts foreign words. Names are generally of Greek or Latin origin, and have often arrived in English via a romance language.
The logo shows two angle brackets (or chevrons), representing the most significant symbols in XML, placed next to each other to look like a capital X.
Xylophone is copyright © 2024 Jon Pretty & Propensive OÜ, and is made available under the Apache 2.0 License.