Xylophone

Typesafe XML for Scala

Xylophone is an XML library for Scala that takes advantage of many features of the language to provide intuitive syntax for manipulating XML, as well as better typesafety and static checks.

Features

  • parse and represent XML in Scala
  • statically check XML in x"" interpolators
  • substitute standard and custom types (provided by typeclasses) into XML
  • automatically derive typeclasses to convert case classes and product types to and from XML
  • safe dynamic interface for accessing nested fields

Availability

Getting Started

Parsing

A Text value containing XML may be parsed with,

Xml.parse(text)

which will return an instance of Xml, or throw an XmlParseError if the XML is not well-formed.
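
A parse-or-throw method like this is usually wrapped in a try/catch or scala.util.Try at the call site. The sketch below deliberately does not use Xylophone, since its imports are not shown here. As a stand-in, it uses the JDK's built-in DOM parser, which likewise throws an exception (a SAXParseException) when the input is not well-formed. The names parseDom and isWellFormed are hypothetical helpers.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import scala.util.Try

// Stand-in for a parse-or-throw API (assumption: this is the JDK parser, not
// Xylophone). It throws a SAXParseException if the input is not well-formed.
def parseDom(text: String): org.w3c.dom.Document =
  DocumentBuilderFactory
    .newInstance()
    .newDocumentBuilder()
    .parse(new InputSource(new StringReader(text)))

// Hypothetical helper: turn the thrown exception into a Boolean result.
def isWellFormed(text: String): Boolean = Try(parseDom(text)).isSuccess
```

The same shape should apply to Xml.parse: wrap the call in Try, or catch XmlParseError, wherever malformed input is a possibility.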

XML Literals

Xml values may also be constructed using the x"" interpolator. These will be checked for well-formedness at compile time: all syntax must be valid, special characters must be escaped, and all tags must be closed and nested correctly.

val book = x"<book><author>H. G. Wells</author><title>The War of the Worlds</title></book>"

XML AST Representation

An Xml value is a general type representing XML in three different forms, subtypes of Xml:

  • XmlDoc is a complete XML document, and includes the <?xml... header
  • XmlNode is a single XML node
  • XmlFragment is a fragment of XML, which may include zero, one or many XML nodes

All three subtypes represent XML, and there is some overlap in what each may represent, but there are important differences in their behavior, and most Xylophone methods are careful to return precisely-typed values.

Furthermore, the Xml subtypes serve as wrappers around several other AST node types, subtypes of the Ast enumeration:

  • Element
  • Comment
  • ProcessingInstruction
  • Textual
  • CData
  • Root

Of these types, Element and Root include fields which may include sequences of other Ast nodes, thereby forming a tree structure.
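
The tree structure described above can be modelled with a small Scala 3 enumeration. This is only a sketch: the case names follow the list above, but every field name and type is an assumption made for illustration, as is the nodeCount helper.

```scala
// Hypothetical model of the AST. The case names come from the list above;
// all fields are assumptions and may differ from Xylophone's definitions.
enum Ast:
  case Element(name: String, children: List[Ast])
  case Comment(content: String)
  case ProcessingInstruction(target: String, content: String)
  case Textual(content: String)
  case CData(content: String)
  case Root(children: List[Ast])

// Only Element and Root contain other nodes, so only they recurse.
def nodeCount(ast: Ast): Int = ast match
  case Ast.Element(_, children) => 1 + children.map(nodeCount).sum
  case Ast.Root(children)       => 1 + children.map(nodeCount).sum
  case _                        => 1

val doc: Ast = Ast.Root(List(
  Ast.Comment("catalogue"),
  Ast.Element("book", List(
    Ast.Element("title", List(Ast.Textual("Mrs. Dalloway")))
  ))
))
```

Here nodeCount(doc) visits the root, the comment, both elements and the text node.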

Accessing elements

Unlike some other data definition languages such as JSON, XML requires a distinction to be made between a single node (XmlNode) and a sequence of nodes (XmlFragment). Multiple nodes with the same tag name may exist as children of another node, while it is common for some variants of XML to use unique tag names for every child node.

Both approaches are supported in Xylophone with very simple syntax.

For example, given the XML,

<library>
  <book>
    <author>H. G. Wells</author>
    <title>The War of the Worlds</title>
  </book>
  <book>
    <author>Virginia Woolf</author>
    <title>Mrs. Dalloway</title>
  </book>
</library>

as an instance of XmlNode called library, we can access the two book nodes with library.book, as an XmlFragment. Subsequently, calling library.book.title would return a new XmlFragment consisting of the titles of both books, specifically,

<title>The War of the Worlds</title>
<title>Mrs. Dalloway</title>

Given an XmlFragment, the nth node in the sequence may be accessed by applying the node index, for example,

library.book.title(1)

would return,

<title>Mrs. Dalloway</title>

as an XmlNode.

The same node could alternatively be accessed with library.book(1).title(0). Note how 0 is applied to the title now (instead of 1) since we want the first (0) title element of the second (1) 'book' element. The application of (0) returns a single XmlNode instance rather than an XmlFragment instance.

The apply method of XmlFragment has a default index of 0, for convenience in the very common case where the first node is required, e.g. library.book().title().

An XmlFragment of all the elements inside an XmlNode can always be obtained by calling * on the XmlNode value. For example, library.book().* would return an XmlFragment of,

<author>H. G. Wells</author>
<title>The War of the Worlds</title>
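
The distinction between a single node and a sequence of nodes can be modelled in a few lines of plain Scala. The sketch below is not Xylophone's implementation: Node and Fragment are hypothetical stand-ins for XmlNode and XmlFragment, and an explicit select method replaces the field-style access (library.book), with all standing in for *.

```scala
// Hypothetical stand-ins for XmlNode and XmlFragment (assumption: simplified
// to element labels and text only).
case class Node(label: String, text: String = "", children: List[Node] = Nil):
  // All child elements with the given label, as a fragment
  def select(name: String): Fragment = Fragment(children.filter(_.label == name))
  // Counterpart of `*`: every child element
  def all: Fragment = Fragment(children)

case class Fragment(nodes: List[Node]):
  // Selecting on a fragment gathers the matching children of every node in it
  def select(name: String): Fragment = Fragment(nodes.flatMap(_.select(name).nodes))
  // Indexing yields a single node; the index defaults to 0
  def apply(index: Int = 0): Node = nodes(index)

def book(author: String, title: String): Node =
  Node("book", children = List(Node("author", author), Node("title", title)))

val library = Node("library", children = List(
  book("H. G. Wells", "The War of the Worlds"),
  book("Virginia Woolf", "Mrs. Dalloway")
))

// Two routes to the same node, plus the default-index form
val viaFragment = library.select("book").select("title")(1)
val viaNode     = library.select("book")(1).select("title")(0)
val firstTitle  = library.select("book")().select("title")()
```

Both viaFragment and viaNode refer to the title of the second book, while firstTitle relies on the default index to reach the title of the first.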

Extracting typed values

An Xml value is dynamic in the sense that it could represent a single string value or deeply-nested structured data. Usually we want to convert Xml values to other Scala types in order to use them. This can be achieved by calling as[T] on the value, for an appropriate choice of T.

For example, the XmlNode,

val name: XmlNode = x"<author>Virginia Woolf</author>"

could be converted to a Text with, name.as[Text]. Or,

val age: XmlNode = x"<age>18</age>"

can be read as a Long with, age.as[Long].

This works for Texts and primitive types, but will also work for case classes composed of these types (or of other nested case classes). For example, given the definition,

case class Book(title: Text, author: Text)

a book from the library example above could be read with:

library.book().as[Book]

The as method can also extract collection types (e.g. Set, List or Vector) from an Xml value, so all the books in the library could be accessed with,

library.book.as[Set[Book]]

In general, extraction requires a contextual XmlReader typeclass instance for the type to be extracted. These exist on the XmlReader companion object for the basic types, collection types, product types (e.g. case classes) and coproduct types (e.g. enumerations), but other instances may be provided.
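
The typeclass pattern behind as[T] can be illustrated with a miniature, self-contained version. Everything in this sketch is hypothetical: Elem and Reader are illustrative names rather than Xylophone's XmlNode and XmlReader, String is used in place of Text, and the Book instance is written by hand where the library would derive it.

```scala
// Hypothetical miniature of the typeclass pattern (not Xylophone's types).
case class Elem(label: String, text: String = "", children: List[Elem] = Nil)

trait Reader[T]:
  def read(elems: List[Elem]): T

object Reader:
  // Basic types read the text of the first element
  given Reader[String] = elems => elems.head.text
  given Reader[Long]   = elems => elems.head.text.toLong
  // Collections read each element separately
  given [T](using reader: Reader[T]): Reader[List[T]] =
    elems => elems.map(elem => reader.read(List(elem)))

extension (elems: List[Elem])
  // Extraction needs a contextual Reader for the target type
  def as[T](using reader: Reader[T]): T = reader.read(elems)
  def child(name: String): List[Elem] = elems.flatMap(_.children.filter(_.label == name))

case class Book(title: String, author: String)

object Book:
  // Written by hand here; a derived instance would read each field by name
  given Reader[Book] = elems =>
    Book(elems.child("title").as[String], elems.child("author").as[String])

val books = List(
  Elem("book", children = List(Elem("author", "H. G. Wells"), Elem("title", "The War of the Worlds"))),
  Elem("book", children = List(Elem("author", "Virginia Woolf"), Elem("title", "Mrs. Dalloway")))
)

val first: Book     = books.take(1).as[Book]
val all: List[Book] = books.as[List[Book]]
```

Placing the instances on the companion objects means no import is needed at the call site, which is the same reason the built-in instances live on the XmlReader companion.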

Writing to XML

Likewise, these same types may be converted to Xml by calling the xml extension method on them, for example, given,

case class Book(title: Text, author: Text)
val book = Book(t"Mrs. Dalloway", t"Virginia Woolf")

we could create an XmlNode value of,

<Book>
  <title>Mrs. Dalloway</title>
  <author>Virginia Woolf</author>
</Book>

just by calling book.xml.

Note that the element labels will be taken from the case class's type name and field names. However, for nested case classes, a type name will only appear in the XML output for the outermost tag name, since the field name will be used in these cases.

The type name will also appear in the repeated child nodes of XML produced from a collection type, for example, writing the List[Int], List(1, 2, 3) would produce the XML,

<List>
  <Int>1</Int>
  <Int>2</Int>
  <Int>3</Int>
</List>

In general, the type name will be used for a node if the context does not suggest a more specific name.
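
The naming rule can be sketched as a typeclass that receives the label from its context and falls back to the type name only when nothing more specific is available. This is a hypothetical model rather than Xylophone's implementation: Out, Writer and render are illustrative names, and String stands in for Text.

```scala
// Hypothetical model of the naming rule (not Xylophone's implementation).
case class Out(label: String, text: String = "", children: List[Out] = Nil):
  def render: String = s"<$label>$text${children.map(_.render).mkString}</$label>"

trait Writer[T]:
  def typeName: String                       // fallback label
  def write(label: String, value: T): Out    // label supplied by the context

object Writer:
  given Writer[Int] = new Writer[Int]:
    def typeName: String = "Int"
    def write(label: String, value: Int): Out = Out(label, value.toString)

  given Writer[String] = new Writer[String]:
    def typeName: String = "String"
    def write(label: String, value: String): Out = Out(label, value)

  given [T](using writer: Writer[T]): Writer[List[T]] = new Writer[List[T]]:
    def typeName: String = "List"
    // Items have no field name, so each one falls back to its type name
    def write(label: String, value: List[T]): Out =
      Out(label, children = value.map(writer.write(writer.typeName, _)))

extension [T](value: T)
  // The outermost node has no context, so it uses the type name
  def xml(using writer: Writer[T]): Out = writer.write(writer.typeName, value)

case class Book(title: String, author: String)

object Book:
  given Writer[Book] = new Writer[Book]:
    def typeName: String = "Book"
    // Fields supply their own names, so "String" never appears in the output
    def write(label: String, value: Book): Out = Out(label, children = List(
      summon[Writer[String]].write("title", value.title),
      summon[Writer[String]].write("author", value.author)
    ))

val numbers = List(1, 2, 3).xml.render
val book    = Book("Mrs. Dalloway", "Virginia Woolf").xml.render
```

In this model, numbers contains Int child nodes inside a List node, while book contains title and author nodes inside a Book node.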

The node name may be controlled, however, using annotations. The @xmlLabel annotation may be applied to a case class or a case class field to change its name when written or read.

For example, the definition,

@xmlLabel(t"book")
case class Book(title: Text, author: Text, @xmlLabel(t"type") kind: Text)

would ensure that a Book instance is written using the lower-case tag name, book, and that the kind field is serialized with the name type (which cannot be used so easily in Scala, as it's a keyword).

Serialization

XML usually needs to be serialized to a string. Xylophone provides a show method that will serialize an Xml value to a Text value using a contextual XmlPrinter, of which two are available by default: one which omits all unnecessary whitespace, and one which "pretty prints" the XML with indentation for nesting.
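
The choice between the two printers is made through the contextual value in scope. The following sketch shows the general shape of that design with hypothetical names throughout (Tree, Printer, compact, pretty); it does not reproduce Xylophone's XmlPrinter, and it returns a String rather than a Text.

```scala
// Hypothetical model of contextual printers (not Xylophone's XmlPrinter).
enum Tree:
  case Elem(label: String, children: List[Tree])
  case Text(content: String)

trait Printer:
  def render(tree: Tree): String

// Omits all unnecessary whitespace
val compact: Printer = new Printer:
  def render(tree: Tree): String = tree match
    case Tree.Text(content)         => content
    case Tree.Elem(label, children) => s"<$label>${children.map(render).mkString}</$label>"

// Indents nested elements by two spaces per level
val pretty: Printer = new Printer:
  private def go(tree: Tree, depth: Int): String =
    val pad = "  " * depth
    tree match
      case Tree.Text(content) => pad + content
      // An element holding only text stays on one line
      case Tree.Elem(label, List(Tree.Text(content))) => s"$pad<$label>$content</$label>"
      case Tree.Elem(label, children) =>
        children.map(go(_, depth + 1)).mkString(s"$pad<$label>\n", "\n", s"\n$pad</$label>")
  def render(tree: Tree): String = go(tree, 0)

// The printer is picked up from the context, or passed explicitly with `using`
extension (tree: Tree) def show(using printer: Printer): String = printer.render(tree)

val book = Tree.Elem("book", List(Tree.Elem("title", List(Tree.Text("Mrs. Dalloway")))))

val oneLine  = book.show(using compact)
val indented = book.show(using pretty)
```

Declaring given Printer = pretty once in the enclosing scope would let every show call in that scope use the indented form without naming the printer.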

Status

Xylophone is classified as fledgling. For reference, Soundness projects are categorized into one of the following five stability levels:

  • embryonic: for experimental or demonstrative purposes only, without any guarantees of longevity
  • fledgling: of proven utility, seeking contributions, but liable to significant redesigns
  • maturescent: major design decisions broadly settled, seeking probatory adoption and refinement
  • dependable: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version 1.0.0 or later
  • adamantine: proven, reliable and production-ready, with no further breaking changes ever anticipated

Projects at any stability level, even embryonic projects, can still be used, as long as caution is taken to avoid a mismatch between the project's stability level and the required stability and maintainability of your own project.

Xylophone is designed to be small. Its entire source code currently consists of 658 lines of code.

Building

Xylophone will ultimately be built by Fury, when it is published. In the meantime, two possibilities are offered; however, they are acknowledged to be fragile, inadequately tested, and unsuitable for anything more than experimentation. They are provided only out of the necessity of giving some answer to the question, "how can I try Xylophone?".

  1. Copy the sources into your own project

    Read the fury file in the repository root to understand Xylophone's build structure, dependencies and source location; the file format should be short and quite intuitive. Copy the sources into a source directory in your own project, then repeat (recursively) for each of the dependencies.

    The sources are compiled against the latest nightly release of Scala 3. There should be no problem compiling the project together with all of its dependencies in a single compilation.

  2. Build with Wrath

    Wrath is a bootstrapping script for building Xylophone and other projects in the absence of a fully-featured build tool. It is designed to read the fury file in the project directory, and produce a collection of JAR files which can be added to a classpath, by compiling the project and all of its dependencies, including the Scala compiler itself.

    Download the latest version of wrath, make it executable, and add it to your path, for example by copying it to /usr/local/bin/.

    Clone this repository inside an empty directory, so that the build can safely make clones of repositories it depends on as peers of xylophone. Run wrath -F in the repository root. This will download and compile the latest version of Scala, as well as all of Xylophone's dependencies.

    If the build was successful, the compiled JAR files can be found in the .wrath/dist directory.

Contributing

Contributors to Xylophone are welcome and encouraged. New contributors may like to look for issues marked beginner.

We suggest that all contributors read the Contributing Guide to make the process of contributing to Xylophone easier.

Please do not contact project maintainers privately with questions unless there is a good reason to keep them private. While it can be tempting to respond to such questions, private answers cannot be shared with a wider audience, and it can result in duplication of effort.

Author

Xylophone was designed and developed by Jon Pretty, and commercial support and training on all aspects of Scala 3 is available from Propensive OÜ.

Name

A xylophone is a musical instrument made from wood ("xylo-") or trees, and it provides a representation of XML trees. "Xylophone" and "XML" begin with the same infrequently-used letter.

In general, Soundness project names are always chosen with some rationale; however, it is usually frivolous. Each name is chosen more for its uniqueness and intrigue than its concision or catchiness, and there is no bias towards names with positive or "nice" meanings, since many of the libraries perform some quite unpleasant tasks.

Names should be English words, though many are obscure or archaic, and it should be noted how willingly English adopts foreign words. Names are generally of Greek or Latin origin, and have often arrived in English via a romance language.

Logo

The logo shows two angle brackets (or chevrons), representing the most significant symbols in XML, placed next to each other to look like a capital X.

License

Xylophone is copyright © 2024 Jon Pretty & Propensive OÜ, and is made available under the Apache 2.0 License.