Skip to content

Latest commit

 

History

History
362 lines (244 loc) · 20.4 KB

README.adoc

File metadata and controls

362 lines (244 loc) · 20.4 KB

mzTab 2.0 for Metabolomics Reader, Writer and Validator

Java CI with Maven Maven Central Latest Release DOI BioContainers Install with Conda

Note
If you use the jmzTab-M library or the web application, please cite the following paper:
N. Hoffmann et al., Analytical Chemistry 2019; Oct 15;91(20):12615-12618.. PubMed record.

This project is the reference reader, writer and validator implementation for mzTab for metabolomics 2.0+, based on the jmztab project.

mzTab-M is intended as a reporting standard for quantitative results from metabolomics/lipodomics approaches. This format is further intended to provide local LIMS systems as well as MS metabolomics repositories a simple way to share and combine basic information.

mzTab-M has been developed with a view to support the following general tasks:

  1. Facilitate the sharing of final experimental results, especially with researchers outside the field of metabolomics.

  2. Export of results to external software, including programs such as Microsoft Excel® and Open Office Spreadsheet and statistical software / coding languages such as R.

  3. Act as an output format of (web-) services that report MS-based results and thus can produce standardized result pages.

  4. Be able to link to the external experimental evidence e.g. by referencing back to mzML files.


  • The Maven site with JavaDoc is available here.

  • The latest aggregated, browseable JavaDoc is available here.

  • The latest version of the mzTab spec is available here.

  • The web-based validator application is available here.

  • The Swagger / OpenAPI object and operations model definition file is here.

  • The controlled vocabulary (CV) mapping file bundled with jmzTab-M is here.

  • Information on validation messages is available in Validation message templates and IDs.


Building the project and generating client code from the command-line

In order to build the client code and run the unit tests, execute the following command from a terminal:

./mvnw install

This generates the necessary domain specific code for a number of different languages, e.g. for Python, JAVA and R. In general, any language and framework for which swagger-codegen has templates available may be used. For details, please look at the api/pom.xml file and at the swagger-codegen repository and documentation.

The generated code is available in the zip archives below the api/target directory:

  1. mztab-api-python GitHub project

  2. mztab-api-r GitHub project

The generated JAVA code is available in a jar archive below the api/target directory:

  1. jmztabm-api-<VERSION>.jar

The generated client libraries only contain the basic domain object code. Parsing from mzTab and writing to mzTab still need to be implemented separately. The library provides this functionality for JAVA in the io sub-project.

Running a validation with the command-line interface

The cli sub-project provides a command line interface for validation of files using the mzTab 2.0 draft validator.

Note
Currently, we do not plan to provide a CLI-based validator for mzTab 1.0, since that is still being provided by the original jmzTab project.

After building the project as mentioned above with ./mvnw install, the cli/target folder will contain the cli-<version>-jmztabm-deployment.zip file. Alternatively, you can download the latest cli zip file from Maven central: Search for latest cli artefact and click on "bin.zip" to download.

In order to run the validator, unzip that file, change into the unzipped folder and run

java -jar jmztabm-cli-<VERSION>.jar

to see the available options.

To validate a particular file with a given level (one of: Info, Warn, Error, where Info includes Warn and Error), run

java -jar jmztabm-cli-<VERSION>.jar -c examples/MTBLS263.mztab -level Info

To run a semantic validation based on a controlled vocabulary term mapping file, run

java -jar jmztabm-cli-<VERSION>.jar -c examples/MTBLS263.mztab -level Info -s cv-mapping/mzTab-M-mapping.xml

Converting to JSON

If you want to exchange your mzTab-M model in JSON format, you can transcode your mzTab-M TSV file into JSON as follows:

java -jar jmztabm-cli-<VERSION>.jar -c examples/gcxgc-ms-example.mztab -toJson -s cv-mapping/mzTab-M-mapping.xml

This will create a new file on successful validation, named <inFile>.json (inFile without any paths) that contains your mzTab-M data serialized as JSON.

Note
Comments are retained within the comments part of the JSON document but not at their original locations.

Converting from JSON

If you only have a JSON model of your mzTab-M file available, you can convert it using the command line as follows:

java -jar jmztabm-cli-<VERSION>.jar -c examples/gcxgc-ms-example.mztab.json -fromJson -s cv-mapping/mzTab-M-mapping.xml

This will create a new file on successful validation, named <inFile>.mztab (inFile without any paths) that contains your mzTab-M data serialized in TSV format.

Note
Comment objects are currently discarded by the mzTab-M TSV serializer.

Running the Web Application for Validation

The validator web application code has been moved into a separate project: https://github.com/lifs-tools/jmzTab-m-webapp The application is available at: https://apps.lifs-tools.org/mztabvalidator

Using the Bioconda Package

jmztab-m is available from https://anaconda.org/bioconda/jmztab-m This also includes an automatically built Biocontainers Docker image that is available from https://biocontainers.pro/#/tools/jmztab-m

Alternatively, you can build and run a custom Docker image with the instructions in the following sections.

Running the Bioconda Docker Container

The Bioconda Docker container works slightly different from the one we provide instructions for further down. First, go to the quai.io page (user account required) to check for available tags: https://quay.io/repository/biocontainers/jmztab-m?tab=tags Then, run jmztab-m (the cli) as follows, mounting the local directory to '/home/data' within the container, with read and write permissions. The rest of the arguments follow those of the regular jmztab-m CLI (this example converts an input mzTab-M file into json, stores validation output in the file 'validation.txt', and applies semantic validation following the rules described in a custom mapping file):

docker run -v "${PWD}":/home/data:rw quay.io/biocontainers/jmztab-m:1.0.6--0 jmztab-m -c "/home/data/$i" -s --toJson -o "/home/data/validation.txt" -s mzTab_2_0-M_summary-mapping.xml

Building the Plain Docker Image

In order to build a Docker image of the command line interface application, run

./mvnw -Pdocker install

from your commandline (mvnw.bat on Windows). This will build and tag a Docker image lifs/jmztabm-cli with a corresponding version and make it available to your local Docker installation. To show the coordinates of the image, call

docker image ls | grep "jmztabm-cli"

Running the Plain Docker Image

If you have not done so, please build the Docker image of the validator cli or pull it from the docker hub (see previous sections). Then, run the following command, replacing <VERSION> with the current version, e.g. 1.0.6) and <DATA_DIR> with the local directory containing your mzTab-M files:

docker run -v <YOUR_DATA_DIR>:/home/data:rw lifs/jmztabm-cli:<VERSION>

This will only invoke the default entrypoint of the container, which is a shell script wrapper calling the jmztab-m-cli Jar. It passes all arguments to the validator, so that all arguments that you would pass normally will work in the same way (please replace <YOUR_MZTABM_FILE> with the actual file’s name in <YOUR_DATA_DIR>:

docker run -v <YOUR_DATA_DIR>:/home/data:rw lifs/jmztabm-cli:<VERSION> -c <YOUR_MZTABM_FILE>

Using the project code releases via Maven Central

Note
As of version 1.0.7, the groupId has changed from de.isas.mztab to org.lifs-tools to reflect the project’s GitHub organization.

The library release artifacts are available from Maven Central. If you want to use them, add the following lines to your own Maven pom file:

To use the IO libraries (reading, writing and structural and logical validation) in your own Maven projects, use the following dependency:

<dependency>
  <groupId>org.lifs-tools</groupId>
  <artifactId>jmztabm-io</artifactId>
  <version>${jmztabm.version}</version>
</dependency>

To use the semantic validation with the mapping file in your own Maven project, use the following dependency:

<dependency>
  <groupId>org.lifs-tools</groupId>
  <artifactId>jmztabm-validation</artifactId>
  <version>${jmztabm.version}</version>
</dependency>

where jmztab.version is the version of jmztabm you wish to use, e.g. for a release version:

<properties>
  <jmztabm.version>1.0.7</jmztabm.version>
</properties>

as defined in the properties section of your pom file.

Using development snapshots

The library development artifacts are available as SNAPSHOT (development versions) from Sonatype’s OSSRH repository. If you want to use them, add the following lines to your own Maven pom file:

<repositories>
  <repository>
    <name>Sonatype Snapshot Repository</name>
    <id>oss-sonatype-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
 ...
</repositories>

The project coordinates for the api module are

<dependency>
  <groupId>org.lifs-tools</groupId>
  <artifactId>jmztabm-api</artifactId>
  <version>${jmztabm.version}</version>
  <type>jar</type>
</dependency>

and

<dependency>
  <groupId>org.lifs-tools</groupId>
  <artifactId>jmztabm-io</artifactId>
  <version>${jmztabm.version}</version>
  <type>jar</type>
</dependency>

for the io module, where jmztab.version is the version of jmztabm you wish to use, e.g. for a SNAPSHOT version:

<properties>
  <jmztabm.version>1.0.7-SNAPSHOT</jmztabm.version>
</properties>

as defined in the properties section of your pom file.

Using the API programmatically

Reading mzTab 2.0 with structural and logical validation

Note
As of version 1.0.7, the package structure has been adapted to the new groupId. In your imports, please replace de.isas with org.lifstools

The following snippet will parse an mzTabFile from a file on disk:

import uk.ac.ebi.pride.jmztab2.*;
import uk.ac.ebi.pride.jmztab2.utils.*;
import uk.ac.ebi.pride.jmztab2.utils.errors.*;
import org.lifstools.mztab2.io.*;
import org.lifstools.mztab2.model.*;
...
File mzTabFile = new File("/path/to/my/file.mztab");
MzTabFileParser parser = new MzTabFileParser(mzTabFile);
//will report a maxmimum of 500 errors on Error, Warn and Info levels
//will output errors to System.err (onto your terminal)
parser.parse(System.err, MZTabErrorType.Level.Info, 500);
//inspect the output of the parse and errors
MZTabErrorList errors = parser.getErrorList();
//converting the MZTabErrorList into a list of ValidationMessage
List<ValidationMessage> messages = errors.convertToValidationMessages();
//access the file after parsing
MzTab mzTab = parser.getMZTabFile();

Creating an mzTab 2.0 object model

The mzTab domain model uses a builder pattern, but also conforms to the usual JAVA bean style pattern. The builder pattern allows for a more fluent definition of your object structure. However, especially for cross references with the file, you will need to define e.g. MsRun objects separately since inline referencing within the builder code will not work.

The following code will create the first parts of an mzTab-M file programmatically:

import org.lifstools.mztab2.model.*;
...
MzTab mztab = new MzTab();
Metadata mtd = new Metadata();
mtd.mzTabVersion("2.0.0-M");
mtd.mzTabID("1");
mtd.addSoftwareItem(new Software().id(1).
    parameter(new Parameter().id(1).
        name("LipidDataAnalyzer").
        value("2.6.3_nightly")));
MsRun msrun1 = new MsRun().id(1).
    location(
        "file://D:/Experiment1/Orbitrap_CID/negative/50/014_Ex1_Orbitrap_CID_neg_50.chrom");
mtd.addMsRunItem(msrun1);
Assay a1 = new Assay().id(1).
    addMsRunRefItem(msrun1);
Assay a2 = new Assay().id(2).
    addMsRunRefItem(msrun2);
mtd.addAssayItem(a1).addAssayItem(a2);
...

Adding optional columns

Writing mzTab 2.0 with validation

The following code writes an mzTab object structure to the provided file path, performing structural and logical validation:

MzTabValidatingWriter writer = new MzTabValidatingWriter();
File f = File.createTempFile(UUID.randomUUID().toString(), ".mztab");
Optional<List<ValidationMessage>> messages = writer.write(f.toPath(), mzTab);

You can also pass an OutputStreamWriter to the write method.

Writing mzTab 2.0 without validation

The following code writes an mzTab object structure to the provided output stream without any validation (use at your own risk):

MzTabNonValidatingWriter writer = new MzTabNonValidatingWriter();
try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
    try (OutputStreamWriter osw = new OutputStreamWriter(
        baos, Charset.forName("UTF8"))) {
        writer.write(osw, mzTab);
        osw.flush();
 String mzTabFileAsAString = osw.toString();
    }
}

Alternatively, you can also provide a File path to the write method.

Exploring the test suite

The use-cases that were described in the previous sections are also covered in the unit tests. Particularly, the following classes are of interest:

Validation message templates and IDs

The reference implementation uses message templates and IDs to uniquely identify each validation message. The catalogs of validation messages have been adapted and substantially extended from the previous reference implementation. The message catalogs can be found at the following locations:

Editing the Swagger Spec

This project defines the structure of an mzTab document based on JSON-Schema and Swagger https://swagger.io/.

Swagger provides many templates to generate client / server implementations based on a Swagger yaml or json definition.

This mechanism can be used to generate the domain-specific model classes in any of the supported languages, omitting the web-specific parts.

The Swagger editor can be used to import the file, edit it with assistance and preview, and export it after editing. It additionally supports the generation of server and client code to represent the mzTab object structure.

To launch the editor via Docker on Unix, use the script run-swagger-editor.sh in this directory.

The swagger API definition is in the following file: api/src/main/resources/mzTab_m_swagger.yml.

You can open it in the Swagger Editor via File → Import File. If you are done editing, go to File → Download YAML and save the file at the location of the mzTab_m_swagger.yml file, thereby replacing the original file.

You can create server and client code in a multitude of languages from the Generate Server and Generate Client menu items.

References

This project is the reference implementation for the mzTab-M 2.0 standard:

This project is based on and uses code that was developed for the original jmzTab project: