Skip to content

smooks/smooks-fixed-length-cartridge

Repository files navigation

Smooks Fixed-Length Cartridge

Maven Central Sonatype Nexus (Snapshots) Build Status

A simple/basic fixed-length reader configuration:

smooks-config.xml
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:fl="https://www.smooks.org/xsd/smooks/fixed-length-1.4.xsd">

    <!-- Configure the fixed-length reader to parse the message into a stream of SAX events. -->
    <fl:reader fields="firstname[10],lastname[10],gender[1],age[2],country[2]" skipLines="1" />

</smooks-resource-list>

Example input file:

#HEADER
Tom       Fennelly  M 21 IE
Maurice  Zeijen     M 27 NL

The above configuration will generate an event stream of the form:

<set>
    <record>
        <firstname>Tom       </firstname>
        <lastname>Fennelly  </lastname>
        <gender>M</gender>
        <age> 21</age>
        <country>IE</country>
    </record>
    <record>
        <firstname>Maurice  </firstname>
        <lastname>Zeijen     </lastname>
        <gender>M</gender>
        <age>27</age>
        <country>NL</country>
    </record>
</set>

Defining fields

Fields are defined in the 'fields' attribute as a comma separated list of names and field lengths. The field lengths must be defined between the brackets after the field name (see the example above).

The field names must follow the same naming rules like XML element names:

  • Names can contain letters, numbers, and other characters

  • Names cannot start with a number or punctuation character

  • Names cannot start with the letters xml (or XML, or Xml, etc…​)

  • Names cannot contain spaces

By setting the rootElementName and recordElementName attributes you can modify the and element names. The same naming rules apply for these names.

String manipulation functions

String manipulation functions can be defined per field. These functions are executed before that the data is converted into SAX events. The functions are defined after the field length definitions and are optionally separated with a question mark.

<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:fl="https://www.smooks.org/xsd/smooks/fixed-length-1.4.xsd">

    <!-- Configure the fixed-length reader to parse the message into a stream of SAX events.  -->
    <fl:reader fields="firstname[10]?trim,lastname[10]trim.capitalize,gender[1],age[2],country[2]" skipLines="1" />

</smooks-resource-list>

The following functions are available:

  • upper_case: Returns the upper case version of the string.

  • lower_case: Returns the lower case version of the string.

  • cap_first: Returns the string with the very first word of the string capitalized.

  • uncap_first: Returns the string with the very first word of the string un-capitalized. The opposite to cap_first.

  • capitalize: Returns the string with all words capitalized.

  • trim: Returns the string without leading and trailing white-spaces.

  • left_trim: Returns the string without leading white-spaces.

  • right_trim: Returns the string without trailing white-spaces.

Functions can be chained via the point separator: trim.upper_case. It depends on the reader how the functions are defined per field. Please look at the individual chapters of the readers for that information.

Ignoring Fields

Characters ranges of a fixed-length record can be ignored by specifying the $ignore$[10] token in the fields configuration value. You must specify the number of characters that need be ignored, just as a normal field.

smooks-config.xml
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:fl="https://www.smooks.org/xsd/smooks/fixed-length-1.4.xsd">

    <fl:reader fields="firstname,$ignore$[2],age,$ignore$[10]" />

</smooks-resource-list>

Binding Fixed-Length Records to Java

Smooks v1.2 has added support for making the binding of fixed-length records to Java Objects a very trivial task. You don’t need to use the Javabean Cartridge directly (i.e. Smooks main Java binding functionality).

A Persons fixed-length record set such as:

Tom       Fennelly  M 21 IE
Maurice  Zeijen     M 27 NL

Can be bound to a Person of (no getters/setters):

Person.java
public class Person {
    private String firstname;
    private String lastname;
    private String country;
    private String gender;
    private int age;
}

Using a config of the form:

smooks-config.xml
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:fl="https://www.smooks.org/xsd/smooks/fixed-length-1.4.xsd">

    <fl:reader fields="firstname[10]?trim,lastname[10]?trim,gende[1],age[3]?trim,country[2]">
        <!-- Note how the field names match the property names on the Person class. -->
        <fl:listBinding beanId="people" class="org.smooks.fixedlength.Person" />
    </fl:reader>

</smooks-resource-list>

To execute this configuration:

Smooks smooks = new Smooks(configStream);
JavaSink sink = new JavaSink();

smooks.filterSource(new StreamSource(fixedLengthStream), sink);

List<Person> people = (List<Person>) sink.getBean("people");

Smooks also supports creation of maps from the fixed-length record set:

smooks-config.xml
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:csv="https://www.smooks.org/xsd/smooks/csv-1.7.xsd">

    <csv:reader fields="firstname,lastname,gender,age,country">
        <csv:mapBinding beanId="people" class="org.smooks.csv.Person" keyField="firstname" />
    </csv:reader>

</smooks-resource-list>

The above configuration would produce a map of Person instances, keyed by the "firstname" value of each Person. It would be executed as follows:

Smooks smooks = new Smooks(configStream);
JavaSink sink = new JavaSink();

smooks.filterSource(new StreamSource(fixedLengthStream), sink);

Map<String, Person> people = (Map<String, Person>) sink.getBean("people");

Person tom = people.get("Tom");
Person mike = people.get("Maurice");

Virtual Models are also supported, so you can define the class attribute as a java.util.Map and have the fixed-length field values bound into Map instances, which are in turn added to a list or a map.

Java API

Programmatically configuring the FixedLengthReader on a Smooks instance is trivial. A number of options are available.

Configuring Directly on the Smooks Instance

The following code configures a Smooks instance with a FixedLengthReader for reading a people record set (see above), binding the record set into a List of Person instances:

Smooks smooks = new Smooks();

smooks.setReaderConfig(new FixedLengthReaderConfigurator("firstname,lastname,gender,age,country")
                  .setBinding(new FixedLengthBinding("people", Person.class, CSVBindingType.LIST)));

JavaSink sink = new JavaSink();
smooks.filterSource(new ReaderSource(csvReader), sink);

List<Person> people = (List<Person>) sink.getBean("people");

Of course configuring the Java Binding is totally optional. The Smooks instance could instead (or in conjunction with) be programmatically configured with other visitor for carrying out other forms of processing on the fixed-length record set.

Fixed-Length List & Map Binders

If you’re just interested in binding fixed-length records directly onto a List or Map of a Java type that reflects the data in your fixed-length records, then you can use the FixedLengthListBinder or FixedLengthMapBinder classes.

FixedLengthListBinder:

// Note: The binder instance should be cached and reused...
FixedLengthListBinder binder = new FixedLengthListBinder("firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]", Person.class);

List<Person> people = binder.bind(fixedLengthStream);

FixedLengthMapBinder:

// Note: The binder instance should be cached and reused...
FixedLengthMapBinder binder = new FixedLengthMapBinder("firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]", Person.class, "firstname");

Map<String, Person> people = binder.bind(fixedLengthStream);

If you need more control over the binding process, revert back to the lower level APIs:

Maven Coordinates

pom.xml
<dependency>
    <groupId>org.smooks.cartridges</groupId>
    <artifactId>smooks-fixed-length-cartridge</artifactId>
    <version>2.0.1</version>
</dependency>

XML Namespace

xmlns:fl="https://www.smooks.org/xsd/smooks/fixed-length-1.4.xsd"

License

Smooks Fixed-Length Cartridge is open source and licensed under the terms of the Apache License Version 2.0, or the GNU Lesser General Public License version 3.0 or later. You may use Smooks Fixed-Length Cartridge according to either of these licenses as is most appropriate for your project.

SPDX-License-Identifier: Apache-2.0 OR LGPL-3.0-or-later