With autorex
, you can convert an automaton back into its regular expression
string representation. The translation is based on the state-elimination
algorithm which is neatly described in Sipser, Michael. Introduction to the
Theory of Computation. Vol. 2. Boston: Thomson Course Technology, 2006,
Chapter 1 (Starting from page 70). The implementation is based on the
dk.brics automaton package.
With dk.brics
, you can apply automata operations such as concatenation,
union, intersection, etc. on input automata to derive an output
automaton. Once a regular expression string has been converted into the
dk.brics
automaton representation, dk.brics
does not offer the functionality
to convert it back to a
regular expression String. This package provides tooling that converts
any dk.brics
output automaton back to its regular expression string representation.
autorex
is available on maven central. You can integrate it by using
the following dependency in the pom.xml
file. Note, that the maven releases
do not necessarily contain the newest changes that are available in the
repository. The maven releases are kept in sync with the tagged
releases. The API
documentation for every release is available.
here. However,
the content of this documentation, in particular the code examples and usage
scenarios, is always aligned with the master branch of this repository. Hence,
it might be that the latest autorex
features are not yet available through
the maven package.
<dependency>
<groupId>com.github.julianthome</groupId>
<artifactId>autorex</artifactId>
<version>1.0</version>
</dependency>
autorex
has a very simple API for state elimination
provided by the class autorex
. The following example gives an intuition how
autorex
can be used. After creating some dk.brics
automata, and after
performing a couple of operations on them, the result is stored in the d
automaton. Afterwards, we can simply invoke the getRegexFromAutomaton
method
on d
in order to get the corresponding regular expression string.
Automaton a = new RegExp("(abc)+[0-9]{1,3}[dg]*").toAutomaton();
Automaton b = new RegExp("12345678").toAutomaton();
Automaton c = new RegExp(".{0,5}").toAutomaton();
Automaton d = a.union(b).intersection(c); // output automaton
String s0 = Autorex.getRegexFromAutomaton(d); // obtain regular expression String
System.out.println("Regex String: " + s0);
autorex
will give you the following regular expression string which is
equivalent to the result of the computation (the automaton d
) in the example
above.
(((abc[0-9])([0-9])))|((abc[0-9])((g|d)|.{0}))
autorex
can also be used in order to transform a given automaton. At
the moment, we support the following transformation from a source automaton to
a target automaton:
- camel-case automatons that are case insensitive
- substring automatons that accept all the substring of the source automaton
- suffix automatons that accepts all the suffixes of a given source automaton
The following code snippet shows how the transformation API can be used:
String s = "hello my name is Alice";
Automaton a = new RegExp(s).toAutomaton();
Automaton ccas = Autorex.getCamelCaseAutomaton(a);
Automaton substr = Autorex.getSubstringAutomaton(a);
Automaton sfx = Autorex.getSuffixAutomaton(a);
For more examples, please have a look at the provided test cases or at the javadoc
documentation of the class autorex
.
The MIT License (MIT)
Copyright (c) 2016 Julian Thome [email protected]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.