Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String formulae #250

Merged
merged 59 commits into from
Nov 27, 2021
Merged

String formulae #250

merged 59 commits into from
Nov 27, 2021

Conversation

serras
Copy link
Contributor

@serras serras commented Aug 24, 2021

This PR adds initial support for strings, the only available operations are creation of constants and equality. A concrete implementation of the theory for Z3 is also provided.

Note: there are some attempts at implementing the theory for SMTInterpol. However, this comment suggests that the string theory is not really supported. For that reason, those changes have been reverted (but you may want to squash everything).

Copy link
Member

@kfriedberger kfriedberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this PR. Overall it is clean and simple.
It would be nice to merge after fixing a few small points:

  • the set of supported methods from String theory. Maybe we should add some more operations, like concat, length, and lessThan/greaterThan.
  • missing tests.

+ Native.sortToString(z3context, pSort)
+ " with sort "
+ sortKind);
if (Native.isStringSort(z3context, pSort)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an enum value for Z3_STRING_SORT? I could only find Z3_RE_SORT, is this the same sort and would match here, too?

@serras
Copy link
Contributor Author

serras commented Oct 29, 2021

@kfriedberger I've updated the PR with your suggestions (both in code and more tests), and I've gone all the way to also introduce regular expressions.

Please always apply 'ant format-diff' before committing to avoid such changes later.
There are several operations on Strings and Regexs
and the number of JUnit tests might increase.
…arameters.

This fulfills our Checkstyle criteria.
- more documentation
- reduced visibility,
- move solver-independent code into abstract superclass
CVC4 seems to support the full range of String and Regex operations, so JavaSMT can provide them.

We should add some more tests for String theory.
@kfriedberger
Copy link
Member

kfriedberger commented Nov 14, 2021

Current status of this PR:

  • Z3 and CVC4 support a sufficient range of features in String and Regex theory.

The following parts are still missing or incomplete or can be improved:

  • a proper definition of the method allChar:
    It is unclear whether SMT solvers support the full range of Unicode or just Ascii, as defined in SMTLIB2 (maybe with more escaped characters?). For the current SMT solvers we have the following situation:

    • Z3 supports Unicode ranges, but returns unknown when solving with full range.
    • CVC4 seems to only accept Ascii, and crashes with full Unicode range.

    I would be in favour of removing the method allChar completely. We already provide a way to specify ranges of chars via the API. This should be sufficient.

  • model evaluation: currently we have not yet any support and tests for model evaluation.

  • more tests, e.g. for String comparison via LT/LEQ/GT/GEQ, for Regex operations like OPTIONAL, etc.

  • optional: a way to translate a real Java regex into SMTLIB2 regex. Currently all regex formulas are build from basic parts from scratch.

  • optional: a nice example, maybe encode a regex crossword puzzle with JavaSMT :-)

  • conversion between Int and String theory.

  • more methods for String theory, e.g., replace, charAt, indexOf, contains.

@serras
Copy link
Contributor Author

serras commented Nov 17, 2021

I've removed allChar and added conversions from regular expression classes (Pattern and Matcher).

I don't know very well how to check the current model evaluation. As far as I understand, it should work on Z3, but I don't know how to add more tests in the corresponding file.

As a third question: what do you mean as conversion between the Int and String theories?

@kfriedberger
Copy link
Member

Hi @serras

  1. The new API methods for Java regex might not work as expected. They will just return a String that is matches "literally", but not "structurally". If implemented, this feature might need a way to reconstruct a regex based on basic elements from SMTLIB.
    I did not list the feature for now, but maybe as a future idea (sorry, should have added this as comment :-) ). This (if missing in the implementation) does not block the PR from merging.

  2. There seem to be methods in Z3 and CVC4 to convert an expression from Int theory to String theory and also back, if the String consists of digits). These operations should be similar to the existing conversion between Int to Bitvector theory. For example, this could allow a query like (= (itos (+ 1 2) (charAt "345" 0))).

Best.

baierd and others added 22 commits November 19, 2021 20:25
and simplification,
and formatting,
and fix JavaDoc.
Unicode with non-ascii chars remains an open problem for now.
…ern.

The only way to build up a regex is directly in JavaSMT from basic blocks.
The support to convert a real Java regex to SMTLIB regex is missing (this is no simple step!).
We should think about updating Z3,
because the new version 4.8.13 seems to provide several bugfixes.
See sosy-lab#249 for details.
@kfriedberger
Copy link
Member

With the support for formula visitation, this PR looks sufficient for a merge.
I will review the changes again in the afternoon and plan to merge then.

- extract  common constant.
- simplify replaceAll-test, cubic runtime in a test is too much :-).
…uantifiers.

There seems to be a bug in CVC4 that weakens its reasoning,
such that UNKNOWN/INCOMPLETE is returned from the SAT check.
As CVC4 is no longer actively maintained (there is already CVC5),
disabling the tests and having a basic support for String theory
is a quick and reasonable step.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants