-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: create a type of generator supporting recursive values #64
Comments
Just following up on this. Does it seem like the sort of thing that would be interesting? |
Can you provide some examples (in Java) of what you have in mind? |
I'm wanting to generate tree-like data-structures, and thinking there's no easy way to do so (I hope this isn't based on a false assumption). So for example, any of the following structures:
The rough idea would be an API something like:
What this would give you is a tree-like structure, but without having to manually construct the tree from some simpler data, or manually manage how large/small the tree is (though an API should probably give you the option to pass min/max sizes). The case where a Node can have two children does run into a little difficulty with type inference:
However, since posting my earlier comments, I've realized that I'm not sure how to handle an object that requires a collection of its own type. What's below typechecks, but doesn't make sense: you really want the implementation of recurse to somehow control the size of the lists returned.
|
I hope I can shine some light on the subject of recursive data structures. Usefulness of recursive data structuresTo answer the original question - Yes, I think being able to generate recursive data structures can be really useful (or essential), depending on the thing you want to test. Some examples are mathematical expressions, div-elements in HTML, program code AST, arbitrary JSON-data, graphs (recursive trees with some extra edges added to them) and so on. Size of recursive data structuresRegarding the size of the recursive data structure, how about not specifying the structure size in absolute terms, but just in likely-hood of actually terminating the structure? A key point of Property Based Testing is to let the generated example to surprise you, right :) Let say you have a type Tree, that can either be a terminating Leaf or a BinaryNode. Further let the BinaryNode contain two sub-trees of type Tree (the recursiveness). If there is equal 50-50 chance of a Tree being either a Leaf or a BinaryNode some generated trees will never terminate, since at least one of the BinaryNode sub-trees is yet another BinaryNode:
If we instead let there be a slight overweight of e.g. 51% chance of a tree being a Leaf, the Tree will probability-wise eventually terminate. So, if my math is in order, with the 50% change of each of the sub-trees in a BinaryNode to be yet a BinaryNode, the probability of a BinaryNode to have depth >= 1000 is:
With only a 49% chance of a Tree to be a BinaryNode, probability of a BonaryNode of depth >= 1000 is insignificant:
Termination problem solved! :-D So how do you do recursive data structures?Recursive data structures are already possible to construct with the current versions of QuickTheories, but the QuickTheories API can surely be improved. The hard part is to be able to reference a generator from itself, without causing a stack overflow. In the following example, some indirection-wizardry is used to break the recursive stack calls: import java.util.function.BiFunction;
import java.util.function.Supplier;
import org.junit.Ignore;
import org.junit.Test;
import org.quicktheories.WithQuickTheories;
import org.quicktheories.core.Gen;
import org.quicktheories.core.RandomnessSource;
public class RecursiveMathExpressionTest implements WithQuickTheories {
// --- Classes under test ---
/** A mathematical expression */
static interface Expression {
double calculate();
}
static class Litteral implements Expression {
private long value;
Litteral(long value) {
this.value = value;
}
@Override
public double calculate() {
return value;
}
@Override
public String toString() {
return "" + value;
}
}
static enum Operator {
Add("+", (a, b) -> a + b), Sub("-", (a, b) -> a - b), Mult("*", (a, b) -> a * b), Div("/", (a, b) -> a / b);
public final BiFunction<Double, Double, Double> fn;
public final String symbol;
Operator(String symbol, BiFunction<Double, Double, Double> fn) {
this.symbol = symbol;
this.fn = fn;
}
}
static class BinaryExpression implements Expression {
private Operator op;
private Expression left;
private Expression right;
BinaryExpression(Operator op, Expression left, Expression right) {
this.op = op;
this.left = left;
this.right = right;
}
@Override
public double calculate() {
return op.fn.apply(left.calculate(), right.calculate());
}
@Override
public String toString() {
return "(" + left + " " + op.symbol + " " + right + ")";
}
}
// --- Generators ---
public Gen<Expression> litterals() {
return longs()
.all()
.map(Litteral::new);
}
public Gen<Expression> operators(Gen<Expression> subExprGen) {
return arbitrary()
.enumValues(Operator.class)
.zip(subExprGen, subExprGen, BinaryExpression::new);
}
public Gen<Expression> stackOverflowExpressions() {
// Will blow the stack when recursively evaluating
// stackOverflowExpressions().
return litterals()
.mix(operators(stackOverflowExpressions()), 49);
}
// This class member acts as a place holder for
// a (soon to be) existing generator.
// A question: Is it sound practice to reuse the same generator instance in
// multiple tests, or should a new generator instance be used in each test?
Gen<Expression> expressionsPlaceHolder = new LazyGen<>(
() -> expressionsBuilder());
private Gen<Expression> expressionsBuilder() {
// No referral to itself. Only delegates to the place holder.
return litterals()
.mix(operators(expressionsPlaceHolder), 49);
}
/** The class to bake into the QuickTheories API */
private class LazyGen<T> implements Gen<T> {
private Supplier<Gen<T>> genereratorSupplier;
LazyGen(Supplier<Gen<T>> generatorSupplier) {
this.genereratorSupplier = generatorSupplier;
}
@Override
public T generate(RandomnessSource in) {
// Get the generator
Gen<T> generator = genereratorSupplier.get();
// Re-define the supplier to only return the calculated
// generator next time called.
genereratorSupplier = () -> generator;
// Use the inner generator
return generator.generate(in);
}
}
// --- The tests ---
@Test
/**
* Will successfully generate recursive values, but for some reason never fail
* the check. Should fail on e.g. the expression (1 / 0).
*/
public void allValuesAreFinite_Successfull() {
qt()
.forAll(expressionsPlaceHolder)
.check(expression -> Double.isFinite(expression.calculate()));
}
@Test
@Ignore
/** Retrieving the generator will cause stack overflow */
public void allValuesAreFinite_StackOverflow() {
qt()
.forAll(stackOverflowExpressions())
.check(expression -> Double.isFinite(expression.calculate()));
}
} |
See pull request #65 |
@hyperpape @jockbert Thanks for this, sorry I've not been very responsive. Yes, it would be good to get this into the api. I'm afraid I don't have time to properly look at the PR just now, but from a quick scan the lazy initialization of the supplier is making me slightly nervous. This may (or may not) have implications for the sharing and re-use of generators. |
@hcoles Noted. Would perhaps be good if someone else tried to poke holes in the suggested solution. @hyperpape You talked about being able to define the size of the generated structure. If it would help the API user, a possible added version can provide an integer (starting at value zero or one) as recursive depth level indicator in the <T> Gen<T> recursive(BiFunction<Gen<T>,Integer,Gen<T>> recursiveGeneratorDefinition) However, here is an example on how current suggestion can be used for the earlier mentioned package com.example;
import java.util.Arrays;
import java.util.List;
import org.junit.Test;
import org.quicktheories.WithQuickTheories;
import org.quicktheories.core.Gen;
public class RecursiveParentExample implements WithQuickTheories {
/** Class under test */
static class Parent {
String name;
List<Parent> children;
private Parent(List<Parent> children, String name) {
this.name = name;
this.children = children;
}
@Override
public String toString() {
return toStringWithIndent(0);
}
/** To string helper method, for nicer print outs */
private String toStringWithIndent(int level) {
String result = "";
while (result.length() < level) {
result += "-";
}
result += " " + name;
for (Parent child : children) {
result += "\n" + child.toStringWithIndent(level + 1);
}
return result;
}
}
/** Generator of empty lists */
Gen<List<Parent>> noChildren() {
return arbitrary().constant(Arrays.<Parent>asList());
}
/** Some different names */
Gen<String> names() {
return arbitrary().pick("Adam", "Bertha", "Caesar", "Dorothy", "Edgar", "Filippa");
}
/** Recursive parent generator. */
Gen<Parent> parents() {
return arbitrary().recursive(children -> {
Gen<List<Parent>> childLists =
// Average branching factor is 3 for an evenly distributed range [1,5].
lists().of(children).ofSizeBetween(1, 5)
// ...therefore we must terminate 68% > 2/3 of the times
// in order to to have a slight termination overweight.
// Branch factor 3 multiplied with probability 0.32 equals 0.96 < 1.00
.mix(noChildren(), 68);
return childLists.zip(names(), Parent::new);
});
}
@Test
public void thereAreNoParentsNamedCeasarWithThreeChildren() throws Exception {
qt()
.forAll(parents())
.check(tree -> !hasNameAndChildCount("Caesar", 3, tree));
}
boolean hasNameAndChildCount(String name, int childCount, Parent root) {
boolean isRootAMatch = root.name.equals(name) && root.children.size() == childCount;
return isRootAMatch || root.children.stream()
.anyMatch(child -> hasNameAndChildCount(name, childCount, child));
}
} |
I'll try to have a look, but it might be a few days. |
@hcoles I looked at the lazy initialization. If I am reading it correctly, it's not thread-safe (though that's fixable), but otherwise shouldn't have any visible effects on the behavior of the generator. |
Currently, there's no built in support for recursive data types. I'm working on something that would benefit from it. Would such a thing be of interest?
If so, I might take a shot at an implementation. I haven't started thought much about the implementation, but I looked at the implementation in Hypothesis (https://hypothesis.works/articles/recursive-data/, https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/src/hypothesis/searchstrategy/recursive.py), and I think a similar interface might work.
The text was updated successfully, but these errors were encountered: