README

================
 Hypothesis
================

Hypothesis is a library for falsifying its namesake. It is inspired by libraries
for testing like `Quickcheck <http://en.wikipedia.org/wiki/QuickCheck>`_, with
its most direct ancestor being `ScalaCheck <https://github.com/rickynils/scalacheck>`_
from which it acquired its approach to test case minimization. It is not itself a
test framework, but it provides decorators for easy integration with test frameworks

The primary entry point into the library is the hypothesis.falsify method.

What does it do?

You give it a predicate and a specification for how to generate arguments to test
that predicate and it gives you a counterexample.

-----------------
 Basic examples
-----------------

.. code:: python

    In [1]: from hypothesis import falsify

    In [2]: falsify(lambda x,y,z: (x + y) + z == x + (y +z), float,float,float)
    Out[2]: (1.0, 1.0, 0.0387906318128606)

    In [3]: falsify(lambda x: sum(x) < 100, [int])
    Out[3]: ([6, 29, 65],)

    In [4]: falsify(lambda x: sum(x) < 100, [int,float])
    Out[4]: ([18.0, 82],)

    In [5]: falsify(lambda x: "a" not in x, str)
    Out[5]: ('a',)

    In [6]: falsify(lambda x: "a" not in x, {str})
    Out[6]: (set(['a']),)

Sometimes we ask it to falsify things that are true:

.. code:: python

    In [7]: falsify(lambda x: x + 1 == 1 + x, int)
    Unfalsifiable: Unable to falsify hypothesis <function <lambda> at 0x2efb1b8>

of course sometimes we ask it to falsify things that are false but hard to find:

.. code:: python

    In [8]: falsify(lambda x: x != "I am the very model of a modern major general", str)
    Unfalsifiable: Unable to falsify hypothesis <function <lambda> at 0x2efb398>

It's not magic, and when the search space is large it won't be able to do very much
for hard to find examples.

You can also use it to drive tests. I've only tested it with py.test, but it has no 
specific dependencies on it: You just write normal tests which raise exceptions 
on failures and it will transform those into randomized tests.

So the following test will pass:

.. code:: python

    @given(int,int)
    def test_int_addition_is_commutative(x,y):
        assert x + y == y + x

And the following will fail:

.. code:: python

    @given(str,str)
    def test_str_addition_is_commutative(x,y):
        assert x + y == y + x

With an error message something like:
 
.. code:: python

        x = '0', y = '1'
        @given(str,str)
        def test_str_addition_is_commutative(x,y):
            assert x + y == y + x
    E       assert '01' == '10'
    E         - 01
    E         + 10


------------------
 Stateful testing
------------------

You can also use hypothesis for a more stateful style of testing, to generate
sequences of operations to break your code.

Considering the following broken implementation of a set:

.. code:: python

    class BadSet:
        def __init__(self):
            self.data = []

        def add(self, arg):
            self.data.append(arg)

        def remove(self, arg):
            for i in xrange(0, len(self.data)):
                if self.data[i] == arg:
                    del self.data[i]
                    break

        def contains(self, arg):
            return arg in self.data

Can we use hypothesis to demonstrate that it's broken? We can indeed!

We can put together a stateful test as follows:

.. code:: python

    class BadSetTester(StatefulTest):
        def __init__(self):
            self.target = BadSet()

        @step
        @requires(int)
        def add(self,i):
            self.target.add(i)
            assert self.target.contains(i)

        @step
        @requires(int)
        def remove(self,i):
            self.target.remove(i)
            assert not self.target.contains(i)

The @step decorator says that this method is to be used as a test step.
The @requires decorator says what argument types it needs when it is 
(you can omit @requires if you don't need any arguments).

We can now ask hypothesis for an example of this being broken:

.. code:: python

    In [7]: BadSetTester.breaking_example()
    Out[7]: (('add', 1), ('add', 1), ('remove', 1)]

What does this mean? It means that if we were to do:

.. code:: python

    x = BadSetTester()
    x.add(1)
    x.add(1)
    x.remove(1)

then we would get an assertion failure. Which indeed we would because the
assertion that removing results in the element no longer being in the set
would now be failing.

----------------
 Under the hood
----------------

How does hypothesis work?

The core object of hypothesis is the SearchStrategy. It knows how to explore a 
state space, and has the following operations.

* produce(size,flags). Generate a random element of the state space subject to the flags provided.
* flags(). Return a set of flags that may be used to control the production of elements.
* could_have_produced(element). Say whether it's plausible that this element was produced by this strategy.
* complexity(element). Return a float saying roughly how "complex" this element is. There's no meaning attached to this except that hypothesis will try to generate elements of lower complexity.
* simplify(element). Return a generator over a simplified versions of this element.

These satisfy the following invariants:

* produce(size,flags) should produce a distribution with about 'size' bits of entropy.
* Any element produced by produce must return true when passed to could_have_produced
* Any element for which could_have_produced returns true must not throw an exception when passed to complexity or simplify
* The expected complexity of produce(size) should be monotonic increasing in size
* for y in simplify(x), complexity(y) <= complexity(x)
* simplify(x) should return a sequence of unique values
* There shold be no chain x_1, x_2, ..., x_n with x_{i+1} in simplify(x_i) and x_1 in simplify(x_n). 

These are used to explore the state space. produce is called with a number of sizes and flags to
generate examples that falsify the hypothesis. The lowest complexity of these examples is then
taken, then repeatedly simplified until an example is found with no simplification of it falsifying
the hypothesis. This is taken as the end result.

SearchStrategy objects are produced from a descriptor value (which can be anything) and a SearchStrategies 
object, which has user definable rules for producing strategies.

So for example you can do

.. code:: python

    In [35]: SearchStrategies().strategy((int,int,[str]))
    Out[35]: TupleStrategy((int, int, [str]))

There are some reasonably complicated and subtle things you can do in terms of overriding 
the defined search strategies. I'm not going to go into them here because it's all a bit weird
and likely to still be in flux.

Warning: This library is still very much in flux, and no release of it right now should be
considered to be stable. It's emerged out of the initial hack stage, and is probably not too
broken, but proceed with caution.

---------
 Testing
---------

This version of hypothesis has been tested using Python series 2.7,
3.2, 3.3 and pypy.  Builds are checked with `travis`_:

.. _travis: https://travis-ci.org/DRMacIver/hypothesis

.. image:: https://travis-ci.org/DRMacIver/hypothesis.png?branch=master
   :target: https://travis-ci.org/DRMacIver/hypothesis