forked from HypothesisWorks/hypothesis
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
223 lines (153 loc) · 7.16 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
================
Hypothesis
================
Hypothesis is a library for falsifying its namesake. It is inspired by libraries
for testing like `Quickcheck <http://en.wikipedia.org/wiki/QuickCheck>`_, with
its most direct ancestor being `ScalaCheck <https://github.com/rickynils/scalacheck>`_
from which it acquired its approach to test case minimization. It is not itself a
test framework, but it provides decorators for easy integration with test frameworks
The primary entry point into the library is the hypothesis.falsify method.
What does it do?
You give it a predicate and a specification for how to generate arguments to test
that predicate and it gives you a counterexample.
-----------------
Basic examples
-----------------
.. code:: python
In [1]: from hypothesis import falsify
In [2]: falsify(lambda x,y,z: (x + y) + z == x + (y +z), float,float,float)
Out[2]: (1.0, 1.0, 0.0387906318128606)
In [3]: falsify(lambda x: sum(x) < 100, [int])
Out[3]: ([6, 29, 65],)
In [4]: falsify(lambda x: sum(x) < 100, [int,float])
Out[4]: ([18.0, 82],)
In [5]: falsify(lambda x: "a" not in x, str)
Out[5]: ('a',)
In [6]: falsify(lambda x: "a" not in x, {str})
Out[6]: (set(['a']),)
Sometimes we ask it to falsify things that are true:
.. code:: python
In [7]: falsify(lambda x: x + 1 == 1 + x, int)
Unfalsifiable: Unable to falsify hypothesis <function <lambda> at 0x2efb1b8>
of course sometimes we ask it to falsify things that are false but hard to find:
.. code:: python
In [8]: falsify(lambda x: x != "I am the very model of a modern major general", str)
Unfalsifiable: Unable to falsify hypothesis <function <lambda> at 0x2efb398>
It's not magic, and when the search space is large it won't be able to do very much
for hard to find examples.
You can also use it to drive tests. I've only tested it with py.test, but it has no
specific dependencies on it: You just write normal tests which raise exceptions
on failures and it will transform those into randomized tests.
So the following test will pass:
.. code:: python
@given(int,int)
def test_int_addition_is_commutative(x,y):
assert x + y == y + x
And the following will fail:
.. code:: python
@given(str,str)
def test_str_addition_is_commutative(x,y):
assert x + y == y + x
With an error message something like:
.. code:: python
x = '0', y = '1'
@given(str,str)
def test_str_addition_is_commutative(x,y):
assert x + y == y + x
E assert '01' == '10'
E - 01
E + 10
------------------
Stateful testing
------------------
You can also use hypothesis for a more stateful style of testing, to generate
sequences of operations to break your code.
Considering the following broken implementation of a set:
.. code:: python
class BadSet:
def __init__(self):
self.data = []
def add(self, arg):
self.data.append(arg)
def remove(self, arg):
for i in xrange(0, len(self.data)):
if self.data[i] == arg:
del self.data[i]
break
def contains(self, arg):
return arg in self.data
Can we use hypothesis to demonstrate that it's broken? We can indeed!
We can put together a stateful test as follows:
.. code:: python
class BadSetTester(StatefulTest):
def __init__(self):
self.target = BadSet()
@step
@requires(int)
def add(self,i):
self.target.add(i)
assert self.target.contains(i)
@step
@requires(int)
def remove(self,i):
self.target.remove(i)
assert not self.target.contains(i)
The @step decorator says that this method is to be used as a test step.
The @requires decorator says what argument types it needs when it is
(you can omit @requires if you don't need any arguments).
We can now ask hypothesis for an example of this being broken:
.. code:: python
In [7]: BadSetTester.breaking_example()
Out[7]: (('add', 1), ('add', 1), ('remove', 1)]
What does this mean? It means that if we were to do:
.. code:: python
x = BadSetTester()
x.add(1)
x.add(1)
x.remove(1)
then we would get an assertion failure. Which indeed we would because the
assertion that removing results in the element no longer being in the set
would now be failing.
----------------
Under the hood
----------------
How does hypothesis work?
The core object of hypothesis is the SearchStrategy. It knows how to explore a
state space, and has the following operations.
* produce(size,flags). Generate a random element of the state space subject to the flags provided.
* flags(). Return a set of flags that may be used to control the production of elements.
* could_have_produced(element). Say whether it's plausible that this element was produced by this strategy.
* complexity(element). Return a float saying roughly how "complex" this element is. There's no meaning attached to this except that hypothesis will try to generate elements of lower complexity.
* simplify(element). Return a generator over a simplified versions of this element.
These satisfy the following invariants:
* produce(size,flags) should produce a distribution with about 'size' bits of entropy.
* Any element produced by produce must return true when passed to could_have_produced
* Any element for which could_have_produced returns true must not throw an exception when passed to complexity or simplify
* The expected complexity of produce(size) should be monotonic increasing in size
* for y in simplify(x), complexity(y) <= complexity(x)
* simplify(x) should return a sequence of unique values
* There shold be no chain x_1, x_2, ..., x_n with x_{i+1} in simplify(x_i) and x_1 in simplify(x_n).
These are used to explore the state space. produce is called with a number of sizes and flags to
generate examples that falsify the hypothesis. The lowest complexity of these examples is then
taken, then repeatedly simplified until an example is found with no simplification of it falsifying
the hypothesis. This is taken as the end result.
SearchStrategy objects are produced from a descriptor value (which can be anything) and a SearchStrategies
object, which has user definable rules for producing strategies.
So for example you can do
.. code:: python
In [35]: SearchStrategies().strategy((int,int,[str]))
Out[35]: TupleStrategy((int, int, [str]))
There are some reasonably complicated and subtle things you can do in terms of overriding
the defined search strategies. I'm not going to go into them here because it's all a bit weird
and likely to still be in flux.
Warning: This library is still very much in flux, and no release of it right now should be
considered to be stable. It's emerged out of the initial hack stage, and is probably not too
broken, but proceed with caution.
---------
Testing
---------
This version of hypothesis has been tested using Python series 2.7,
3.2, 3.3 and pypy. Builds are checked with `travis`_:
.. _travis: https://travis-ci.org/DRMacIver/hypothesis
.. image:: https://travis-ci.org/DRMacIver/hypothesis.png?branch=master
:target: https://travis-ci.org/DRMacIver/hypothesis