<p><a href="https://doi.org/10.5281/zenodo.7342082"><img
src="https://zenodo.org/badge/DOI/10.5281/zenodo.7342082.svg"
alt="DOI" /></a></p>
<h1 id="signatr-artifact">Signatr Artifact</h1>
<p><strong>We also provide PDF and HTML versions of this README. If
reading locally and not on <a
href="https://github.com/PRL-PRG/sle22-signatr-artifact">GitHub</a>, we
advise using the HTML version.</strong></p>
<p>The artifact contains the <code>signatr</code> tool, and the
pipelines to create an R value database and to fuzz R functions with the
database to find type signatures. The pipeline to create a value
database is in <code>pipeline-dbgen</code>. The fuzzing pipeline will
generate the inputs for the <code>sle.Rmd</code> R markdown notebook.
That notebook can then be rendered to get all the results (tables,
figures) we use in the paper.</p>
<p>To use the artifact:</p>
<ol type="1">
<li>Install the docker image (see <a
href="#install-the-docker-image">Install the docker image</a>).
Installing locally is possible but involved. Following the steps
described in the <code>docker-image/Dockerfile</code> should help if
this is the hard path you are choosing!</li>
<li>Experiment with the tool on a small example: see <a
href="#experimenting-with-the-tool">Experimenting with the tool</a></li>
<li>Reproduce the analysis pipeline: see <a
href="#the-analysis-pipeline">The analysis pipeline</a></li>
</ol>
<p>The tool is packaged as an R library. It is hosted at <a
href="https://github.com/PRL-PRG/signatr">https://github.com/PRL-PRG/signatr</a>.</p>
<p>The artifact is also available directly on GitHub:
https://github.com/PRL-PRG/sle22-signatr-artifact</p>
<p>You can get it by entering the following commands in a shell:</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="ex">$</span> git clone git@github.com:PRL-PRG/sle22-signatr-artifact.git</span></code></pre></div>
<h2 id="install-the-docker-image">Install the docker image</h2>
<p>Go to the artifact’s folder:</p>
<div class="sourceCode" id="cb2"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="ex">$</span> cd sle22-signatr-artifact</span></code></pre></div>
<p>To install the docker image, you can:</p>
<ul>
<li>pull the docker image with
<code>docker pull prlprg/sle22-signatr</code>, or</li>
<li>build the docker image (it takes time!):</li>
</ul>
<div class="sourceCode" id="cb3"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="ex">$</span> cd docker-image</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="ex">$</span> make</span></code></pre></div>
<p>After installing the docker image, <strong>make sure</strong> to run
all the following commands in a shell inside the docker image (for
Linux, macOS) from the artifact directory.</p>
<p>To start the docker image, go back to the root directory of the
artifact (<code>sle22-signatr-artifact/</code>) and enter in a
shell:</p>
<div class="sourceCode" id="cb4"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./enter.sh</span></span></code></pre></div>
<p>which should give you a bash shell prompt, like (modulo the
hostname):</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode sh"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="ex">r@eaf63037fd02:/work$</span></span></code></pre></div>
<p>It automatically mounts the content of the folder from which you run
the command into the <code>/work</code> directory in the container.</p>
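<p>As a rough illustration of the mount described above, the sketch
below prints a <code>docker run</code> command that bind-mounts a host
directory at <code>/work</code>. It is a hypothetical stand-in, not the
actual content of <code>enter.sh</code>: the helper name
<code>mount_command</code>, the chosen flags, and the trailing
<code>bash</code> are assumptions. Only the image name and the
<code>/work</code> mount point come from this README.</p>

```shell
# Hypothetical sketch: NOT the actual content of enter.sh.
# Builds (and only prints) a docker command that bind-mounts a host
# directory at /work inside the container and starts there.
mount_command() {
  host_dir="$1"
  printf 'docker run --rm -it -v %s:/work -w /work prlprg/sle22-signatr bash\n' "$host_dir"
}

# Show the command for the current directory without running it.
mount_command "$PWD"
```

<p>Printing the command instead of running it makes it easy to inspect
which directory would be mounted before starting a container.</p>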
<p>If you see an output like:</p>
<pre><code>Starting Xvfb...
There is something wrong with the Xvfb server.</code></pre>
<p>try running it with the <code>NO_X11</code> environment variable set:</p>
<pre><code>NO_X11=1 ./enter.sh</code></pre>
<p>We also provide a shorter Docker invocation script,
<code>./enter2.sh</code>, to use if that still does not work. It does
not set up permissions, so you will need <code>sudo</code> for Step
6 in <a href="#the-analysis-pipeline">The analysis pipeline</a>
section.</p>
<h2 id="experimenting-with-the-tool">Experimenting with the tool</h2>
<p>Run the R interpreter <em>inside the docker image</em>. It will start
the patched R interpreter. The tool <em>does not run</em> in the
standard R interpreter.</p>
<p>The following screencast shows all the commands being
executed:</p>
<p><a
href="https://asciinema.org/a/YxDDCvg4SUeEzzUKhfKcDLrCO?idleTimeLimit=1"><img
src="https://asciinema.org/a/YxDDCvg4SUeEzzUKhfKcDLrCO.svg"
alt="asciicast" /></a></p>
<p>In the following listings, <code>$</code> indicates the shell and
<code>></code> denotes the R REPL.</p>
<div class="sourceCode" id="cb8"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="ex">$</span> R</span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> version 4.0.2 <span class="er">(</span><span class="ex">2020-06-22</span><span class="kw">)</span> <span class="ex">--</span> <span class="st">"Taking Off again"</span></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a><span class="ex">...</span></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a><span class="op">></span> library<span class="kw">(</span><span class="ex">signatr</span><span class="kw">)</span></span></code></pre></div>
<p>All following commands and instructions should be run in the docker
container.</p>
<h3 id="database">Database</h3>
<p>To generate a database of values, we need some code to run. One way
is to extract it from an existing R package, for example
<code>stringr</code>, which provides regexes:</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> <span class="fu">extract_package_code</span>(<span class="st">"stringr"</span>, <span class="at">output_dir =</span> <span class="st">"demo"</span>)</span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a>...</span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a><span class="dv">7</span> examples<span class="sc">/</span>str_detect.Rd.R examples</span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a>...</span></code></pre></div>
<p>This will extract all the runnable snippets from the package
documentation and tests into the given directory. For example:</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> <span class="fu">cat</span>(<span class="fu">readLines</span>(<span class="st">"demo/examples/str_detect.Rd.R"</span>, <span class="at">n =</span> <span class="dv">15</span>), <span class="at">sep =</span> <span class="st">"</span><span class="sc">\n</span><span class="st">"</span>)</span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a>...</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a>fruit <span class="ot"><-</span> <span class="fu">c</span>(<span class="st">"apple"</span>, <span class="st">"banana"</span>, <span class="st">"pear"</span>, <span class="st">"pinapple"</span>)</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a><span class="fu">str_detect</span>(fruit, <span class="st">"a"</span>)</span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a><span class="fu">str_detect</span>(fruit, <span class="st">"^a"</span>)</span>
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a>...</span></code></pre></div>
<p>Next, we trace the file by running it (in the patched R interpreter)
and recording all the calls, using the
<code>trace_file</code> function:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> <span class="fu">trace_file</span>(<span class="st">"demo/examples/str_detect.Rd.R"</span>, <span class="at">db_path =</span> <span class="st">"demo.sxpdb"</span>)</span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> status time file db_path db_size error</span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a>elapsed <span class="dv">0</span> <span class="fl">0.04</span> demo<span class="sc">/</span>examples<span class="sc">/</span>str_detect.Rd.R demo.sxpdb <span class="dv">20</span> <span class="cn">NA</span></span></code></pre></div>
<p>The database generation is also automated in the
<code>pipeline-dbgen</code> directory in the artifact, where it handles
tracing multiple files and merging the results. See <a
href="#generate-the-database">Generate the database</a> for more
details.</p>
<h3 id="fuzzing">Fuzzing</h3>
<p>Once the database is ready, we can start fuzzing the
<code>str_detect</code> function of the <code>stringr</code>
package:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> fuzz_results <span class="ot"><-</span> <span class="fu">quick_fuzz</span>(<span class="st">"stringr"</span>, <span class="st">"str_detect"</span>, <span class="st">"demo.sxpdb"</span>, <span class="at">budget =</span> <span class="dv">1000</span>, <span class="at">action =</span> <span class="st">"infer"</span>)</span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a> started a new runner<span class="sc">:</span>PROCESS <span class="st">'R'</span>, running, pid <span class="dv">4157</span></span>
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a> fuzzing stringr<span class="sc">:::</span>str_detect [<span class="sc">==</span><span class="er">====</span>] <span class="dv">100</span><span class="sc">/</span><span class="dv">100</span> (<span class="dv">100</span>%) 39s</span>
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a> stopped runner<span class="sc">:</span>PROCESS <span class="st">'R'</span>, running, pid <span class="dv">4157</span></span></code></pre></div>
<p>The <code>infer</code> action will infer types for each call argument
and return value using the type annotation language described in <a
href="https://dl.acm.org/doi/abs/10.1145/3428249">Designing types for R,
empirically</a>. It returns an R data frame with the inferred call
signature in the <code>result</code> column:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> fuzz_results</span>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a><span class="co"># A tibble: 1,000 × 7</span></span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> args_idx error exit status dispatch result ts</span>
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> <span class="sc"><</span>list<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>list<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span> <span class="er"><</span>drt<span class="sc">></span></span>
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> <span class="dv">1</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="st">"Error in UseMethod(</span><span class="sc">\"</span><span class="st">type\… NA 1 <named list> NA 0.04…</span></span>
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a><span class="st"> 2 <int [3]> "</span>Error <span class="cf">in</span> stri_detect_regex… <span class="cn">NA</span> <span class="dv">1</span> <span class="sc"><</span>named list<span class="sc">></span> <span class="cn">NA</span> <span class="fl">0.04</span>…</span>
<span id="cb13-7"><a href="#cb13-7" aria-hidden="true" tabindex="-1"></a> <span class="dv">3</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list<span class="sc">></span> (logi… <span class="fl">0.04</span>…</span></code></pre></div>
<p>If you are repeating these steps, it is possible that your results
will be different since fuzzing is non-deterministic.</p>
<p>The listing shows three calls: two failed ones (non-zero status) with
an error message, and a successful one with an inferred signature.</p>
<p>You can find all the successful calls for your run of the fuzzer:</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> dplyr<span class="sc">::</span><span class="fu">filter</span>(fuzz_results, status <span class="sc">==</span> <span class="dv">0</span>)</span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a><span class="co"># A tibble: 112 × 7</span></span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a> args_idx error exit status dispatch result ts</span>
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a> <span class="sc"><</span>list<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>list<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span> <span class="er"><</span>drt<span class="sc">></span></span>
<span id="cb14-5"><a href="#cb14-5" aria-hidden="true" tabindex="-1"></a> <span class="dv">1</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (logical, character, log… <span class="fl">0.04</span>…</span>
<span id="cb14-6"><a href="#cb14-6" aria-hidden="true" tabindex="-1"></a> <span class="dv">2</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (character, character, l… <span class="fl">0.04</span>…</span>
<span id="cb14-7"><a href="#cb14-7" aria-hidden="true" tabindex="-1"></a> <span class="dv">3</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (character, character, d… <span class="fl">0.04</span>…</span>
<span id="cb14-8"><a href="#cb14-8" aria-hidden="true" tabindex="-1"></a> <span class="dv">4</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (logical, character, log… <span class="fl">0.04</span>…</span>
<span id="cb14-9"><a href="#cb14-9" aria-hidden="true" tabindex="-1"></a> <span class="dv">5</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (logical[], character, l… <span class="fl">0.04</span>…</span>
<span id="cb14-10"><a href="#cb14-10" aria-hidden="true" tabindex="-1"></a> <span class="dv">6</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (character, character, l… <span class="fl">0.04</span>…</span>
<span id="cb14-11"><a href="#cb14-11" aria-hidden="true" tabindex="-1"></a> <span class="dv">7</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (character, character, d… <span class="fl">0.04</span>…</span>
<span id="cb14-12"><a href="#cb14-12" aria-hidden="true" tabindex="-1"></a> <span class="dv">8</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (logical[], character, d… <span class="fl">0.04</span>…</span>
<span id="cb14-13"><a href="#cb14-13" aria-hidden="true" tabindex="-1"></a> <span class="dv">9</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (logical[], character, d… <span class="fl">0.04</span>…</span>
<span id="cb14-14"><a href="#cb14-14" aria-hidden="true" tabindex="-1"></a><span class="dv">10</span> <span class="sc"><</span>int [<span class="dv">3</span>]<span class="sc">></span> <span class="cn">NA</span> <span class="cn">NA</span> <span class="dv">0</span> <span class="sc"><</span>named list [<span class="dv">3</span>]<span class="sc">></span> (logical[], character, d… <span class="fl">0.04</span>…</span></code></pre></div>
<p>The <code>args_idx</code> column contains the indices of the values
of the arguments in the database: the actual argument values can be
obtained by looking up the <code>args_idx</code> in the database:</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> <span class="fu">library</span>(sxpdb)</span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> db <span class="ot"><-</span> <span class="fu">open_db</span>(<span class="st">"demo.sxpdb"</span>)</span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> <span class="fu">get_value_idx</span>(db, <span class="dv">0</span>) <span class="co"># value at index 0</span></span>
<span id="cb15-4"><a href="#cb15-4" aria-hidden="true" tabindex="-1"></a>[<span class="dv">1</span>] <span class="st">"a"</span></span>
<span id="cb15-5"><a href="#cb15-5" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> <span class="fu">close</span>(db)</span></code></pre></div>
<p>One advantage of using R is that we can use R’s many data analysis
functions. For example, we can look at the resulting signatures:</p>
<div class="sourceCode" id="cb16"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="sc">></span> dplyr<span class="sc">::</span><span class="fu">count</span>(fuzz_results, result)</span>
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="co"># A tibble: 20 × 2</span></span>
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a> result n</span>
<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a> <span class="sc"><</span>chr<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span></span>
<span id="cb16-5"><a href="#cb16-5" aria-hidden="true" tabindex="-1"></a> <span class="dv">1</span> <span class="cn">NA</span> <span class="dv">888</span></span>
<span id="cb16-6"><a href="#cb16-6" aria-hidden="true" tabindex="-1"></a> <span class="dv">2</span> (character, character, logical) <span class="sc">=></span> logical <span class="dv">28</span></span>
<span id="cb16-7"><a href="#cb16-7" aria-hidden="true" tabindex="-1"></a> <span class="dv">3</span> (character, character, double) <span class="sc">=></span> logical <span class="dv">21</span></span>
<span id="cb16-8"><a href="#cb16-8" aria-hidden="true" tabindex="-1"></a> <span class="dv">4</span> (character, character, logical[]) <span class="sc">=></span> logical[] <span class="dv">10</span></span>
<span id="cb16-9"><a href="#cb16-9" aria-hidden="true" tabindex="-1"></a> <span class="dv">5</span> (logical, character, logical) <span class="sc">=></span> logical <span class="dv">7</span></span>
<span id="cb16-10"><a href="#cb16-10" aria-hidden="true" tabindex="-1"></a> <span class="dv">6</span> (logical[], character, logical) <span class="sc">=></span> logical[] <span class="dv">7</span></span>
<span id="cb16-11"><a href="#cb16-11" aria-hidden="true" tabindex="-1"></a> <span class="dv">7</span> (null, character, logical) <span class="sc">=></span> logical[] <span class="dv">7</span></span>
<span id="cb16-12"><a href="#cb16-12" aria-hidden="true" tabindex="-1"></a> <span class="dv">8</span> (logical, character, double) <span class="sc">=></span> logical <span class="dv">5</span></span>
<span id="cb16-13"><a href="#cb16-13" aria-hidden="true" tabindex="-1"></a> <span class="dv">9</span> (logical[], character, double) <span class="sc">=></span> logical[] <span class="dv">5</span></span>
<span id="cb16-14"><a href="#cb16-14" aria-hidden="true" tabindex="-1"></a><span class="dv">10</span> (character[], character, logical) <span class="sc">=></span> logical[] <span class="dv">4</span></span>
<span id="cb16-15"><a href="#cb16-15" aria-hidden="true" tabindex="-1"></a><span class="dv">11</span> (character[], character, double) <span class="sc">=></span> logical[] <span class="dv">3</span></span>
<span id="cb16-16"><a href="#cb16-16" aria-hidden="true" tabindex="-1"></a><span class="dv">12</span> (double, character, logical) <span class="sc">=></span> logical <span class="dv">3</span></span>
<span id="cb16-17"><a href="#cb16-17" aria-hidden="true" tabindex="-1"></a><span class="dv">13</span> (null, character, logical[]) <span class="sc">=></span> logical[] <span class="dv">3</span></span>
<span id="cb16-18"><a href="#cb16-18" aria-hidden="true" tabindex="-1"></a><span class="dv">14</span> (character, character[], logical) <span class="sc">=></span> logical[] <span class="dv">2</span></span>
<span id="cb16-19"><a href="#cb16-19" aria-hidden="true" tabindex="-1"></a><span class="dv">15</span> (null, character[], logical) <span class="sc">=></span> logical[] <span class="dv">2</span></span>
<span id="cb16-20"><a href="#cb16-20" aria-hidden="true" tabindex="-1"></a><span class="dv">16</span> (character, character[], double) <span class="sc">=></span> logical[] <span class="dv">1</span></span>
<span id="cb16-21"><a href="#cb16-21" aria-hidden="true" tabindex="-1"></a><span class="dv">17</span> (double, character, double) <span class="sc">=></span> logical <span class="dv">1</span></span>
<span id="cb16-22"><a href="#cb16-22" aria-hidden="true" tabindex="-1"></a><span class="dv">18</span> (double, character, logical[]) <span class="sc">=></span> logical[] <span class="dv">1</span></span>
<span id="cb16-23"><a href="#cb16-23" aria-hidden="true" tabindex="-1"></a><span class="dv">19</span> (logical, character[], logical) <span class="sc">=></span> logical[] <span class="dv">1</span></span>
<span id="cb16-24"><a href="#cb16-24" aria-hidden="true" tabindex="-1"></a><span class="dv">20</span> (logical[], character, logical[]) <span class="sc">=></span> logical[] <span class="dv">1</span></span></code></pre></div>
<p>This shows that 112 of the 1,000 generated calls were successful (the
888 calls with an <code>NA</code> result failed), and lists the 19
distinct signatures inferred from those successful calls.</p>
<h2 id="the-analysis-pipeline">The analysis pipeline</h2>
<p>The following tutorial demonstrates how to run the analysis pipeline
to reproduce the results of the paper. It consists of a series of steps
that, at the end, generate the input for the analysis.</p>
<p>In this write-up, we will run it on a small subset of the original
packages (cf. <code>data/packages.txt</code>). The reason is that the
size of the data required is fairly large. For example, just the value
database is over 287GB and its generation takes over half a day (on a
72-core Intel Xeon 6140 2.30GHz server). One would also have to download
and install all the packages and their dependencies, which again takes
space and time. If you are nevertheless interested and have the
computational resources, we will be happy to share the data; please
contact the AEC chair.</p>
<p>There is also a screencast for this part of the artifact. However,
due to size limitations, it is not possible to share it directly on <a
href="https://asciinema.org/">asciinema.org</a>. Instead, it is in a
compressed form in the <code>assets</code> directory. To replay it
locally (assuming you have installed the <code>asciinema</code> tool),
follow these steps:</p>
<div class="sourceCode" id="cb17"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="bu">cd</span> assets</span>
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a><span class="fu">unxz</span> screencast-pipeline.asciinema.xz</span>
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a><span class="ex">asciinema</span> play <span class="at">-i</span> 1 <span class="at">-s</span> 10 screencast-pipeline.asciinema</span></code></pre></div>
<p>That will play it at 10x the actual speed, limiting the idle time to
1 second.</p>
<hr />
<p><strong>Note</strong>:</p>
<ul>
<li>You will be running code downloaded from a public repository.
Although CRAN is a curated repository, this should be done with
caution. Run it inside the container.</li>
<li>Most steps take a few minutes at most; long-running ones are
indicated with an estimate.</li>
</ul>
<h3 id="steps">Steps</h3>
<p>The following is essentially what is shown in Figure 1 and Figure 2
of the paper, packaged in scripts for simpler use, with GNU Parallel
used for parallel execution. All steps should be run inside a docker
container. As a reminder, to enter the container, run:</p>
<div class="sourceCode" id="cb18"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./enter.sh</span></span></code></pre></div>
<p>Anytime you want to kill a task, it is good to exit the container and
enter it again so all the child processes are properly killed.</p>
<h3 id="get-the-sample-sxpdb-database">0. get the sample sxpdb
database</h3>
<p>For the experiment we need a value database (sxpdb database) that
will be used for the fuzzing. You can either <a
href="#building-it-yourself">build one yourself</a>, or <a
href="https://owncloud.cesnet.cz/index.php/s/aHprMbas4haELVf">download</a>
one we have prepared using the same steps.</p>
<p>To get the prebuilt database, do the following:</p>
<div class="sourceCode" id="cb19"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="bu">cd</span> data</span>
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a><span class="fu">wget</span> <span class="at">-O</span> cran_db.tar.xz https://owncloud.cesnet.cz/index.php/s/aHprMbas4haELVf/download</span>
<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a><span class="fu">tar</span> xvJf cran_db.tar.xz</span></code></pre></div>
<p>The extracted database is about 10GB.</p>
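<p>Since the extracted database takes about 10GB, it can help to check
the available disk space before downloading. The following is a minimal
sketch, not part of the artifact: the helper name
<code>has_free_gb</code> and the 15GB threshold (room for the archive
plus the extracted files) are assumptions.</p>

```shell
# Succeeds if the filesystem holding directory $1 has at least $2 GB free.
has_free_gb() {
  dir="$1"
  need_gb="$2"
  # df -Pk prints POSIX-format output in 1024-byte blocks;
  # the fourth column of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# The 15GB threshold is an assumed safety margin, not a documented requirement.
if has_free_gb . 15; then
  echo "enough free space for the sample database"
else
  echo "less than 15GB free here; consider another location"
fi
```

<p>Run it from the <code>data</code> directory so that the check applies
to the filesystem where the archive will be unpacked.</p>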
<h4 id="building-it-yourself">Building it yourself</h4>
<p>The database generation uses <a
href="https://docs.ropensci.org/targets/">targets</a> to orchestrate the
pipeline.</p>
<p>The database for the SLE paper is obtained by tracing 400 packages
from <code>data/packages-typer-400.txt</code>. The packages to be traced
have to be specified in <code>data/packages.txt</code>, which contains a
new-line separated list of packages to include in the corpus.</p>
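<p>To make the file format concrete, the sketch below writes a
newline-separated package list and counts its entries. The helper names
<code>write_package_list</code> and <code>count_packages</code> are
hypothetical, and the demo writes to a temporary file rather than
<code>data/packages.txt</code> so that it does not overwrite anything in
the artifact.</p>

```shell
# Writes a newline-separated package list to file $1,
# one line per remaining argument.
write_package_list() {
  out="$1"
  shift
  printf '%s\n' "$@" > "$out"
}

# Counts the non-empty lines (one package name per line) in file $1.
count_packages() {
  grep -c . "$1"
}

# Demo on a temporary file; the package names are only examples.
list_file="$(mktemp)"
write_package_list "$list_file" stringr dplyr ggplot2
count_packages "$list_file"
rm -f "$list_file"
```

<p>Counting the entries before starting is a cheap way to confirm that
the list has the expected size, since tracing time grows with the number
of packages.</p>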
<p>To start tracing, run the following commands, setting
<code>workers</code> to an adequate number of parallel workers:</p>
<div class="sourceCode" id="cb20"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cp</span> data/packages-typer-400.txt data/packages.txt</span>
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a><span class="bu">cd</span> pipeline-dbgen</span>
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> <span class="at">-e</span> <span class="st">'targets::tar_make_future(workers = 64)'</span></span></code></pre></div>
<p>The extracted code of the packages will be located in
<code>output/extracted-code</code>. The resulting database will be
generated as <code>output/sxpdb/cran_db</code>. You should move it to
<code>data</code> to follow the next steps. Depending on your machine,
the generation of the database for the 400 packages can take from a few
hours to a few days.</p>
<p>We provide other variants of <code>packages.txt</code>. For instance,
<code>packages-4.txt</code> includes 2 huge and common R packages,
<code>dplyr</code> and <code>ggplot2</code>.</p>
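<p>A custom package list can also be written by hand. The following is a
minimal sketch and not part of the artifact: the helper name
<code>write_package_list</code> is hypothetical, and the package names in
the usage comment are only the examples mentioned above.</p>

```shell
# Hypothetical helper (not part of the artifact): write a newline-separated
# package list, one package name per line, to the given file.
write_package_list() {
  out="$1"
  shift
  printf '%s\n' "$@" > "$out"
}

# Example (run from the artifact root; this overwrites the existing list):
# write_package_list data/packages.txt dplyr ggplot2
```

<p>Keeping one package name per line matches the format described for
<code>data/packages.txt</code>.</p>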
<h3 id="create-a-corpus">1. create a corpus</h3>
<p>The corpus consists of the following:</p>
<ul>
<li>R package sources in <code>data/sources</code></li>
<li>installed R packages in <code>data/library</code></li>
<li>extracted code from R packages in <code>data/extracted-code</code></li>
<li>corpus metadata file <code>data/corpus.csv</code></li>
</ul>
<p>This is bootstrapped using the <code>data/packages.txt</code>
file.</p>
<p>To create a corpus, run the following:</p>
<div class="sourceCode" id="cb21"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./create-corpus.R</span></span></code></pre></div>
<p>Depending on the number of packages (and their transitive
dependencies), it might take a while. For the sample of 5 packages
(a small corpus, though one made up of very popular packages), it might
take about 20 minutes.</p>
<p>It could happen that some dependencies won’t install.</p>
<p>The result should be something like:</p>
<pre><code>data/extracted-code <--- extracted code from R packages
data/library <--- installed R packages
data/sources <--- R package sources
data/corpus.csv <--- corpus metadata</code></pre>
<h3 id="fuzz-the-installed-functions">2. fuzz the installed
functions</h3>
<p>Next, we will run the fuzzer using the values from the sample
database:</p>
<div class="sourceCode" id="cb23"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./run-fuzz.sh</span></span></code></pre></div>
<p>By default this will sample 100 functions from
<code>corpus.csv</code> and fuzz each 100 times. Both can be adjusted by
setting the <code>FUNS</code> and <code>BUDGET</code> environment
variables. Using all the functions
(e.g. <code>FUNS=$(wc -l &lt; data/corpus.csv)</code>) and 5000 runs
(e.g. <code>BUDGET=5000</code>), the experiment might take about a day.
That is why we recommend scaling it down so that it runs within 30 minutes.
By default, it will run 16 jobs in parallel. This can be changed using
the <code>JOBS</code> environment variable.</p>
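<p>As a rough illustration of how these variables could be set, here is a
minimal sketch. The helper <code>count_rows</code> is hypothetical, it
assumes that <code>data/corpus.csv</code> starts with a single header row
(not confirmed here), and the numbers in the scaled-down example are
arbitrary illustrative values.</p>

```shell
# Hypothetical helper (not part of the artifact): count the data rows of a
# CSV file. ASSUMPTION: the file has exactly one header line and no field
# contains an embedded newline.
count_rows() {
  file="$1"
  lines=$(wc -l < "$file")
  if [ "$lines" -gt 0 ]; then
    echo $(( lines - 1 ))
  else
    echo 0
  fi
}

# Scaled-down run (illustrative values only):
# FUNS=20 BUDGET=50 JOBS=4 ./run-fuzz.sh
#
# Full run over every function listed in the corpus:
# FUNS=$(count_rows data/corpus.csv) BUDGET=5000 ./run-fuzz.sh
```

<p>Reading the file through <code>&lt;</code> makes <code>wc -l</code>
print only the number, without the file name.</p>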
<p>The result will be:</p>
<pre><code>data/fuzz <--- directory with the fuzzer output
data/run-fuzz.csv <--- metadata about the run, duration, exitcodes, ...</code></pre>
<p>You can view the intermediate results using the
<code>qcat.R</code> utility. For example:</p>
<div class="sourceCode" id="cb25"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./qcat.R</span> <span class="st">'data/fuzz/dplyr::arg_name'</span></span></code></pre></div>
<p>should show the results for the function <code>arg_name</code> from
the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb26"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb26-1"><a href="#cb26-1" aria-hidden="true" tabindex="-1"></a><span class="co"># A tibble: 100 × 9</span></span>
<span id="cb26-2"><a href="#cb26-2" aria-hidden="true" tabindex="-1"></a> args_idx error exit status dispatch result ts fun_n…¹ rdb_p…²</span>
<span id="cb26-3"><a href="#cb26-3" aria-hidden="true" tabindex="-1"></a> <span class="sc"><</span>list<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>list<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>drt<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span></span>
<span id="cb26-4"><a href="#cb26-4" aria-hidden="true" tabindex="-1"></a> <span class="dv">1</span> <span class="sc"><</span>int [<span class="dv">2</span>]<span class="sc">></span> <span class="st">"Error in … NA 1 <named list> NA 0.08… dplyr:… ../rdb…</span></span>
<span id="cb26-5"><a href="#cb26-5" aria-hidden="true" tabindex="-1"></a><span class="st"> 2 <int [2]> "</span>Error <span class="cf">in</span> … <span class="cn">NA</span> <span class="dv">1</span> <span class="sc"><</span>named list<span class="sc">></span> <span class="cn">NA</span> <span class="fl">0.11</span>… dplyr<span class="sc">:</span>… ..<span class="sc">/</span>rdb…</span>
<span id="cb26-6"><a href="#cb26-6" aria-hidden="true" tabindex="-1"></a> <span class="dv">3</span> <span class="sc"><</span>int [<span class="dv">2</span>]<span class="sc">></span> <span class="st">"Error in … NA 1 <named list> NA 0.14… dplyr:… ../rdb…</span></span>
<span id="cb26-7"><a href="#cb26-7" aria-hidden="true" tabindex="-1"></a><span class="st"> 4 <int [2]> "</span>Error <span class="cf">in</span> … <span class="cn">NA</span> <span class="dv">1</span> <span class="sc"><</span>named list<span class="sc">></span> <span class="cn">NA</span> <span class="fl">0.15</span>… dplyr<span class="sc">:</span>… ..<span class="sc">/</span>rdb…</span>
<span id="cb26-8"><a href="#cb26-8" aria-hidden="true" tabindex="-1"></a> <span class="dv">5</span> <span class="sc"><</span>int [<span class="dv">2</span>]<span class="sc">></span> <span class="st">"Error in … NA 1 <named list> NA 0.09… dplyr:… ../rdb…</span></span>
<span id="cb26-9"><a href="#cb26-9" aria-hidden="true" tabindex="-1"></a><span class="st"> 6 <int [2]> "</span>Error <span class="cf">in</span> … <span class="cn">NA</span> <span class="dv">1</span> <span class="sc"><</span>named list<span class="sc">></span> <span class="cn">NA</span> <span class="fl">0.53</span>… dplyr<span class="sc">:</span>… ..<span class="sc">/</span>rdb…</span>
<span id="cb26-10"><a href="#cb26-10" aria-hidden="true" tabindex="-1"></a> <span class="dv">7</span> <span class="sc"><</span>int [<span class="dv">2</span>]<span class="sc">></span> <span class="st">"Error in … NA 1 <named list> NA 0.11… dplyr:… ../rdb…</span></span>
<span id="cb26-11"><a href="#cb26-11" aria-hidden="true" tabindex="-1"></a><span class="st"> 8 <int [2]> NA NA 0 <named list> 30 0.09… dplyr:… ../rdb…</span></span>
<span id="cb26-12"><a href="#cb26-12" aria-hidden="true" tabindex="-1"></a><span class="st"> 9 <int [2]> NA NA 0 <named list> 31 0.09… dplyr:… ../rdb…</span></span>
<span id="cb26-13"><a href="#cb26-13" aria-hidden="true" tabindex="-1"></a><span class="st"> 10 <int [2]> NA NA 0 <named list> 32 0.09… dplyr:… ../rdb…</span></span>
<span id="cb26-14"><a href="#cb26-14" aria-hidden="true" tabindex="-1"></a><span class="st">...</span></span></code></pre></div>
<p>The first ten rows show 7 failed calls and 3 successful ones. Please
note that due to random sampling your results will likely be different.
It is also possible that there will not be any
<code>data/fuzz/dplyr::arg_name</code> file, as the functions are
selected randomly.</p>
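<p>Since the set of fuzzed functions differs between runs, it can help to
pick whichever result file exists. This is a minimal sketch: the helper
<code>first_result</code> is hypothetical and simply returns the first
entry of a directory in glob order.</p>

```shell
# Hypothetical helper (not part of the artifact): print the path of the
# first entry in a directory, or fail if the directory is empty or missing.
first_result() {
  dir="$1"
  for f in "$dir"/*; do
    # An unmatched glob stays literal, so check that the entry exists.
    [ -e "$f" ] || continue
    printf '%s\n' "$f"
    return 0
  done
  return 1
}

# Example: inspect whichever fuzzing result comes first.
# ./qcat.R "$(first_result data/fuzz)"
```

<p>Quoting the result matters because the file names contain
<code>::</code> and may contain other characters that the shell would
otherwise interpret.</p>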
<h3 id="type-the-results">3. type the results</h3>
<p>To type the traces, run the following:</p>
<div class="sourceCode" id="cb27"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb27-1"><a href="#cb27-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./run-type.sh</span></span></code></pre></div>
<p>By default, it will run 16 jobs in parallel. This can be changed using
the <code>JOBS</code> environment variable.</p>
<p>The result will be:</p>
<pre><code>data/types <--- directory with the type output
data/run-type.csv <--- metadata about the run, duration, exitcodes, ...</code></pre>
<p>We can again peek at the results:</p>
<div class="sourceCode" id="cb29"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb29-1"><a href="#cb29-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./qcat.R</span> <span class="st">'data/types/dplyr::arg_name'</span></span></code></pre></div>
<p>which should show types inferred from the fuzzed calls:</p>
<div class="sourceCode" id="cb30"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb30-1"><a href="#cb30-1" aria-hidden="true" tabindex="-1"></a><span class="co"># A tibble: 40 × 3</span></span>
<span id="cb30-2"><a href="#cb30-2" aria-hidden="true" tabindex="-1"></a> fun_name id signature</span>
<span id="cb30-3"><a href="#cb30-3" aria-hidden="true" tabindex="-1"></a> <span class="sc"><</span>chr<span class="sc">></span> <span class="er"><</span>int<span class="sc">></span> <span class="er"><</span>chr<span class="sc">></span></span>
<span id="cb30-4"><a href="#cb30-4" aria-hidden="true" tabindex="-1"></a> <span class="dv">1</span> dplyr<span class="sc">::</span>arg_name <span class="dv">8</span> (list<span class="sc"><</span>list<span class="sc"><</span>class<span class="sc"><</span>unit, unit_v2<span class="sc">></span> <span class="er">|</span> double <span class="sc">|</span> integer<span class="sc">></span> <span class="er">|</span> …</span>
<span id="cb30-5"><a href="#cb30-5" aria-hidden="true" tabindex="-1"></a> <span class="dv">2</span> dplyr<span class="sc">::</span>arg_name <span class="dv">9</span> (class<span class="sc"><</span>gList<span class="sc">></span>, list<span class="sc"><</span>class<span class="sc"><</span>factor<span class="sc">></span> <span class="er">|</span> double <span class="sc">|</span> integer<span class="sc">></span>)…</span>
<span id="cb30-6"><a href="#cb30-6" aria-hidden="true" tabindex="-1"></a> <span class="dv">3</span> dplyr<span class="sc">::</span>arg_name <span class="dv">10</span> (pairlist, list<span class="sc"><</span>character <span class="sc">|</span> double[]<span class="sc">></span>) <span class="sc">=></span> class<span class="sc"><</span>glue, …</span>
<span id="cb30-7"><a href="#cb30-7" aria-hidden="true" tabindex="-1"></a> <span class="dv">4</span> dplyr<span class="sc">::</span>arg_name <span class="dv">13</span> (list<span class="sc"><</span>list<span class="sc"><</span>class<span class="sc"><</span>matrix<span class="sc">></span> <span class="er">|</span> double[] <span class="sc">|</span> integer <span class="sc">|</span> intege…</span>
<span id="cb30-8"><a href="#cb30-8" aria-hidden="true" tabindex="-1"></a> <span class="dv">5</span> dplyr<span class="sc">::</span>arg_name <span class="dv">14</span> (character[], list<span class="sc"><</span>character <span class="sc">|</span> logical<span class="sc">></span>) <span class="sc">=></span> class<span class="sc"><</span>glue…</span>
<span id="cb30-9"><a href="#cb30-9" aria-hidden="true" tabindex="-1"></a> <span class="dv">6</span> dplyr<span class="sc">::</span>arg_name <span class="dv">15</span> (list<span class="sc"><</span>class<span class="sc"><</span>unit, unit_v2<span class="sc">></span><span class="er">></span>, list<span class="sc"><</span>list<span class="sc"><</span>class<span class="sc"><</span>expectati…</span>
<span id="cb30-10"><a href="#cb30-10" aria-hidden="true" tabindex="-1"></a> <span class="dv">7</span> dplyr<span class="sc">::</span>arg_name <span class="dv">17</span> (list<span class="sc"><</span>class<span class="sc"><</span>call<span class="sc">></span><span class="er">></span>, double[]) <span class="sc">=></span> class<span class="sc"><</span>glue, character<span class="sc">></span></span>
<span id="cb30-11"><a href="#cb30-11" aria-hidden="true" tabindex="-1"></a> <span class="dv">8</span> dplyr<span class="sc">::</span>arg_name <span class="dv">24</span> (list<span class="sc"><</span>class<span class="sc"><</span>margin, simpleUnit, unit, unit_v2<span class="sc">></span> <span class="er">|</span> class…</span>
<span id="cb30-12"><a href="#cb30-12" aria-hidden="true" tabindex="-1"></a> <span class="dv">9</span> dplyr<span class="sc">::</span>arg_name <span class="dv">28</span> (class<span class="sc"><</span>matrix<span class="sc">></span>, list<span class="sc"><</span>class<span class="sc"><</span>expectation_success, expect…</span>
<span id="cb30-13"><a href="#cb30-13" aria-hidden="true" tabindex="-1"></a><span class="dv">10</span> dplyr<span class="sc">::</span>arg_name <span class="dv">30</span> (double, class<span class="sc"><</span>titleGrob, gTree, grob, gDesc<span class="sc">></span>) <span class="sc">=></span> clas…</span></code></pre></div>
<h3 id="fuzz-coverage">4. fuzz coverage</h3>
<p>Computing the function source code coverage from the fuzzed calls is
done by running the following:</p>
<div class="sourceCode" id="cb31"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb31-1"><a href="#cb31-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./run-coverage.sh</span></span></code></pre></div>
<p>This will use the traced data to recreate the calls while using the
<a href="https://covr.r-lib.org/">covr</a> tool to record code coverage.
By default, it will run 16 jobs in parallel. This can be changed using
the <code>JOBS</code> environment variable.</p>
<p>The result will be:</p>
<pre><code>data/coverage <--- directory with the coverage output
data/run-coverage.csv <--- metadata about the run, duration, exitcodes, ...</code></pre>
<h3 id="baseline">5. baseline</h3>
<p>To have a comparison, we need to get the baseline data.
Instead of fuzzing, we will simply run the extracted code from the
packages. There are three steps:</p>
<ol type="1">
<li><p>run the extracted code to get the traces</p>
<div class="sourceCode" id="cb33"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb33-1"><a href="#cb33-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./run-baseline.sh</span></span>
<span id="cb33-2"><a href="#cb33-2" aria-hidden="true" tabindex="-1"></a><span class="ex">./traces-baseline.R</span></span></code></pre></div>
<p>This might take a bit longer, about 15 minutes.</p></li>
<li><p>type the traces</p>
<div class="sourceCode" id="cb34"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb34-1"><a href="#cb34-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./run-type-baseline.sh</span></span></code></pre></div></li>
<li><p>compute the coverage from these traces</p>
<div class="sourceCode" id="cb35"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb35-1"><a href="#cb35-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./run-coverage-baseline.sh</span></span></code></pre></div>
<p>This might take a bit longer, about 15 minutes.</p></li>
</ol>
<p>By default, all of these steps will run 16 jobs in parallel. This can
be changed using the <code>JOBS</code> environment variable.</p>
<p>The results will be in:</p>
<pre><code>data/baseline <--- baseline traces
data/baseline-types <--- baseline types
data/baseline-coverage <--- baseline coverage
data/run-*-baseline.csv <--- metadata about the runs, duration, exitcodes, ...</code></pre>
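<p>The baseline steps above could be chained so that a failure stops the
sequence early. This is a minimal sketch: the helper
<code>run_steps</code> is hypothetical, and the <code>JOBS</code> value in
the usage comment is an arbitrary example.</p>

```shell
# Hypothetical helper (not part of the artifact): run the given commands in
# order and stop at the first one that exits with a non-zero status.
run_steps() {
  for step in "$@"; do
    echo "==> $step"
    if ! "$step"; then
      echo "step failed: $step" >&2
      return 1
    fi
  done
}

# Example: run the baseline steps with 8 parallel jobs.
# export JOBS=8
# run_steps ./run-baseline.sh ./traces-baseline.R \
#           ./run-type-baseline.sh ./run-coverage-baseline.sh
```

<p>Exporting <code>JOBS</code> once makes the same setting visible to
every script in the sequence.</p>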
<h3 id="create-a-report">6. create a report</h3>
<p>Finally, to render the results, run:</p>
<div class="sourceCode" id="cb37"><pre
class="sourceCode sh"><code class="sourceCode bash"><span id="cb37-1"><a href="#cb37-1" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> <span class="at">--slave</span> <span class="at">--quiet</span> <span class="at">-e</span> <span class="st">'rmarkdown::render("sle.Rmd")'</span></span></code></pre></div>
<p>This should create a file <code>sle.html</code> which you can open in
a browser (navigate to the directory where you ran
<code>./enter.sh</code>). It also creates three more files:</p>
<ul>
<li><code>experiment-uf.tex</code>: the data for the paper</li>
<li><code>argsdb-value-distribution.pdf</code>: figure 3 in the paper</li>
<li><code>uf-call-signatures.pdf</code>: figure 4 in the paper</li>
</ul>
<hr />
<p><strong>Note</strong>:</p>
<ul>
<li>Regarding the coverage: with a small number of fuzzed calls, the
fuzzer most likely won’t find any new paths, so the report will show 0
as the coverage improvement.</li>
</ul>