We introduce independent view fuzzing, a novel, fully automated approach for detecting non-crashing functional bugs in Android apps. We have realized this approach as an automated, end-to-end functional fuzzing tool, Genie.
Given an app, (1) Genie automatically detects non-crashing bugs without requiring human-provided tests and oracles (thus fully automated); and (2) the detected non-crashing bugs are diverse (thus general and not limited to specific functional properties), which sets Genie apart from prior work.
[1] "Fully Automated Functional Fuzzing of Android Apps for Detecting Non-Crashing Logic Bugs" Ting Su, Yichen Yan, Jue Wang, Jingling Sun, Yiheng Xiong, Geguang Pu, Ke Wang, Zhendong Su. In SPLASH/OOPSLA 2021.
@article{10.1145/3485533,
author = {Su, Ting and Yan, Yichen and Wang, Jue and Sun, Jingling and Xiong, Yiheng and Pu, Geguang and Wang, Ke and Su, Zhendong},
title = {Fully Automated Functional Fuzzing of Android Apps for Detecting Non-Crashing Logic Bugs},
year = {2021},
issue_date = {October 2021},
volume = {5},
number = {OOPSLA},
doi = {10.1145/3485533},
journal = {Proc. ACM Program. Lang.},
month = oct,
articleno = {156},
numpages = {31}
}
(corresponding to RQ4: Bug Types and Characteristics)
This bug type causes user data or settings to be lost. It can sometimes have severe consequences and draw critical user complaints.
Example Issue from Markor
Bug report (link)
This bug type means that a specific app functionality that previously worked well suddenly can no longer proceed and has no effect.
Example Issue from SkyTube
Bug report (link)
This bug type means that a specific functionality behaves incorrectly relative to its previous, correct behavior.
Example Issue from ActivityDiary
Bug report (link)
This bug type means that the GUI states are inconsistent for a specific functionality, which runs counter to users' intuition about how the app works.
Example Issue from Transistor
Bug report (link)
This bug type means some GUI views are erroneously duplicated.
Example Issue from Tasks
Bug report (link)
This bug type means some GUI views inadvertently disappear.
Example Issue from Fosdem
Bug report (link)
This bug type means some views are incorrectly displayed.
Example Issue from RadioDroid
Bug report (link)
UnitConverter (1,000,000~50,000,000 installations on Google Play, 144 Github stars)
-
User data/setting lost - bug report (link)
-
This issue escaped developer and user testing for more than 4 years.
Markor (50,000~100,000 installations on Google Play, 1200+ Github stars)
-
Inconsistent GUI states - bug report (link)
-
This issue escaped developer and user testing for more than 2.5 years and affected 74 releases.
Bug summary (link)
Bug report generated by Genie during functional fuzzing (link)
Explanation:
-
The left column is the seed test (randomly generated), while the right column is the mutant test. The two tests are aligned to ease inspection (the independent event trace is represented by trace [4,5] in the mutant test).
-
For each test, the GUI pages denote the layouts (GUI states), and the text beside each page denotes the GUI event that is applied to the previous page and leads to the current page. The icon on top of the text is the receiver view of the event on the previous page.
-
The oracle checking is conducted on the layouts highlighted by the two red boxes. The GUI inconsistencies are highlighted by small red boxes on the corresponding GUI pages. The GUI effect of the seed test (changing from <No Activity> to Empty Activity) is not contained in that of the mutant test (changing from Gardening to <No Activity>). Thus, a likely functional bug was detected, and it is a true positive.
-
Model, seed, and mutant visualization
-
The concrete GUI transitional model (link);
-
the abstracted GUI transitional model (link);
-
the seed test (seed-test-7 in this case) (link);
-
the generated mutant test (mutant-659 in this case) from the seed test at the 3rd insertion position (link).
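The containment check described in the explanation above can be sketched as follows. This is a minimal illustration, not Genie's implementation: it assumes a layout can be reduced to a mapping from view ids to displayed text, and that a GUI effect is the set of text changes between two layouts. The names `gui_effect`, `is_likely_bug`, and the view id `current_activity` are hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Assumption for illustration: a GUI effect is a set of (text_before, text_after)
# pairs, one per view whose text changed. This is not Genie's data model.
GuiEffect = FrozenSet[Tuple[Optional[str], Optional[str]]]


def gui_effect(before: Dict[str, str], after: Dict[str, str]) -> GuiEffect:
    """Collect the text changes between two layouts given as {view_id: text}."""
    changes = set()
    for view_id in before.keys() | after.keys():
        old, new = before.get(view_id), after.get(view_id)
        if old != new:
            changes.add((old, new))
    return frozenset(changes)


def is_likely_bug(seed_effect: GuiEffect, mutant_effect: GuiEffect) -> bool:
    """Flag a likely bug when the seed's effect is not contained in the mutant's."""
    return not seed_effect <= mutant_effect


# The ActivityDiary example: the view id "current_activity" is hypothetical.
seed_effect = gui_effect(
    {"current_activity": "<No Activity>"}, {"current_activity": "Empty Activity"}
)
mutant_effect = gui_effect(
    {"current_activity": "Gardening"}, {"current_activity": "<No Activity>"}
)
print(is_likely_bug(seed_effect, mutant_effect))
```

Under these assumptions the seed's change is missing from the mutant's effect, so the pair is flagged. A mutant whose effect includes every change of the seed would not be flagged, even if it contains additional changes.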
Create a fresh Android 6.0 emulator:
avdmanager create avd --force --name testAVD_Android6.0 --package 'system-images;android-23;google_apis;x86' --abi google_apis/x86 --sdcard 512M --device 'Nexus 7'
Modify the fresh emulator for Genie (add files into sdcard and remove unnecessary default apps):
emulator -avd testAVD_Android6.0 &
python3 -m deploy.emulator init -s emulator-5554
Start the emulator in read-only mode (so that it leaves no side effects for the next start-up):
emulator -avd testAVD_Android6.0 -read-only &
Now, we have three phases:
python3 -m droidbot.start -d emulator-5554 -a apps_for_test/de.rampro.activitydiary_118.apk -policy weighted -count 500 -grant_perm -is_emulator -interval 2 -o ./tmp-diary -script user_script.json
Here,
-d: the device serial number
-a: the file path of the apk
-policy: weighted (recommended) or dfs_greedy
-count: the maximum number of allowed events (2000 recommended)
-grant_perm: grant all permissions while installing the apk
-is_emulator: declare the target device to be an emulator
-interval: interval in seconds between two consecutive events (Default: 1). This option is especially useful when the app has dynamic features (like the ActivityDiary app we tested). In practice, 2 should be enough. If the app is quite stable and your testing environment is good, 0 or 1 is also okay.
Note: In general, we cannot ensure that the app state always becomes stable during testing. Internally, Genie implements a simple workaround to check whether the app state has become stable.
-o: the output directory
-script: the optional file path of a user-defined script, which contains a sequence of input events (e.g., to bypass the welcome page or log in to a user account)
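When the same run is launched repeatedly with different settings, it can help to assemble the command line programmatically. The sketch below only rebuilds the model-construction command shown above from the documented options; the helper name `model_construction_cmd` and its defaults are assumptions for illustration and are not part of Genie.

```python
import shlex
from typing import List, Optional


def model_construction_cmd(
    apk: str,
    out_dir: str,
    device: str = "emulator-5554",
    policy: str = "weighted",
    count: int = 2000,
    interval: int = 2,
    script: Optional[str] = None,
) -> List[str]:
    """Assemble the argv for the model-construction run (hypothetical helper)."""
    cmd = [
        "python3", "-m", "droidbot.start",
        "-d", device,
        "-a", apk,
        "-policy", policy,
        "-count", str(count),
        "-grant_perm",
        "-is_emulator",
        "-interval", str(interval),
        "-o", out_dir,
    ]
    # -script is optional, so it is appended only when a script path is given.
    if script is not None:
        cmd += ["-script", script]
    return cmd


cmd = model_construction_cmd(
    "apps_for_test/de.rampro.activitydiary_118.apk",
    "./tmp-diary",
    count=500,
    script="user_script.json",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the Genie working directory. Building an argv list rather than a single string avoids shell quoting issues in file paths.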
If you want to only generate mutant tests (without actual execution), then use the following command line:
python3 -m droidbot.start -d emulator-5554 -a apps_for_test/de.rampro.activitydiary_118.apk -policy fuzzing_gen -count 100000 -max_seed_test_suite_size 10 -max_random_seed_test_length 15 -max_independent_trace_length 8 -max_mutants_per_insertion_position 30 -grant_perm -is_emulator -interval 1 -coverage -o ./tmp-diary -script user_script.json
Here,
-policy: fuzzing_gen (generate seed tests and their corresponding mutants, but do not execute the mutant tests; used for distributed testing (see below)); or fuzzing_gen_seeds (only generate seed tests)
-multi-mutant-gen n: use multiprocessing when generating mutants; n is optional and defaults to the number of seed tests
-count: the maximal number of allowed events for generating seed tests (just give a large enough number, e.g., 100000);
-max_seed_test_suite_size: the maximal number of seed tests to generate;
-max_random_seed_test_length: the maximal length of each seed test;
-max_independent_trace_length: the maximal length of the independent trace (#independent events) that is inserted into the seed test;
-max_mutants_per_insertion_position: the maximal number of mutants generated for one insertion position;
-max_mutants_per_seed_test: the maximal number of mutants generated for one seed test;
-coverage: dump the coverage for seed tests (the apk should already be instrumented);
-o: the output directory used in Step 1, which is used to recover the model constructed in Step 1
Note:
-
You can specify -max_seed_test_suite_size, -max_random_seed_test_length, -max_independent_trace_length, -max_mutants_per_insertion_position, and -max_mutants_per_seed_test to control the number of generated mutant tests.
-
We recommend adding -interval 1 to make sure the seed generation process is stable.
-
By default, Genie uses the weighted policy to generate random seed tests.
-
In this mode, we do not need to specify -config-script, which is used for oracle checking and for relocating views during the actual execution of mutant tests.
python3 -m droidbot.start -d emulator-5554 -a apps_for_test/de.rampro.activitydiary_118.apk -policy fuzzing_run -config-script script_samples/diary_activity_ignore_view_diffs_script.json -grant_perm -is_emulator -keep_app -o ./tmp-diary/ -mutant ./tmp-diary/seed-tests/seed-test-1/mutant-1 -coverage
Here,
-policy: fuzzing_run (run the mutant)
-mutant: the directory of one specified mutant (Genie will automatically infer its seed test)
-keep_app: keep the app after running the mutant
-config-script: the script that specifies which views, or the order of which children views, can be ignored during oracle checking (this file is very important for reducing false positives).
Note:
-
Please do not use -keep_env, which may cause issues.
-
Please use -keep_app, which prevents the app from being uninstalled after running the mutant test. Before running each mutant, Droidmutant restores the app to its original state, so we do not need to worry about leftover state.
-
We may not need to specify -interval.
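To run every mutant of one seed test in sequence, the command above can be repeated per mutant directory. The sketch below assumes the seed-tests/seed-test-N/mutant-M layout used in this document and only builds the fuzzing_run command with the options shown above; the helper names `mutant_dirs` and `run_mutant_cmd` are hypothetical.

```python
from pathlib import Path
from typing import List


def mutant_dirs(seed_dir: Path) -> List[Path]:
    """List the mutant-<n> directories of one seed test, ordered by n."""
    dirs = [
        p for p in seed_dir.glob("mutant-*")
        if p.is_dir() and p.name.split("-")[-1].isdigit()
    ]
    return sorted(dirs, key=lambda p: int(p.name.split("-")[-1]))


def run_mutant_cmd(
    apk: str,
    out_dir: str,
    config_script: str,
    mutant: Path,
    device: str = "emulator-5554",
) -> List[str]:
    """Assemble the argv that runs a single mutant (hypothetical helper)."""
    return [
        "python3", "-m", "droidbot.start",
        "-d", device,
        "-a", apk,
        "-policy", "fuzzing_run",
        "-config-script", config_script,
        "-grant_perm", "-is_emulator", "-keep_app",
        "-o", out_dir,
        "-mutant", str(mutant),
        "-coverage",
    ]


# Prints one command per mutant; nothing is printed if the directory is absent.
for mutant in mutant_dirs(Path("./tmp-diary/seed-tests/seed-test-1")):
    cmd = run_mutant_cmd(
        "apps_for_test/de.rampro.activitydiary_118.apk",
        "./tmp-diary/",
        "script_samples/diary_activity_ignore_view_diffs_script.json",
        mutant,
    )
    print(" ".join(cmd))
```

Sorting by the numeric suffix keeps mutant-2 ahead of mutant-10, which a plain string sort would not.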
python3 -m droidbot.start -d emulator-5554 -a apps_for_test/de.rampro.activitydiary_118.apk -policy fuzzing -count 10000 -max_seed_test_suite_size 10 -max_random_seed_test_length 15 -max_independent_trace_length 8 -max_mutants_per_insertion_position 30 -config-script script_samples/diary_activity_ignore_view_diffs_script.json -grant_perm -is_emulator -interval 1 -o ./tmp-diary -script user_script.json
Here,
-policy: fuzzing (it generates one property-preserving mutant and executes it at a time; mainly used for debugging). If we use fuzzing_gen, then we only generate mutants and dump them without actual execution. That option is used for distributed testing (see below);
-count: the total number of events for generating seed tests (just give it a large enough number);
-max_seed_test_suite_size: the number of seed tests to generate;
-max_random_seed_test_length: the maximal length of each seed test;
-max_independent_trace_length: the maximal length of the independent trace that is inserted into the seed test;
-max_mutants_per_insertion_position: the maximal number of mutants generated for one insertion position;
-max_mutants_per_seed_test: the maximal number of mutants generated for one seed test;
-o: specify the same directory as in Phase 1, which we use to recover the model constructed in Phase 1
-coverage: dump the coverage for seed tests and mutant tests
-config-script: the script that specifies which views, or the order of which children views, can be ignored during oracle checking (this file is very important for reducing false positives).
We provide a script to run model construction and seed/mutant generation together.
python3 deploy/prerun.py base apps_for_test/de.rampro.activitydiary_118.apk ./tmp-diary --no-headless --script script_samples/user_script.json --offset 3
Positional arguments: avd, apk, out
Optional arguments:
--no-model: do not construct the model
--no-seed: do not generate seeds and mutants
tmp-diary/ -- data of the original utg model
tmp-diary/{app_package_name}_testing_result.txt -- the final testing results (valid only when using 2.1)
tmp-diary/seed-tests/ -- data of all randomly generated seed tests
tmp-diary/seed-tests/seed-test-1/ -- data of the seed test and all its mutants
tmp-diary/seed-tests/seed-test-1/mutant-1 -- data of the mutant test
tmp-diary/seed-tests/seed-test-1/mutant-1/index_x.html -- the execution results with annotation info to highlight the semantic errors
tmp-diary/seed-tests/seed-test-1/mutant-1/checking_result.json -- the detailed results of oracle checking
tmp-diary/seed-tests/seed-test-1/mutant-1/gui_diff_analysis.txt -- the summary results of oracle checking
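Given the layout above, the oracle-checking results of all executed mutants can be located with a short directory walk. The sketch below relies only on the file and directory names listed above; the helper name `collect_checking_results` is hypothetical, and the JSON files are not parsed because their schema is not described here.

```python
from pathlib import Path
from typing import Dict, List


def collect_checking_results(out_dir: Path) -> Dict[str, List[Path]]:
    """Map each seed test name to the checking_result.json files of its mutants."""
    results: Dict[str, List[Path]] = {}
    seed_root = out_dir / "seed-tests"
    for seed_dir in sorted(seed_root.glob("seed-test-*")):
        # A mutant that was generated but never executed has no result file,
        # so it simply does not appear in the list.
        results[seed_dir.name] = sorted(seed_dir.glob("mutant-*/checking_result.json"))
    return results


# Prints one line per seed test; nothing is printed if the directory is absent.
for seed_name, files in collect_checking_results(Path("./tmp-diary")).items():
    print(f"{seed_name}: {len(files)} mutant(s) with oracle-checking results")
```

Comparing the count per seed test against the number of mutant-* directories gives a quick view of how many mutants still need to be executed.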
We currently support distributed testing on Android emulators.
python3 -m deploy.start --no-headless -n 8 --apk apps_for_test/de.rampro.activitydiary_118.apk -o ./tmp-diary/ --script script_samples/diary_activity_ignore_view_diffs_script.json --timeout 900 --offset 2
Here,
-n N: the number of emulators/devices that will be used for distributed testing
--offset N: specify the starting emulator serial number. If N=1, the emulators start from emulator-5556
--timeout: the maximum allowed testing time allocated for each mutant test. If a mutant test times out, we run it one more time and give up if it times out again.
--script: the ignore view diffs script
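The relation between --offset and emulator serials follows from how Android emulators are numbered: the first instance uses console port 5554 and each further instance uses the next even port. The sketch below reproduces that arithmetic. It assumes that the -n emulators occupy consecutive ports starting at the offset, which matches the N=1 example above but is not confirmed for larger setups; the function name `emulator_serials` is hypothetical.

```python
from typing import List

BASE_CONSOLE_PORT = 5554  # console port of the first Android emulator instance


def emulator_serials(n: int, offset: int) -> List[str]:
    """Return the serials of n emulators, assuming consecutive even console ports."""
    return [f"emulator-{BASE_CONSOLE_PORT + 2 * (offset + i)}" for i in range(n)]


print(emulator_serials(1, 1))  # ['emulator-5556']
print(emulator_serials(8, 2))  # 'emulator-5558' through 'emulator-5572'
```

A nonzero offset is useful for keeping lower-numbered emulators, such as the emulator-5554 instance used in the single-device commands, out of the distributed run.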
Other options:
--no-headless: do not hide the emulators
--no-trie-reduce: do not use the trie to prune infeasible mutant tests (By default, we do not add this option, so the trie is used to prune infeasible mutant tests.)
--no-trie-load: do not load trie data from the previous log (By default, we do not add this option, so trie data from the previous run is loaded.)
--no-skip: do not skip any executed mutant tests (By default, we do not add this option, so already executed mutant tests are skipped if we restart the fuzzing.)
--no-coverage: do not dump the coverage of mutant tests (By default, we do not add this option, so coverage is dumped for each mutant test.)
--seeds-to-run: all, or the ids of the seed tests to run, like "1;2;3" (run all the mutant tests of the seed tests with ids 1, 2, 3; remember to add the quotes)
--debug or --test-single-mutant: the file path of a mutant (run a single mutant); this implies --no-skip
--interval: interval in seconds between two consecutive events (Default: 0)
--trie-delete: delete infeasible mutants according to the trie structure during fuzzing
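The value format of --seeds-to-run can be illustrated with a small parser. This is a sketch of how such a value could be interpreted, not Genie's own parsing code; the function name `parse_seeds_to_run` and the choice of None to represent "all" are assumptions.

```python
from typing import List, Optional


def parse_seeds_to_run(value: str) -> Optional[List[int]]:
    """Return None for 'all', otherwise the seed test ids listed in the value."""
    if value.strip() == "all":
        return None
    # Ids are separated by semicolons, e.g. "1;2;3"; empty parts are ignored.
    return [int(part) for part in value.split(";") if part.strip()]


print(parse_seeds_to_run("1;2;3"))  # [1, 2, 3]
print(parse_seeds_to_run("all"))    # None
```

The quotes matter on the command line because an unquoted semicolon ends the command in most shells, so `--seeds-to-run 1;2;3` would pass only `1` and then try to run `2` and `3` as separate commands.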
Note: In some cases, we may want to rerun the mutants of some specific seeds. To do so, we can simply append --no-skip to the original command line. If we also add --no-trie-reduce, then all the mutants will be executed again without using the previous trie to prune unreplayable mutants. Another way is to manually delete the corresponding log files and rerun.
Please see script_samples/run_genie.sh for reference.
python3 -m droidbot.postprocess -o ./tmp-diary/ -f script_samples/app-ActivityDiary-script/checking_config.json
Output report, e.g., ./tmp-diary/merged_results.txt
The report includes:
-
#mutants and #executed_mutants
-
crash errors and semantic errors
We can focus on Crash Errors and Semantic Errors.
$ python3 -m deploy.coverage.merge <output-dir> --seed-cov-out <merged-seed-coverage-file-name> --mutant-cov-out <merged-mutant-coverage-file-name>
or
$ python3 -m deploy.coverage.merge <output-dir> --seed --mutant
In this case, seed.ec and mutant.ec will be generated under <output-dir>.
Example (merge all coverage files of seeds and mutants):
python3 -m deploy.coverage.merge /mnt/droidbot-share/test-ActivityDiary-r1 --seed-cov --mutant-cov
Merge coverage files of specific seeds or mutants
python3 -m deploy.coverage.merge /mnt/droidbot-share/test-ActivityDiary-r1 --seed-cov --mutant-cov [--seeds '1;2;3'] [--mutants '1;2;3']
$ python3 -m deploy.coverage.report --project ~/Projs/app-coverage-analysis/apps/org.jtb.alogcat_43_src-gradle --class app/build/intermediates/classes/ --source app/src/main/java/ coverage.ec[, coverages] <html-report-gen>
Example:
python3 -m deploy.coverage.report --project ~/droid_repo/ActivityDiary --class app/build/intermediates/javac/debug/compileDebugJavaWithJavac/classes/ /mnt/droidbot-share/test-ActivityDiary-r1/mutant.ec /mnt/droidbot-share/test-ActivityDiary-r1-mutant-report
To debug and verify Genie, we offer a number of debugging strategies.
This strategy lets us inspect whether Genie can indeed generate specific mutants from a given seed test. Usually, we specify this seed test ourselves.
python3 -m droidbot.start -d emulator-5554 -a apps_for_test/de.rampro.activitydiary_118.apk -policy fuzzing_gen -count 100000 -max_seed_test_suite_size 1 -max_random_seed_test_length 1 -max_independent_trace_length 8 -max_mutants_per_insertion_position 100 -grant_perm -is_emulator -interval 1 -coverage -script script_samples/diary_seed_test.json -o ./tmp-diary
Note:
-
We add the option -script script_samples/diary_seed_test.json so that Genie will execute this predefined seed test first before generating random seed tests.
-
We set -max_seed_test_suite_size 1 -max_random_seed_test_length 1 so that Genie will only execute the given seed test and will not generate random seed tests.
-
We set -max_independent_trace_length 8 -max_mutants_per_insertion_position 100 to control the mutant generation.
We provide a feature that shows the seed test, the mutant test, and the critical events (necessary for mutation) on the clustered utg. After the following commands run successfully, a file named index_clustered_utg_with_annotated_test.html is dumped under the output directory, which annotates the seed test or the mutant test in red and the critical views in blue.
Note: When a seed test is executed, the utg event ids (of the transitions of this seed test) are updated (i.e., increased). Thus, when you execute a seed test, generate a number of mutant tests, and want to verify the results, you should run the following commands afterwards so that the output reflects the changed utg event ids.
Example usages:
Show a given seed test on the clustered utg.
python3 -m droidbot.debug --apk apps_for_test/instrumented_apps/org.tasks.debug-gplay-6.6.5.apk --output test-tasks-model-1/ --seed test-tasks-model-1/seed-tests/seed-test-17
Show a given mutant test (with its seed test) on the clustered utg.
python3 -m droidbot.debug --apk apps_for_test/instrumented_apps/org.tasks.debug-gplay-6.6.5.apk --output test-tasks-model-1/ --mutant test-tasks-model-1/seed-tests/seed-test-17/mutant-113/
python3 -m droidbot.debug --apk apps_for_test/instrumented_apps/org.tasks.debug-gplay-6.6.5.apk --output test-tasks-model-1/ test-tasks-model-1/seed-tests/seed-test-17 --views script_samples/app-tasks-script/critical_views.json