Hyperion

Of Hyperion we are told that he was the first to understand, by diligent attention and observation, the movement of both the sun and the moon and the other stars, and the seasons as well, in that they are caused by these bodies, and to make these facts known to others; and that for this reason he was called the father of these bodies, since he had begotten, so to speak, the speculation about them and their nature.
— Diodorus Siculus (5.67.1)

Hyperion is a tool for analysing Java test programs in order to compute multiple similarity metrics. To this end, Hyperion relies on JBSE to carry out symbolic execution of JUnit test programs, generate Prolog facts, and run multiple analyses on these facts.

Dependencies

Hyperion has several dependencies:

  • JBSE (currently bundled in the project)
  • z3
  • SWI Prolog

z3 is an external dependency and must be available on the system path for the tool to run correctly. Similarly, SWI Prolog must be installed manually on the system.

Configuring integration with SWI Prolog

To interact with SWI Prolog, Hyperion uses JPL. While the Java bindings are resolved through Maven, JPL still needs to know how to interact with SWI Prolog, and this requires setting some environment variables. Depending on your OS, this configuration requires some care; refer to the official JPL deployment page for your OS.

Running

The source is organized as a Maven project, so running mvn package should be enough to get everything built and ready to run.

Hyperion can then be run with different command-line options, as follows.

Running Symbolic Analysis

To run the symbolic analysis, the command is:

java -jar target/hyperion-shaded-1.0-SNAPSHOT.jar --analyze <path to JSON config file>

A sample JSON config file to drive the analysis of sets of test programs is located at src/main/resources/analyze-config.json in this repository. The structure of this JSON configuration file is:

{
  "sut": [
    "path to classes 1",
    "path to classes 2"
  ],
  "testPrograms": [
    "path to test classes 1",
    "path to test classes 1"
  ],
  "includeTest": [
    "list",
    "of",
    "@Test",
    "methods",
    "to",
    "analyze"
  ],
  "excludeTest": [
    "list",
    "of",
    "@Test",
    "methods",
    "to",
    "skip"
  ],
  "additionalClasspath": [
    "path",
    "to",
    "any",
    "other",
    "needed",
    "dependency"
  ],
  "excludeTracedPackages": [
    "java/",
    "sun/"
  ],
  "depth": 100,
  "timeout": 5,
  "skip": 0,
  "outputFile": "invokes.pl"
}

If outputFile is not set, the output file defaults to inspection-YYYY-MM-DDTHH:SSZ.pl, so that different runs store the generated invokes facts in different files.

Computing Similarity Relations

To compute similarity relations, the command is:

java -jar target/hyperion-shaded-1.0-SNAPSHOT.jar --extract-similarity <path to JSON config file>

A sample JSON config file to drive the computation of similarity relations is located at src/main/resources/similarity-config.json in this repository. The structure of this JSON configuration file is:

{
  "invokes": [
    "path", "to", "list", "of", "invoke", "files", "generated", "from", "analyze", "phase"
  ],
  "regex": "path to prolog rules defining regular expressions to match endpoints",
  "metric": "the name of the similarity metric to run",
  "outputFile": "file to dump the test similarity groups"
}

An example of a regex file can be found in the repository at src/main/resources/sose/URI-regex-list.pl.

The metrics that can be specified are the ones discussed in the next section about Prolog.

If outputFile is not set, the output is directed to stdout.

Playing with Prolog

Encoding of method invocations

Format of the invokes Prolog facts generated by the analysis phase:

invokes(TestProgram, BranchingPointList, SeqNum, Caller, ProgramPoint, FrameEpoch, PathCondition, Callee, Parameters)
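For illustration, a concrete invokes fact might look like the following; every value below (test program identifier, signatures, program point, path condition, and parameter list) is hypothetical and is only meant to show the arity and argument order:

% Hypothetical example: the 3rd invocation recorded for testCheckout, reached through
% the branching points [1,1,2], calls HttpClient.execute under a trivially true path
% condition and with no recorded parameters.
invokes('com.example.CartTest:testCheckout', [1,1,2], 3,
        'com/example/CartTest:testCheckout', 42, 1,
        true, 'org/apache/http/client/HttpClient:execute', []).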

Encoding of remote API invocations

Format of the endpoint Prolog facts:

endpoint(TestProgram, Caller, HTTPMethod, URI)

It states that the method Caller of the test program TestProgram invokes the remote API identified by URI using the HTTP method HTTPMethod. These facts can be generated by using the helper predicate generate_and_assert_endpoints described below.
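For illustration, a hypothetical endpoint fact (names and URI invented for the example) could be:

% Hypothetical example: testCheckout issues a POST request to the /orders endpoint.
endpoint('com.example.CartTest:testCheckout', 'com/example/CartTest:testCheckout', 'POST', '/orders').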

Basic queries for reasoning about similarity

To load similarity_relations.pl:

consult('src/main/prolog/similarity_relations.pl').

This file defines the basic rules for evaluating similarity among test programs.

To load testing_similarity_relations.pl:

consult('src/main/prolog/testing_similarity_relations.pl').

This file loads similarity_relations.pl and also defines some utility predicates to generate reports on the similarity analysis.

To get an execution trace Trace of the method annotated as @Test in the test program TP:

trace(TP,Trace).

(Trace is a list of invokes).
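For example, TP can be bound to a specific test program identifier (the one below is hypothetical) to inspect its trace interactively:

% Hypothetical test program identifier; N is bound to the number of invokes in the trace.
trace('com.example.CartTest:testCheckout', Trace), length(Trace, N).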

To get a maximal sequence of direct method invocations MSeq performed by a caller M in the test program TP:

invoke_sequence(TP,M,ISeq), invokes_callees(ISeq,MSeq).

(ISeq is a list of invokes).
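As with trace/2, TP and M can be bound to specific identifiers (the ones below are hypothetical) to inspect a single caller:

% Hypothetical identifiers: the callees directly invoked by the @Test method of one test program.
invoke_sequence('com.example.CartTest:testCheckout', 'com/example/CartTest:testCheckout', ISeq),
invokes_callees(ISeq, MSeq).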

To generate endpoint Prolog facts:

generate_and_assert_endpoints(EpSrc).

If EpSrc=trace, the endpoint facts will be generated from traces. If EpSrc=iseq, the endpoint facts will be generated from sequences of method invocations.

To get a pair of similar test programs TP1 and TP2:

similar_tp(EpSrc,SimCr,TP1,TP2,Es1,Es2).

where:

  • EpSrc specifies the source of the endpoint facts,
  • SimCr specifies the criterion used to evaluate similarity between TP1 and TP2 (one of: nonemptyEqSet, nonemptySubSet, nonemptyIntersection),
  • Es1 and Es2 are nonempty lists of endpoint facts generated from the invokes facts of TP1 and TP2, respectively, that make the test programs similar.
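For example, a query that generates trace-based endpoint facts and then looks for pairs of test programs whose endpoint sets share at least one element (the nonemptyIntersection criterion) could be:

% Generate endpoint facts from traces, then enumerate similar pairs on backtracking.
generate_and_assert_endpoints(trace),
similar_tp(trace, nonemptyIntersection, TP1, TP2, Es1, Es2).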

To get the similarity score between two lists of endpoint facts:

similarity_score(SimCr,Es1,Es2,Score).
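For example, building on the previous query, the score of each similar pair can be computed in the same conjunction:

% Score each pair of similar test programs found with the nonemptyIntersection criterion.
similar_tp(trace, nonemptyIntersection, TP1, TP2, Es1, Es2),
similarity_score(nonemptyIntersection, Es1, Es2, Score).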

Evaluating similarity of test programs: a step-by-step guide

  • Step 1. Load testing_similarity_relations.pl:
consult('src/main/prolog/testing_similarity_relations.pl').
  • Step 2. Load rest_api_regex facts representing regular expressions used to match API URIs (the 4th component of endpoint facts) when evaluating the similarity of endpoint facts:
consult('src/test/resources/report/URI-regex-list.pl').
  • Step 3. Run the following query to:
    • generate the endpoint facts from the execution traces (by using generate_and_assert_endpoints(EpSrc)),
    • evaluate similarity between test programs using the nonemptyIntersection criterion (by using similar_tp(EpSrc,SimCr,TP1,TP2,Es1,Es2)), and
    • compute the similarity score (by using similarity_score(SimCr,Es1,Es2,Score)).
similarity_from_invokes_file('src/test/resources/report/inspection-invokes.pl',trace,nonemptyIntersection).

where inspection-invokes.pl is the dataset of invokes facts used to generate the endpoint facts. The above query generates two files:

  • similarEndpoints-trace-report.csv, including the pairs of similar programs (3rd and 4th column) together with the corresponding score (5th column);

  • similarEndpoints-trace-report.txt, including the pairs of similar programs together with the lists of endpoint facts Es1 and Es2 of TP1 and TP2, respectively, that make the two test programs similar.