Do you have a problem with duplicated functions in your coding project, for example a Java codebase? This tool may be the solution for finding them. Specify the location of your repository in the `local-config.properties` file (just make a copy of `config.properties`) and run the code. You will then get a file in the `results` folder listing all the places where functions match.
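For example, a minimal `local-config.properties` could contain just the repository location (the property name is taken from the configuration table below; the path is a placeholder):

```properties
# Required: where the code to analyze is located
PATH_TO_CODE=/path/to/your/repository/
```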
- LLM support - queries an LLM (large language model) in complicated situations where two or more functions could be similar.
- Configuration - the configuration is easy to change.
- Comparison - matching based on the names of lines and functions.
- Tested on both UNIX and Windows.
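To illustrate what name- and line-based matching is looking for, here is a hypothetical pair of methods (names and logic invented for this example) that such a comparison would flag as near-duplicates:

```java
// Hypothetical example of the kind of duplication the tool is meant to flag:
// two methods whose lines and structure match almost exactly.
public class DuplicateExample {
    static int sumOfSquares(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v * v;
        }
        return total;
    }

    // Near-duplicate: the same lines, only the identifiers differ.
    static int squaredSum(int[] numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n * n;
        }
        return total;
    }
}
```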
To run the test files, use this command with Gradle:

```
./gradlew test
```

To run the tests directly in IntelliJ or Eclipse (or an IDE of your choice), just run the test files and it should work.
These are only suggestions; you may use any source and model you want.

- Either download the bin file we used for testing directly from Hugging Face: ggml-model-gpt4all-falcon-q4_0.bin (4.06 GB).
- Or choose a download from the gpt4all-falcon-ggml page: https://huggingface.co/nomic-ai/gpt4all-falcon-ggml/tree/main
- You may also download any other model of your taste and use that one instead.
Create and update the local file `local-config.properties` (make a copy of `config.properties` and edit it). After running, you will get a file in the `results` folder with the places where functions/methods of different code files are similar or matching.
- Move the downloaded LLM into a new folder called `models` in the main repository.
- In `local-config.properties`, set `LLM_FILE` to the path of your downloaded model, for example `models/ggml-model-gpt4all-falcon-q4_0.bin`.
- In `local-config.properties`, set `RUN_WITH_LLM` to `true`.
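Putting these steps together, the LLM-related part of `local-config.properties` would look something like this (using the falcon model suggested above):

```properties
RUN_WITH_LLM=true
LLM_FILE=models/ggml-model-gpt4all-falcon-q4_0.bin
```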
By default, results are written to the `./results/` folder. Also by default, result files are named `result_{nr}`, where `{nr}` is an incremental number (up to `RESULTS_FILE_LIMIT`).
You can edit configurations in `local-config.properties`. However, the main configuration is still loaded from the main properties first, then the local properties, and finally the testing properties (the testing properties are reset on every run).
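This layering can be pictured as a simple merge where later layers override earlier ones. The following is a minimal sketch of the idea, not the project's actual loader:

```java
import java.util.Properties;

public class LayeredConfig {
    // Merge two configuration layers; entries in 'overrides' win over 'base',
    // mirroring how local properties override the main properties.
    static Properties merge(Properties base, Properties overrides) {
        Properties merged = new Properties();
        merged.putAll(base);
        merged.putAll(overrides);
        return merged;
    }
}
```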
Property | Default value (if not set) | Type | Description |
---|---|---|---|
LOGGING_LEVEL | INFO | String | The level at which to show logging. |
PATH_TO_CODE | src/test/resources/ | String | Required. Where the code to analyze is located. |
PATH_TO_RESULTS | results | String | Results directory. |
RESULT_NAME_PREFIX | result_{nr} | String | Result file name. {nr} is replaced by an incremental number. |
RESULTS_FILE_LIMIT | 100 | Integer | A new file is created on each run, so there is a limit. |
EXCLUDED_PATHS | [] | List | Comma-separated. Files/paths to exclude during a run. |
RUN_WITH_LLM | false | Boolean | Whether to use the LLM during the run. |
LLM_FILE | models/ggml-model-gpt4all-falcon-q4_0.bin | String | If running with the LLM, this must point to a model. |
LLM_THREADS | 4 | Integer | How many threads the LLM will use. |
TEMP_FILE_ENABLED | false | Boolean | Whether to use a temporary file. |
TEMP_FILE | debugFile.txt | String | Where the temporary file is located. |
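As an illustration, a `local-config.properties` overriding a few of these defaults might look like this (the values are hypothetical):

```properties
LOGGING_LEVEL=DEBUG
PATH_TO_CODE=src/test/resources/
EXCLUDED_PATHS=legacy/,generated/
LLM_THREADS=8
```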
The different logging levels are the following (an X marks which message types are shown at each level):

Level | ERROR | WARNING | INFO | DEBUG | TRACE |
---|---|---|---|---|---|
NONE | | | | | |
ERROR | X | | | | |
WARNING | X | X | | | |
INFO | X | X | X | | |
DEBUG | X | X | X | X | |
TRACE | X | X | X | X | X |