DummySearch is a full-text search and text comparison engine based on the TF-IDF metric. All data operations are performed via a REST API. Documents in an index may carry extra data (meta), but it is not used in search.
You can index text in any language, but stemming relies on the snowball stemmer (https://github.com/kljensen/snowball), so the supported languages are limited to:
- English,
- Spanish (español),
- French (le français),
- Russian (ру́сский язы́к),
- Swedish (svenska),
- Norwegian (norsk)
DummySearch recalculates TF-IDF automatically in the background once every UpdatePeriod.
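To make the metric concrete, here is a minimal Python sketch of one common TF-IDF variant: term frequency normalized by document length, multiplied by ln(N / df). The function name `tf_idf` and the whitespace tokenizer are illustrative assumptions; DummySearch's actual formula, tokenization, and stemming may differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace, no stemming.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

In this variant a term that occurs in every document gets weight 0, so only distinguishing terms contribute to a score.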
$ go build -o build/dummysearch cmd/dummysearch/main.go
$ docker build -t dummysearch .
$ ./build/dummysearch
$ docker run -p 6745:6745 -it -d dummysearch
Creates an index with the specified config. See: config
$ curl --location --request POST 'http://localhost:6745/' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "lol",
"config": {
"language": "english",
"updatePeriod": "120s",
"autoUpdate": true,
"customIds": false
}
}'
Response:
{
"status": true,
"payload": {
"Message": "OK"
}
}
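The same request can be built from code. The sketch below uses only the Python standard library and mirrors the curl call above; the helper name `build_create_index_request` and the `BASE_URL` constant are illustrative assumptions, not part of DummySearch.

```python
import json
import urllib.request

# Assumed address, taken from the run commands (port 6745).
BASE_URL = "http://localhost:6745"


def build_create_index_request(name, language="english", update_period="120s",
                               auto_update=True, custom_ids=False):
    """Build (but do not send) the POST request that creates an index."""
    body = {
        "name": name,
        "config": {
            "language": language,
            "updatePeriod": update_period,
            "autoUpdate": auto_update,
            "customIds": custom_ids,
        },
    }
    return urllib.request.Request(
        BASE_URL + "/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a running server, `urllib.request.urlopen(build_create_index_request("lol"))` sends the request.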
curl --location --request DELETE 'http://localhost:6745/lol/'
Response:
{
"status": true,
"payload": {
"Message": "OK"
}
}
curl --location --request POST 'http://localhost:6745/lol/' \
--header 'Content-Type: application/json' \
--data-raw '{
"content": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
"meta": {
"someField": "any value",
"otherField": 1
},
"id": "1"
}'
Response:
{
"status": true,
"payload": {
"Message": "OK",
"DocumentId": "1"
}
}
curl --location --request POST 'http://localhost:6745/lol/batch' \
--header 'Content-Type: application/json' \
--data-raw '[
{
"content": "some text!",
"meta": {
"foo": "bar"
},
"id": "1"
},
{
"content": "london is the capital of great britain",
"meta": {
"bar": "baz"
},
"id": "2"
},
{
"content": "any other, text.",
"meta": {
"foo": "bar2"
},
"id": "3"
}
]'
Response:
{
"status": true,
"payload": {
"Message": "OK",
"DocumentIds": [
"1",
"2",
"3"
]
}
}
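Hand-written JSON arrays are easy to get wrong (a trailing comma after the last element makes the body invalid), so it can be safer to generate the batch body. A minimal sketch, assuming documents are given as (content, meta, id) tuples; `build_batch_payload` is a hypothetical helper name.

```python
import json


def build_batch_payload(documents):
    """Serialize (content, meta, id) tuples into a batch request body."""
    items = [
        {"content": content, "meta": meta, "id": doc_id}
        for content, meta, doc_id in documents
    ]
    # json.dumps always emits valid JSON, so no trailing commas can slip in.
    return json.dumps(items)
```

The resulting string can be sent as the body of the POST request to the batch endpoint shown above.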
curl --location --request GET 'http://localhost:6745/lol/update'
Response:
{
"status": true,
"payload": {
"Message": "Index updating"
}
}
The source text content is not stored, so you can only retrieve a document's meta.
curl --location --request GET 'http://localhost:6745/lol/0'
Response:
{
"status": true,
"payload": {
"Doc": {
"Meta": {
"otherField": 1,
"someField": "any value"
}
}
}
}
curl --location --request DELETE 'http://localhost:6745/lol/0'
Response:
{
"status": true,
"payload": {
"Message": "OK"
}
}
curl --location --request GET 'http://localhost:6745/lol/search?query=lorem%20london'
Response:
{
"status": true,
"payload": [
{
"DocId": "2",
"Meta": {
"bar": "baz"
},
"Score": 0.26726124191242445
},
{
"DocId": "0",
"Meta": {
"otherField": 1,
"someField": "any value"
},
"Score": 0.07669649888473704
}
]
}
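The query string must be URL-encoded, and the response carries one entry per matching document. The sketch below builds the search URL and extracts (DocId, Score) pairs from a response shaped like the one above. The helper names are assumptions, and the explicit sort is defensive: the source does not state whether the server guarantees an ordering.

```python
import json
from urllib.parse import quote, urlencode


def build_search_url(index, query, base_url="http://localhost:6745"):
    """Return the search URL with the query percent-encoded (spaces as %20)."""
    return f"{base_url}/{quote(index)}/search?" + urlencode({"query": query}, quote_via=quote)


def ranked_hits(response_text):
    """Return (DocId, Score) pairs from a search response, best score first."""
    body = json.loads(response_text)
    if not body.get("status"):
        return []
    hits = body.get("payload") or []
    return sorted(((hit["DocId"], hit["Score"]) for hit in hits),
                  key=lambda pair: pair[1], reverse=True)
```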
curl --location --request GET 'http://localhost:6745/lol/compare?doc1=1&doc2=3'
Response:
{
"status": true,
"payload": {
"score": 0.14907119849998599
}
}
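The source does not say how the comparison score is computed. A common choice for comparing two TF-IDF vectors is cosine similarity, sketched below under that assumption; `cosine_similarity` is an illustrative name, and the vectors are plain {term: weight} dicts.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse {term: weight} vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty or all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)
```

For non-negative weights the result lies between 0 (no shared terms) and 1 (same direction).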
- Language - the language of the index. An index has exactly one language. Words from other languages in a document are simply not stemmed.
- UpdatePeriod - the interval between TF-IDF recalculations. For example, if UpdatePeriod is "60s" and AutoUpdate is enabled, a recalculation is triggered every 60 seconds, but the process first checks whether the index has changes.
- AutoUpdate - enables or disables automatic recalculation. If AutoUpdate is disabled, you must call the Calculate TF-IDF endpoint yourself.
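The interaction between UpdatePeriod and AutoUpdate can be sketched as follows. This is an illustrative model, not DummySearch's implementation: `parse_period` handles only single-unit values such as "60s", and `IndexState` with its `dirty` flag is a hypothetical stand-in for the change check.

```python
import re

_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_period(text):
    """Parse a single-unit duration such as "60s" or "2m" into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


class IndexState:
    """Tracks whether an index changed since the last recalculation."""

    def __init__(self, auto_update=True):
        self.auto_update = auto_update
        self.dirty = False
        self.recalculations = 0

    def add_document(self):
        self.dirty = True

    def tick(self):
        """Called once per UpdatePeriod; recalculates only when needed."""
        if not self.auto_update or not self.dirty:
            return False
        self.recalculations += 1  # placeholder for the real TF-IDF pass
        self.dirty = False
        return True
```

A timer would call `tick()` every `parse_period(update_period)` seconds; with `auto_update=False`, `tick()` never recalculates, which matches the need to call the update endpoint manually.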