
Releases: lnx-search/lnx

v0.9.0 Master

05 Oct 16:57
23a1df8

v0.9.0 Master

This is a release cut from the current master branch before the 0.10 work begins.

What's Changed

  • LNX-97: Add Cargo.lock by @ChillFish8 in #98
  • Fix non-stored fields appearing as nulls in JSON search response by @oka-tan in #100
  • Use snowball stemmer stop words for some languages by @saroh in #102

Full Changelog: 0.9.0...0.9.0-master

🚀 Version 0.9.0

25 Jun 15:57
6fbb7de

Version 0.9.0

This is a breaking release and will require you to re-index your data and re-create schemas.

It's been a little while since the last release. 0.9 isn't a huge release; however, a lot of work has gone into preparing for 0.10, which should hopefully add high availability to the search instances.

What's New

  • Synonym support is now added! Finally! My life is complete! I can retire! On a more serious note, yes, it's added and can be adjusted using the /indexes/:index/synonyms endpoint via POST, GET and DELETE requests. There's also a /indexes/:index/synonyms/clear DELETE endpoint which allows you to clear all synonyms. The syntax for adding synonyms is a semi-relational structure where you provide a list of strings in the format <word>,<word>:<synonym>,<synonym>,<synonym>, which sets all the words on the left of the : to have the given synonyms. This allows you to define synonyms fairly easily for related words, e.g. iphone,apple,phone:apple,phone,iphone
  • All loaded stop words can be viewed via the /indexes/:index/stopwords GET endpoint.
  • Returned documents are now converted to be in line with the defined schema, i.e. fields with multi set to false will be returned as single values, and if no value is set the field will be returned as null rather than being omitted entirely.
  • You can now mark fields as required, which will cause lnx to reject any documents that are missing required fields (this will also reject fields that are provided but have empty values, i.e. "foo": []).
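
As a rough illustration of the synonym relation syntax described above, here is a minimal Python sketch that expands one relation string into a word -> synonyms mapping. The function name is hypothetical and this is not lnx's own parser; in particular, whether lnx drops a word from its own synonym list is not stated in these notes, so the sketch keeps the list exactly as given.

```python
def parse_synonym_relation(relation: str) -> dict[str, list[str]]:
    """Expand '<word>,<word>:<synonym>,<synonym>' into a word -> synonyms map.

    Hypothetical helper for illustration only, not part of lnx.
    """
    words_part, separator, synonyms_part = relation.partition(":")
    if not separator:
        raise ValueError(f"missing ':' separator in {relation!r}")

    # Tolerate stray spaces around the commas.
    words = [w.strip() for w in words_part.split(",") if w.strip()]
    synonyms = [s.strip() for s in synonyms_part.split(",") if s.strip()]
    if not words or not synonyms:
        raise ValueError(f"relation needs words and synonyms: {relation!r}")

    # Every word on the left of ':' receives the full synonym list.
    return {word: list(synonyms) for word in words}


mapping = parse_synonym_relation("iphone,apple,phone:apple,phone,iphone")
```

Each of the three words on the left ends up mapped to the same three synonyms, which is what makes the single-line form convenient for groups of related words.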

What's Different

  • Fields now have a multi field attribute to set them as being multi-value. If they're not multi-value but multiple values are provided, the system will take the last value in the array.
  • The fast attribute for fields is now a bool rather than single/multi. It was generally a bit confusing for users to know when they wanted single and when they wanted multi, so this is now handled internally. You only have to say whether you want it to be a fast field or not.
  • Fast fuzzy now scores by edit distance and the BM25 score, making for much better relevancy when searching.
  • Fast fuzzy now uses traditional word -> terms lookups rather than the compound correction.
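
The document conversion rules in this release (single-value fields take the last value provided, unset fields come back as null, required fields reject missing or empty values) can be sketched roughly as follows. The schema shape and function name here are assumptions made for illustration and do not reflect lnx's actual schema format; the behaviour for an unset multi-value field (an empty list) is also an assumption.

```python
def convert_document(doc: dict, schema: dict) -> dict:
    """Shape a document according to a simplified, hypothetical schema.

    `schema` maps field name -> {"multi": bool, "required": bool}.
    """
    converted = {}
    for name, options in schema.items():
        raw = doc.get(name)
        # Normalise every input to a list of values first.
        if raw is None:
            values = []
        elif isinstance(raw, list):
            values = raw
        else:
            values = [raw]

        # Required fields reject both missing and empty values.
        if options.get("required", False) and not values:
            raise ValueError(f"missing required field {name!r}")

        if options.get("multi", False):
            converted[name] = values  # assumption: unset multi field -> []
        else:
            # Single-value field: keep the last value, or None (null) if unset.
            converted[name] = values[-1] if values else None
    return converted


schema = {
    "title": {"multi": False, "required": True},
    "tags": {"multi": True},
    "year": {"multi": False},
}
result = convert_document({"title": ["a", "b"], "tags": "x"}, schema)
```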

What's Fixed

  • lnx will now return an error if you try to sort by a multi-value field. Previously this caused a panic if you had the fast-field cardinality set correctly.

What's Changed

  • Improve schema validations and Query logic in #68
  • Implement schema conversion to returned docs and general improvements in #69
  • Cleanup query info, hint and add synonym support in #71
  • Altered the way fast-fuzzy queries are produced and scored in #95
  • Moved from a custom fork of SymSpell to dedicated Compose repo #96
  • Added a fields attribute for fuzzy queries allowing for selective field searches, as part of #93

Full Changelog: 0.8.1...0.9.0

⚠️ Version 0.9.0-beta

23 Jan 23:15
79903fe
Pre-release

Version 0.9.0-beta

This is a beta version/pre-release of version 0.9.0 so that people are able to use and test some of the nicer quality-of-life changes, like the new schema conversion, saner defaulting behaviour and synonym support.

This release won't have any documentation to go with it directly as technically it's a pre-release. That being said, this is a breaking release and will require you to re-index your data and re-create schemas.

What's new

  • Synonym support is now added! Finally! My life is complete! I can retire! On a more serious note, yes, it's added and can be adjusted using the /indexes/:index/synonyms endpoint via POST, GET and DELETE requests. There's also a /indexes/:index/synonyms/clear DELETE endpoint which allows you to clear all synonyms. The syntax for adding synonyms is a semi-relational structure where you provide a list of strings in the format <word>,<word>:<synonym>,<synonym>,<synonym>, which sets all the words on the left of the : to have the given synonyms. This allows you to define synonyms fairly easily for related words, e.g. iphone,apple,phone:apple,phone,iphone
  • All loaded stop words can be viewed via the /indexes/:index/stopwords GET endpoint.
  • Returned documents are now converted to be in line with the defined schema, i.e. fields with multi set to false will be returned as single values, and if no value is set the field will be returned as null rather than being omitted entirely.
  • You can now mark fields as required, which will cause lnx to reject any documents that are missing required fields (this will also reject fields that are provided but have empty values, i.e. "foo": []).

What's different

  • Fields now have a multi attribute to set them as being multi-value. If they're not multi-value but multiple values are provided, the system will take the last value in the provided array.
  • The fast attribute for fields is now a bool rather than single/multi. It was generally a bit confusing for users to know when they wanted single and when they wanted multi, so this is now handled internally and you only have to say whether you want it to be a fast field or not.

What's fixed

  • lnx will now return an error if you try to sort by a multi-value field. Previously this caused a panic if you had the fast-field cardinality set correctly.

What's Changed

  • Improve schema validations and Query logic in #68
  • Implement schema conversion to returned docs and general improvements in #69
  • Cleanup query info, hint and add synonym support in #71

Full Changelog: 0.8.1...0.9.0-beta

πŸ›Version 0.8.1

18 Jan 09:34
8d00cbc

Version 0.8.1

Fixes

  • Server not responding to/shutting down correctly on SIGTERM and SIGQUIT signals.

🚀 Version 0.8.0

12 Jan 20:30
78437cb

Version 0.8.0

Version 0.8.0 brings a considerable set of improvements, making it the best version to use for production applications. It includes several bug fixes, logging improvements and debugging measures, and addresses some of the issues surrounding fast-fuzzy.

What's new

  • Local docs removed. We no longer serve the local openapi copy of the docs on the /docs endpoint. It became quite a burden to maintain and keep up to date in two places rather than just redirecting to https://docs.lnx.rs, which now supports previous versions via the ?version=<major>.<minor> flag, e.g. https://docs.lnx.rs?version=0.8
  • Queries now follow the type: { <context> } pattern for payloads rather than having a base value field and then having each query kind inconsistently require additional context. See below for an example.
  • Delete by query mode, this will delete the matched documents based on your search query. This respects the limits and offsets you give the query so to delete all you may need to send several requests.
  • Delete specific document endpoint, this allows you to delete a document via DELETE /indexes/:index/documents/:document_id.
  • Indexes are now backed by sled for index metadata storage. This is incredibly useful going forward, ensuring atomic behaviour when writing to and from disk with the fast-fuzzy spell correction system. This is a breaking change; however, the data sub-folder of this new directory is a fully Tantivy-compatible index. Theoretically, you can mount a custom program directly to this folder in order to perform actions.
  • Fast-fuzzy garbage collection, this means that the frequency dictionaries are adjusted again when deleting documents which should prevent potential relevancy loss when working with a system that has a high update rate. The adjustments are only made when calling commit or auto-commit runs the operation.
  • Tracing information, we have moved from log to tracing providing significantly more information for debugging and profiling in future releases, most notably this gives us the ability to add OpenTelemetry tracing later on.
  • JSON log files (--json-logs) lnx can now produce line-by-line json logs where each object is a new log event. This is an amazing addition for anyone using an ingestion system or wanting to parse the logs.
  • Pretty logs (--pretty-logs) for the extra flamboyant users. This makes reading the logs much easier / prettier but at the cost of taking up quite a few more lines per event. Looks nice though.
  • Verbose logs (--verbose-logs) adds additional metadata to each log including thread-name and thread-id.
  • Disable ANSI colours (--disable-asni-logs). This is mostly required for logging to files or ingestion systems. Doesn't look as nice though without the colours :(
  • Logging Directory (--log-directory <dir>) This replaces the --log-file attribute and instead produces hourly log files in the directory, formatted using the aforementioned flags.
  • RUST_LOG env var support. For more control over what to log and what not to log you can directly adjust the RUST_LOG env var which will override the log-level flag or env var. By default lnx will set this to be <log-level-flag>,compress=off,tantivy=info
  • Snapshot support. Snapshots are now supported; this is essentially a wrapper around zipping the index files up and unzipping them again. You can set an automatic snapshot with the --snapshot-interval <interval> flag, take a single snapshot with the --snapshot subcommand, adjust the output directory with the --snapshot-directory <dir> flag and finally load snapshots with the --load-snapshot <file> flag. Each snapshot is generated with the name format snapshot-<utc-timestamp>-lnx-v<lnx-version>. It's important to note that older snapshots may not be compatible with future lnx releases, although this should be avoided between most versions.
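
To make the RUST_LOG behaviour above a little more concrete, here is a minimal Python sketch of the precedence it describes: an explicit RUST_LOG value wins, otherwise the default filter is built from the log level. The function name and the `env` parameter are hypothetical and only exist to keep the example testable; lnx itself is written in Rust and its real implementation may differ.

```python
import os


def resolve_log_filter(log_level: str, env=None) -> str:
    """Return the effective log filter string.

    Hypothetical helper: mirrors the described precedence, not lnx's code.
    """
    env = os.environ if env is None else env
    override = env.get("RUST_LOG")
    if override:
        # An explicit RUST_LOG overrides the log-level flag or env var.
        return override
    # Default described in the release notes.
    return f"{log_level},compress=off,tantivy=info"


default_filter = resolve_log_filter("info", env={})
custom_filter = resolve_log_filter("info", env={"RUST_LOG": "lnx=debug"})
```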

What's been fixed

  • lnx now handles interrupt signals better and correctly cleans up indexes more reliably, this should prevent dangling locks in future.

What's been removed

  • --log-file has been removed in favour of --log-directory powered by tracing.
  • The storage type memory has been deprecated/downgraded; the system will now treat it the same as tempdir until it's removed in a future version. This came out of the reasoning that realistically there's not much difference between the two, other than tempdir being more reliable on bigger indexes as it allows the OS to page in and out of disk.
  • The original --pretty-logs flag has been changed from disabling ANSI colours to enabling pretty logging. This used to be pretty unintuitive, but now it actually does what the name suggests.

New Query Pattern

Before, your query style would be:

{
  "query": {
    "value": "foo",
    "kind": "fuzzy"
  }
}

Now it's:

{
  "query": {
    "fuzzy": { "ctx": "foo" }
  }
}

For more info see https://docs.lnx.rs/?version=0.8#tag/Run-searches
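
If you have existing clients that build the old payload shape, the rewrite above can be sketched as a small migration helper. This is an illustrative Python sketch with a hypothetical function name; it only covers the value/kind -> kind: { ctx } mapping shown in the example, and other query kinds may need additional context keys that are not covered here.

```python
def migrate_query(payload: dict) -> dict:
    """Rewrite a pre-0.8 style query payload into the 0.8 pattern.

    Hypothetical helper based only on the fuzzy example in these notes.
    """
    old_query = payload["query"]
    if "kind" not in old_query or "value" not in old_query:
        # Already in the new shape (or something this sketch doesn't know).
        return payload

    migrated = dict(payload)  # shallow copy so the input is left untouched
    migrated["query"] = {old_query["kind"]: {"ctx": old_query["value"]}}
    return migrated


old_payload = {"query": {"value": "foo", "kind": "fuzzy"}}
new_payload = migrate_query(old_payload)
```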

🛠️ Version 0.7.1

20 Dec 14:39
c27799c

Version 0.7.1

This adds no new features but does fix the docs not loading the new openapi spec and also reduces memory consumption for the fast-fuzzy system by about 20-40% depending on index size.

🚀 Version 0.7.0

21 Nov 16:06
0c77ccf

Version 0.7.0

0.7 brings with it a lot of quality of life changes, bug fixes and features. Unlike previous releases, this has backwards compatibility with 0.6.x systems.

What's New

  • Sentence suggestion endpoint:
    This gives you the ability to suggest sentences based on the corpus data. This does not guarantee that the corrections will be correct according to the language; instead it corrects words to be in line with sentences within the corpus data itself. (This is not a Grammarly system)
  • Multi-Field term handling:
    This allows you to now specify multiple fields for a single term by passing an array of field names rather than a single string on the term query kind.
  • Sensible Defaults:
    This now allows you to skip the writer_threads and writer_buffer fields should you choose, and a sensible set of defaults will be calculated based on your current system's specs. This will typically allocate n threads, where n is the number of logical CPU cores, up to an absolute max of 8. The writer buffer is generally going to be 10% of your total memory or the bare minimum buffer size for the number of threads, whichever is higher.
  • New Allocator:
    We now use MiMalloc allocator which not only means performance consistency across operating systems and containers but also adds a slight boost to performance.
  • Auto Commit:
    You can now set an auto_commit value on an index, in seconds, determining how much time must elapse with no more operations being submitted before lnx automatically begins processing and committing documents. If this value is 0 (default), auto commit is disabled.
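
The "Sensible Defaults" rule above can be sketched as a short calculation. This Python sketch is illustrative only: the function name is hypothetical, and the per-thread minimum buffer size is an assumed placeholder value, since the notes only say "the bare minimum buffer size for the number of threads" without giving a number.

```python
MAX_WRITER_THREADS = 8
# Assumption: placeholder per-thread minimum, not a value taken from lnx.
MIN_BUFFER_PER_THREAD = 3_000_000


def default_writer_settings(logical_cores: int, total_memory_bytes: int) -> tuple[int, int]:
    """Return (writer_threads, writer_buffer) following the described rule."""
    # One thread per logical core, capped at 8 (and never fewer than 1).
    threads = max(1, min(logical_cores, MAX_WRITER_THREADS))
    # 10% of total memory, or the minimum for this thread count if higher.
    buffer_size = max(total_memory_bytes // 10, threads * MIN_BUFFER_PER_THREAD)
    return threads, buffer_size


big_machine = default_writer_settings(16, 16_000_000_000)
tiny_machine = default_writer_settings(2, 10_000_000)
```

On the small machine in the example, 10% of memory falls below the assumed per-thread minimum, so the minimum wins.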

What's Fixed

  • Index frequencies now save properly:
    The bug where index frequencies were not correctly being persisted to and from disk has been fixed.
  • lnx no longer uses 100% of one core per index:
    Previously, every index (or more specifically, every writer-actor) would max out its thread due to an infinite loop of checking the channels without blocking.

Full Changelog: 0.6.2...0.7.0

πŸ› 0.6.2 - Auth Fixes

13 Oct 21:10

Fixes whitelisting indexes to given index tokens.

Previously the system just completely ignored the field, which is reasonably insecure. In 0.6.2 this is now fixed and any unauthorized token will be rejected with a 401.

πŸ› 0.6.1 - Minor changes

12 Oct 21:10

This patch fixes the 422 status not presenting itself on a validation error and reformats the codebase. This should have been in 0.6.0 but I was a bit too trigger happy with the publish button.

No other changes have been made

🚀 Version 0.6.0

12 Oct 20:46
aaa70f6

Version 0.6.0

0.6 is the biggest update we've released since the initial launch of 0.1. It brings with it a complete redesign of the engine, fast fuzzy system and server design.
This has vastly improved the performance and, most importantly, the maintainability of the codebase; before 0.6 it was starting to show issues of too many things doing the same thing in different places.
We also took the opportunity during the redesign to cut out several dependencies and requirements, e.g. sqlx with sqlite3 and axum. While they worked fine, they added a lot of dependencies for a setup we didn't need them for, so they were dropped in favour of more lightweight or existing solutions (axum was replaced with just hyper + routerify, and sqlx was replaced with tantivy's inbuilt storage system).

What's New?

The new engine brings many many breaking changes, but we believe they're worth it!

  • Facet fields - hierarchical facets are now supported and can be added via the facet field type and accessed like a file path, e.g. /tools/hammers, via a Term query.
  • Term queries - These work similar to the Normal mode except that they are not fed through the parser so query values will be treated as literal strings. This is especially useful for the new facet fields.
  • Combination queries - You can now construct any combination of query kinds and adjust if they should, must or must not appear in matched documents etc... This allows you to truly create any range of queries you need. I cannot stress enough how awesome this feature is after playing around with it in testing.
  • Forgiving values - Lnx will now attempt to convert values into their required type if possible before rejecting a request so things like "3" become 3 for integer fields and DateTime can be converted from a UTC timestamp or formatted string.
  • Single or Multi-value inputs - Lnx now supports the ability to apply operations with a single or multiple values, this includes things like delete queries which now take all fields into consideration (although these are treated as an OR not an AND)
  • Reversible results - You can now order your results in ascending or descending order.
  • Togglable search request logging - We understand that not everyone wants to completely disable info logging just to stop logging every search request so we've made it an optional flag to pass (--silent-search)
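
The "Forgiving values" behaviour above can be pictured with a small coercion routine. This is a Python sketch under stated assumptions: the type names ("int", "date"), the function name, and the use of ISO 8601 for "formatted string" are all illustrative choices, not lnx's actual type names or accepted date formats.

```python
from datetime import datetime, timezone


def coerce_value(value, field_type: str):
    """Try to convert `value` to the field's type before rejecting it.

    Hypothetical helper; raises ValueError when no conversion applies.
    """
    if field_type == "int":
        if isinstance(value, bool):
            raise ValueError("booleans are not accepted as integers")
        if isinstance(value, int):
            return value
        if isinstance(value, str):
            return int(value.strip())  # "3" -> 3, raises ValueError on "abc"
        raise ValueError(f"cannot convert {value!r} to an integer")

    if field_type == "date":
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            # Treat numbers as UTC timestamps in seconds.
            return datetime.fromtimestamp(value, tz=timezone.utc)
        if isinstance(value, str):
            # Assumption: ISO 8601 stands in for "formatted string".
            return datetime.fromisoformat(value)
        raise ValueError(f"cannot convert {value!r} to a datetime")

    raise ValueError(f"unknown field type {field_type!r}")


as_int = coerce_value("3", "int")
as_date = coerce_value(0, "date")
```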

What's Changed?

It's not just new things being added! We've overhauled the existing designs as well!

  • Queries moved from GET -> POST - Queries are now done via POST requests as to allow for the new combination query system.
  • Deletes now support multi-field and multi-value options - You can now delete by several fields and values at once rather than doing many individual requests.
  • Bulk insertion optimisations - Bulk documents are no longer handled one by one internally in channels which allows for mild performance improvements for large payloads.
  • Unit tests - Yes that's right, we actually have some now! And more on the way. This should hopefully help get us closer to release and make sure everything is running smoother.
  • Smaller binary size - Docker images and the like now come in at almost 3x smaller sizes (Only 3.4MB!)
  • Changeable stop words - You can now change what stop words are used on a per-index basis; otherwise a sensible multi-language set of defaults is used.
  • Fast fuzzy no longer uses pre-set dicts - This is a major change for our fast fuzzy system. Using the document frequencies instead of pre-setting defaults has reduced memory usage from a constant 1.3 - 2GB of memory to a couple of hundred MB on large indexes. (NOTE: This will increase as more documents with unique words are added, so it is a good idea to reload the index and re-upload docs every so often; this will lower resource usage and also improve relevancy)
  • More in-depth permissions - Permissions have been moved to bit fields which allow for some more fine-grain control of access.
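
For readers unfamiliar with bit-field permissions, the idea can be sketched as below. The flag names and bit positions are purely hypothetical (the notes do not list lnx's actual permission flags); the sketch only shows how combining and checking permissions works when each one occupies its own bit.

```python
from enum import IntFlag


class Permission(IntFlag):
    # Hypothetical flags for illustration; not lnx's real permission set.
    SEARCH = 1 << 0
    MODIFY_DOCUMENTS = 1 << 1
    MODIFY_ENGINE = 1 << 2
    MODIFY_AUTH = 1 << 3


def is_allowed(granted: int, required: Permission) -> bool:
    """True only if every bit in `required` is also set in `granted`."""
    return (granted & required) == required


# A token that may search and modify documents, but nothing else.
token_permissions = Permission.SEARCH | Permission.MODIFY_DOCUMENTS
```

Because each permission is a single bit, a token's access can be stored as one integer and checked with a bitwise AND.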

What's Gone?

Alas, not everything has stayed the same, with some of the framework changes some things were removed.

  • TLS support. Ultimately it was decided that Lnx should be behind a reverse proxy anyway or used internally, so TLS does not serve much purpose.

Full Changelog: 0.5.0...0.6.0