Releases: Smile-SA/elasticsuite
2.10.20
Main features
Ability to ignore leading, trailing and consecutive zeroes in SKUs
New settings are available in a new section of the Store Configuration in Stores > Configuration > Elasticsuite > Analyzers Settings > Reference analyzer configuration
- Remove leading zeroes of numeric parts
- Remove trailing zeroes of numeric parts
- Reduce series of contiguous zeroes in numeric parts
Those new settings (disabled by default) apply to fields using the reference analyzer, which is dedicated to searchable fields containing SKUs, UPCs or manufacturer part numbers.
Depending on your SKUs or part number schemes and the way your users search for them, it could help them find your products.
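To give an idea of what the three options do to a reference, here is a minimal Python sketch. It is only an illustration: the real behavior is implemented by the reference analyzer at index and search time, and the function name, the order in which the rules are applied and the handling of all-zero parts are assumptions made for this example.

```python
import re


def normalize_numeric_parts(sku, strip_leading=True, strip_trailing=False,
                            reduce_contiguous=False):
    """Apply the three zero-handling rules to every run of digits in a SKU.

    Hypothetical helper: rule order and edge cases are assumptions.
    """
    def rewrite(match):
        digits = match.group(0)
        if reduce_contiguous:
            # Collapse any series of two or more zeroes into a single zero.
            digits = re.sub(r"0{2,}", "0", digits)
        if strip_leading:
            digits = digits.lstrip("0")
        if strip_trailing:
            digits = digits.rstrip("0")
        # Assumption: keep one zero when the whole part was made of zeroes.
        return digits or "0"

    return re.sub(r"\d+", rewrite, sku)


# "AB-00123" -> "AB-123" with only the leading zeroes rule enabled
print(normalize_numeric_parts("AB-00123"))
```

With such a normalization applied on both sides, a user typing AB-123 and a catalog storing AB-00123 would end up comparing the same value.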
Please be aware that:
- any change to those three new settings will require a full catalogsearch_fulltext reindex to take effect
- those settings will fully work only if you also enable the following experimental settings dedicated to "SKU search":
  - Elasticsuite > Search Relevance > Spellchecking Configuration > Term Vectors Configuration > [Experimental] Use reference analyzer in term vectors
  - (possibly) Elasticsuite > Search Relevance > Spellchecking Configuration > Term Vectors Configuration > [Experimental] Use all tokens from term vectors
  - Elasticsuite > Search Relevance > Relevance Configuration > Exact Match Configuration > [Experimental] Use default analyzer in exact matching filter query
Admin notification for non-indexed tracker events
A new admin notification will pop up in the "Elasticsuite > Analytics > Search Usage" page when the tracker events storage table contains events created more than 6 hours ago, as a way to indicate to you a possible reason for behavioral data missing from the dashboard.
A new setting available under Stores > Configuration > Elasticsuite > Analytics > Pending events configuration > Hours before warning lets you control the timespan between a tracker event being recorded in the database and the admin notification warning that it has not been indexed yet. You could:
- increase that timespan on a non-production environment where you almost never run the cronjobs
- reduce that timespan on a traffic-heavy production environment where you do not want tracker events to accumulate for too long
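The check behind that notification can be pictured with the following Python sketch. It is not the module's code: the function name and the list of creation dates are hypothetical, and the default of 6 simply mirrors the figure quoted above.

```python
from datetime import datetime, timedelta, timezone


def has_stale_events(created_at_dates, hours_before_warning=6, now=None):
    """Return True if at least one pending event is older than the threshold.

    Hypothetical helper: `created_at_dates` stands for the creation dates of
    the events still waiting in the tracker events storage table.
    """
    now = now or datetime.now(timezone.utc)
    threshold = now - timedelta(hours=hours_before_warning)
    return any(created < threshold for created in created_at_dates)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
pending = [now - timedelta(hours=7), now - timedelta(minutes=30)]
print(has_stale_events(pending, now=now))                           # True
print(has_stale_events(pending, hours_before_warning=24, now=now))  # False
```

Raising the value to 24 silences the warning for the same pending events, which is the kind of tuning suggested for environments where the cronjobs rarely run.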
📦 Features
- [Analysis] Allow ignoring leading, trailing and consecutive zeroes in… by @rbayet in #3243
- [Analytics] Notice/warning about old events left in the events queue … by @rbayet in #3226
- [Core] Server distr. and version in admin footer info by @rbayet in #3225
- [Search] Use case insensitive query for span matching. by @romainruaud in #3240
🐞 Fixes
- [GraphQL] Fix #3231 allow fetching products with category_uid IN clauses. by @romainruaud in #3241
- [VirtualCategories] Fix #3234 Prevent to cache null query by @PierreGauthier in #3237
🧰 Quality & Tools
- [Tools] Updating Quality workflow to node16 by @rbayet in #3227
- [Tools] Updating PHPStan/Integration workflows to node16 by @rbayet in #3228
Full Changelog: 2.10.19.3...2.10.20
2.11.5.3
Emergency maintenance release
Releases 2.10.19.2 and 2.11.5.2 published yesterday can cause an issue if you minify your JS sources, preventing JS scripts from working in Magento's admin area.
Thanks @thomas-kl1 for the quick reaction!
🐛 Fixes
- [Quality] Remove invalid characters in new JS files by @thomas-kl1 in #3222
Full Changelog: 2.11.5.2...2.11.5.3
2.10.19.3
Emergency maintenance release
Releases 2.10.19.2 and 2.11.5.2 published yesterday can cause an issue if you minify your JS sources, preventing JS scripts from working in Magento's admin area.
Thanks @thomas-kl1 for the quick reaction!
🐛 Fixes
- [Quality] Remove invalid characters in new JS files by @thomas-kl1 in #3222
Full Changelog: 2.10.19.2...2.10.19.3
2.11.5.2
Main features
Additional KPIs in Analytics > Search Usage
KPIs about category views, product views, products added to cart and sales events are now extracted from your behavioral data and visible in the "Analytics > Search Usage" dashboard.
Removal of the hardcoded 'sku' field from exact queries filtering part
Historically, when filtering (not scoring!) results, exact match queries always targeted at least the search collector field (containing all searchable attributes content) and the sku field, whatever search weight was assigned to the sku field.
As
- on one hand, we introduced back in releases 2.10.17 and 2.11.3 (experimental) settings to specifically target, in that filtering query part, all fields using the reference search analyzer (used by the sku attribute by default)
- on the other hand, you might be in a situation where you do not want the sku to be searchable at all
we have decided to remove that hardcoded sku from the filtering part of exact queries.
Long story short, if your users frequently search by SKU and you're still using the reference analyzer for that attribute, we urge you to enable the following Search Relevance settings:
- Elasticsuite > Search Relevance > Spellchecking Configuration > Term Vectors Configuration > Use all tokens from term vectors
- Elasticsuite > Search Relevance > Spellchecking Configuration > Term Vectors Configuration > Use reference analyzer in term vectors
- Elasticsuite > Search Relevance > Relevance Configuration > Exact matching configuration > Use default analyzer in exact matching filter query
Ability to set custom number of primary shards and replicas by index type
If you're using a 3-node cluster in your production environment, you might have been tempted to set 3 primary shards and 2 replicas in the legacy Elasticsuite base settings.
While it is our opinion that having multiple primary shards is hardly necessary unless you really have a sizeable product index, you can now go this route while keeping usually light indices (categories, thesaurus, and tracker indices on low traffic sites) on a single primary shard.
To that end, the number of primary shards and replicas can now be defined per index type.
This will allow you to minimize the memory footprint per node of opened shards/indices in your cluster (the rule of thumb being that 1 GB of heap is required per node per 20 shards, just for keeping those indices open).
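As a back-of-the-envelope illustration of that rule of thumb, the Python sketch below compares two layouts on a 3-node cluster. The index counts are made up for the example, the helper names are hypothetical, and shards are assumed to be spread evenly across nodes.

```python
def shards_per_node(indices, nodes):
    """Average number of shards held by each node.

    `indices` is a list of (primary_shards, replicas) tuples, one per index.
    Assumes an even distribution of shards across the nodes.
    """
    total_shards = sum(primaries * (1 + replicas) for primaries, replicas in indices)
    return total_shards / nodes


def heap_gb_estimate(shards_on_node, shards_per_gb=20):
    """Heap needed on a node, using the 1 GB per 20 shards rule of thumb."""
    return shards_on_node / shards_per_gb


# Hypothetical cluster: 10 indices, all using 3 primary shards and 2 replicas.
uniform = [(3, 2)] * 10
# Same cluster, but only the 2 product indices keep 3 primary shards.
per_type = [(3, 2)] * 2 + [(1, 2)] * 8

for layout in (uniform, per_type):
    on_node = shards_per_node(layout, nodes=3)
    print(on_node, heap_gb_estimate(on_node))
```

Under these assumptions the uniform layout holds 30 shards per node (about 1.5 GB of heap) while the per-type layout holds 14 (about 0.7 GB).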
Admin notifications about invalid tracker events and automated removal
Releases 2.10.18.3 and 2.11.4.3 introduced some protections to avoid indexing invalid tracker events missing unique visitor or session ids into the behavioral data indices (as well as CLI tools to remove those already indexed).
Those invalid tracker events were left in the Elasticsuite tracker DB table with a flag is_invalid = 1.
This release introduces a cronjob task that will automatically and periodically remove them and a notification system to warn you about their existence (if you want to investigate the potential issue in your tracker tags).
You also have the ability to remove all of them at once.
📦 Features
- [Analytics] Additional product/category/sales KPIs by @rbayet in #3214
- [Config] Make getIndicesSettingsConfigParam public. by @rbayet in #3195
- [Core] Feature custom number of shards replicas by @romainruaud in #3185
- [Core] Append default CONFIG cache tag to ES configuration and also a generic ES cache tag by @romainruaud in #3191
- [Core] Enable Elasticsuite ES cache tag automatically. by @romainruaud in #3197
- [Cache] Add cache on search query building by @PierreGauthier in #3193 and #3213
- [GraphQL] Add suggestions in GraphQl results by @rbayet in #3053
- [LayeredNavigation] Optimize "see more" action on navigation filter by @thomas-kl1 in #3201
- [Search] Allow configurable msm for fuzzy search by @rbayet in #3196
- [Search] Remove hardcoded SKU from query builder since new options should allow to explicitely target it by @romainruaud in #3198
- [Search] Introducing knn query by @romainruaud in #3175
- [Search] Adding knn field config by @romainruaud in #3176
- [Tracker] Automated removal of queued invalid events by @rbayet in #3204
🐛 Fixes
- [Optimizer] Clean Applier Definition by @PierreGauthier in #3181
- [Spellchecker] Fix multi index by alias error by @PierreGauthier in #3199
- [Tracker] Ensuring tracker provided store_id is numeric. by @romainruaud in #3187
- [VirtualCategories] Add isActiveFilter to prevent disabled categories from being found by @wahidnory in #3159
🧰 Quality
- [Quality] Spellchecker unit tests by @rbayet in #3207
- [Quality] Allow own mapping impl. by using interface instead of mapping directly by @mvenghaus in #3170
- [Quality] M2 2.4.7 compatiblity: Adding missing parameter into constructor by @romainruaud in #3180
New Contributors
- @mvenghaus made their first contribution in #3170
- @wahidnory made their first contribution in #3159
- @thomas-kl1 made their first contribution in #3201
Full Changelog: 2.11.5.1...2.11.5.2
2.10.19.2
Main features
Additional KPIs in Analytics > Search Usage
KPIs about category views, product views, products added to cart and sales events are now extracted from your behavioral data and visible in the "Analytics > Search Usage" dashboard.
Removal of the hardcoded 'sku' field from exact queries filtering part
Historically, when filtering (not scoring!) results, exact match queries always targeted at least the search collector field (containing all searchable attributes content) and the sku field, whatever search weight was assigned to the sku field.
As
- on one hand, we introduced back in releases 2.10.17 and 2.11.3 (experimental) settings to specifically target, in that filtering query part, all fields using the reference search analyzer (used by the sku attribute by default)
- on the other hand, you might be in a situation where you do not want the sku to be searchable at all
we have decided to remove that hardcoded sku from the filtering part of exact queries.
Long story short, if your users frequently search by SKU and you're still using the reference analyzer for that attribute, we urge you to enable the following Search Relevance settings:
- Elasticsuite > Search Relevance > Spellchecking Configuration > Term Vectors Configuration > Use all tokens from term vectors
- Elasticsuite > Search Relevance > Spellchecking Configuration > Term Vectors Configuration > Use reference analyzer in term vectors
- Elasticsuite > Search Relevance > Relevance Configuration > Exact matching configuration > Use default analyzer in exact matching filter query
Ability to set custom number of primary shards and replicas by index type
If you're using a 3-node cluster in your production environment, you might have been tempted to set 3 primary shards and 2 replicas in the legacy Elasticsuite base settings.
While it is our opinion that having multiple primary shards is hardly necessary unless you really have a sizeable product index, you can now go this route while keeping usually light indices (categories, thesaurus, and tracker indices on low traffic sites) on a single primary shard.
To that end, the number of primary shards and replicas can now be defined per index type.
This will allow you to minimize the memory footprint per node of opened shards/indices in your cluster (the rule of thumb being that 1 GB of heap is required per node per 20 shards, just for keeping those indices open).
Admin notifications about invalid tracker events and automated removal
Releases 2.10.18.3 and 2.11.4.3 introduced some protections to avoid indexing invalid tracker events missing unique visitor or session ids into the behavioral data indices (as well as CLI tools to remove those already indexed).
Those invalid tracker events were left in the Elasticsuite tracker DB table with a flag is_invalid = 1.
This release introduces a cronjob task that will automatically and periodically remove them and a notification system to warn you about their existence (if you want to investigate the potential issue in your tracker tags).
You also have the ability to remove all of them at once.
📦 Features
- [Analytics] Additional product/category/sales KPIs by @rbayet in #3214
- [Config] Make getIndicesSettingsConfigParam public. by @rbayet in #3195
- [Core] Feature custom number of shards replicas by @romainruaud in #3185
- [Core] Append default CONFIG cache tag to ES configuration and also a generic ES cache tag by @romainruaud in #3191
- [Core] Enable Elasticsuite ES cache tag automatically. by @romainruaud in #3197
- [Cache] Add cache on search query building by @PierreGauthier in #3193 and #3213
- [Search] Allow configurable msm for fuzzy search by @rbayet in #3196
- [Search] Remove hardcoded SKU from query builder since new options should allow to explicitely target it by @romainruaud in #3198
- [Tracker] Automated removal of queued invalid events by @rbayet in #3204
🐛 Fixes
- [Optimizer] Clean Applier Definition by @PierreGauthier in #3181
- [Spellchecker] Fix multi index by alias error by @PierreGauthier in #3199
- [Tracker] Ensuring tracker provided store_id is numeric. by @romainruaud in #3187
🧰 Quality
Full Changelog: 2.10.19.1...2.10.19.2
2.11.5.1
Main changes
We fixed the elision/contraction management for French, Italian and Catalan when indexing/searching content.
Historically the text analyzers for those languages were supposed to ignore the elision/contraction of some articles or pronouns but didn't due to an improper order of text transformation components.
This could be problematic if you had a Minimum Should Match lower than 100%.
For instance:
- in French: l'avion (mandatory form of "le avion" - "the plane") was supposed to be indexed as avion (plane) but was still indexed as l avion, so searching for l'avion bleu vole could match l'écureuil bleu vole with a 75% minimum should match (matches on l, bleu and vole)
- in Italian: comedia dell'arte was supposed to be indexed as comedia arte but was indexed as comedia dell arte
- in Catalan: l'avi (the grandfather) was supposed to be indexed as avi (grandfather) but was indexed as l avi
This release fixes that issue and the following elisions/contractions will now be removed as intended:
- for French: l, m, t, qu, n, s, j, d, c as defined in that language specific elision filter configuration
- for Italian: c, l, all, dall, dell, nell, sull, coll, pell, gl, agl, dagl, degl, negl, sugl, un, m, t, s, v, d as defined in that language specific elision filter configuration
- for Catalan: d, l, m, n, s, t as defined in that language specific elision filter configuration
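The French example can be replayed with a small Python sketch. It is a simplified stand-in for the analysis chain, not the actual Elasticsearch elision filter: the tokenizer is a plain whitespace split, only the straight apostrophe is handled, and the function names are hypothetical. The elision list is the French one given above.

```python
FRENCH_ELISIONS = {"l", "m", "t", "qu", "n", "s", "j", "d", "c"}


def tokenize(text, elisions=None):
    """Split text into lowercase tokens.

    With `elisions` set, an elided article or pronoun before an apostrophe is
    dropped (intended behavior). Without it, the apostrophe only splits the
    word, leaving the elided part as a token of its own (previous behavior).
    """
    tokens = []
    for word in text.lower().split():
        prefix, apostrophe, rest = word.partition("'")
        if apostrophe and elisions is not None and prefix in elisions:
            tokens.append(rest)
        elif apostrophe:
            tokens.extend([prefix, rest])
        else:
            tokens.append(word)
    return tokens


def shared_terms(query, document, elisions=None):
    """Terms of the query that are also found in the document."""
    return set(tokenize(query, elisions)) & set(tokenize(document, elisions))


query, document = "l'avion bleu vole", "l'écureuil bleu vole"
print(tokenize(query))                                        # ['l', 'avion', 'bleu', 'vole']
print(tokenize(query, FRENCH_ELISIONS))                       # ['avion', 'bleu', 'vole']
print(len(shared_terms(query, document)))                     # 3
print(len(shared_terms(query, document, FRENCH_ELISIONS)))    # 2
```

Before the fix, three of the four query terms are found in the unrelated document; once the elision is removed, the spurious l token no longer contributes a match.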
Should these changes affect you negatively, please be aware that you can override those settings (like all of those in the elasticsuite_analysis.xml configuration file) in a custom module.
Feel free to let us know as well, especially about the Italian and Catalan configuration, since we are not native speakers and are just following official Elasticsearch recommendations.
📦 Features
- [Community] Integrating HS form in dashboard. by @romainruaud in #3155
- [Thesaurus] Feature #3115, add a warning message for existing terms in other thesaurus by @vahonc in #3128
🐛 Fixes
- [Analytics] Exclude search usage chart type from i18n by @rbayet in #3161
- [Search] Fix french stemmer on "clef/clefs" and better elision management by @PierreGauthier in #3156
Full Changelog: 2.11.5...2.11.5.1
2.10.19.1
Main changes
We fixed the elision/contraction management for French, Italian and Catalan when indexing/searching content.
Historically the text analyzers for those languages were supposed to ignore the elision/contraction of some articles or pronouns but didn't due to an improper order of text transformation components.
This could be problematic if you had a Minimum Should Match lower than 100%.
For instance:
- in French: l'avion (mandatory form of "le avion" - "the plane") was supposed to be indexed as avion (plane) but was still indexed as l avion, so searching for l'avion bleu vole could match l'écureuil bleu vole with a 75% minimum should match (matches on l, bleu and vole)
- in Italian: comedia dell'arte was supposed to be indexed as comedia arte but was indexed as comedia dell arte
- in Catalan: l'avi (the grandfather) was supposed to be indexed as avi (grandfather) but was indexed as l avi
This release fixes that issue and the following elisions/contractions will now be removed as intended:
- for French: l, m, t, qu, n, s, j, d, c as defined in that language specific elision filter configuration
- for Italian: c, l, all, dall, dell, nell, sull, coll, pell, gl, agl, dagl, degl, negl, sugl, un, m, t, s, v, d as defined in that language specific elision filter configuration
- for Catalan: d, l, m, n, s, t as defined in that language specific elision filter configuration
Should these changes affect you negatively, please be aware that you can override those settings (like all of those in the elasticsuite_analysis.xml configuration file) in a custom module.
Feel free to let us know as well, especially about the Italian and Catalan configuration, since we are not native speakers and are just following official Elasticsearch recommendations.
📦 Features
- [Community] Integrating HS form in dashboard. by @romainruaud in #3155
- [Thesaurus] Feature #3115, add a warning message for existing terms in other thesaurus by @vahonc in #3128
🐛 Fixes
- [Analytics] Exclude search usage chart type from i18n by @rbayet in #3161
- [Search] Fix french stemmer on "clef/clefs" and better elision management by @PierreGauthier in #3156
Full Changelog: 2.10.19...2.10.19.1
2.11.5
Main feature
Due to breaking changes in the Elasticsearch PHP client on one side and, on the other side, our commitment for the 2.11.x releases to remain compatible with both Elasticsearch 7 and Elasticsearch 8 while ensuring compatibility with a growing installation base of Magento sites using OpenSearch, we've decided to switch to the OpenSearch PHP client, which addresses all flavours (ES 7, ES 8, OS 1, OS 2) seamlessly.
📦 Features
- [Core] Replace the Elasticsearch client by the Opensearch client. by @romainruaud in #3131
- [Configuration] update of BO comment LU-140 related to SSL certificate validation and OpenSearch by @gabrielLumao in #3152
🐛 Fixes
- [Analytics] Make the dashboard display handle long search terms by @rbayet in #3147
- [Layered Navigation] Fixes #3148 Replacing _term agg sort order by _key by @rbayet in #3150
Full Changelog: 2.11.4.3...2.11.5
2.10.19
2.11.4.3
Main new feature
Tools to check and fix invalid behavioral data
When the Elasticsuite tracker is enabled on a site with a custom theme or on a PWA frontend, it can happen that the data collection is not correctly performed, in particular that some events are registered with an undefined or 'null' user session identifier (tracker session.uid parameter).
In turn, those events coming from different visitors (tracker session.vid parameter) would still be collected as events of the same navigation session spanning several weeks or months, generating a document in the behavioral data session index with several hundreds or thousands of search terms, products added to cart, ordered, etc.
In the long run, it will slow down requests performed on the behavioral data indices, up to the point of generating 429 errors on your Elasticsearch/OpenSearch server and preventing you from using, for instance, the Elasticsuite Search Usage analytics dashboard.
This release contains both a fix to prevent the collection of those ill-formed events and two Magento commands to be able to check and fix your already indexed behavioral data.
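To make the notion of an ill-formed event more concrete, here is a minimal Python sketch of such a validity check. It is not the module's implementation: the nested dictionary shape and the function name are hypothetical, and treating the same values as invalid for session.vid as for session.uid is an assumption.

```python
# Values treated as "no identifier" in this sketch (assumption).
INVALID_IDS = {None, "", "null", "undefined"}


def is_valid_event(event):
    """Return True when the event carries usable session and visitor ids.

    `event` is a hypothetical dict such as
    {"session": {"uid": "...", "vid": "..."}, ...}.
    """
    session = event.get("session") or {}
    return (session.get("uid") not in INVALID_IDS
            and session.get("vid") not in INVALID_IDS)


events = [
    {"session": {"uid": "a1b2", "vid": "v9"}},
    {"session": {"uid": "null", "vid": "v9"}},
    {"session": {"vid": "v7"}},
]
print([is_valid_event(event) for event in events])  # [True, False, False]
```

Rejecting such events before indexing keeps unrelated visitors from being merged into one never-ending session document.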
Check the presence of invalid behavioral data
You can run the elasticsuite:tracker:check-data Magento command to scan your behavioral indices for invalid data.
It will scan the behavioral data indices of all the active Magento store views and report if there are any errors.
If there are errors for a given store, they will be reported, as seen below:
[Screenshot: output of the elasticsuite:tracker:check-data command reporting errors for a store]
Fix the invalid behavioral data.
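If the elasticsuite:tracker:check-data command reports errors, you can run the elasticsuite:tracker:fix-data command to fix the invalid data in your behavioral indices.
It will report what has been fixed.
For instance, on the same example as seen above, here are the results:
[Screenshot: output of the elasticsuite:tracker:fix-data command listing what was fixed]
If there was nothing to fix, the command has no effect:
[Screenshot: output of the elasticsuite:tracker:fix-data command when there is nothing to fix]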
Future
As for the system message that can pop up in your Magento admin interface when you have "Ghost Indices", we might in a future release make sure that admin users are made aware of existing invalid behavioral data without having to launch the elasticsuite:tracker:check-data Magento command.
📦 Features
- [Configuration] Added configuration for ssl verification by @gabrielLumao in #3127
- [Tracker] Tools to check and fix invalid behavioral data by @rbayet in #3122
- [Tracker] Prevent partial/invalid events to be indexed by @rbayet in #3111
🐛 Fixes
- Fixes #3137 [Analytics] PHP8 compatibility popular search terms w/ re… by @rbayet in #3138
- Fixes #3119 [GraphQL] allow drill-down in categories agg. of products… by @rbayet in #3120
- Fixes #3123 [Search Merchandizing] Show back hidden products in preview by @rbayet in #3124
- Fixes #3134 [Catalog] Typo in decimal layered navigation filter by @rbayet in #3141
- Fixes #3132 [Thesaurus] limit thesaurus rewriting loops by @rbayet in #3133
- Fixes #3132 [Thesaurus] Rewriting loops avoidance unit tests by @rbayet in #3145
- Fixes #2913 [Tracker] undefined array key date for elasticsuite tracker event index by @rbayet in #3112
- [Tracker] Remove 'domain' parameter from tracker by @romainruaud in 7bc0b3c
New Contributors
- @gabrielLumao made their first contribution in #3127
Full Changelog: 2.11.4.2...2.11.4.3