From e7ff90e38f5288cafd2de23486b3fb4f52e02812 Mon Sep 17 00:00:00 2001 From: Cassandra Targett Date: Thu, 3 Jun 2021 12:00:22 -0500 Subject: [PATCH] Ref Guide: typos, headline case, abbreviations, etc., for 8.9 --- .../src/basic-authentication-plugin.adoc | 2 +- .../src/cluster-node-management.adoc | 2 +- .../src/collapse-and-expand-results.adoc | 13 ++++++----- .../src/collection-management.adoc | 2 +- solr/solr-ref-guide/src/config-api.adoc | 4 ++-- solr/solr-ref-guide/src/configsets-api.adoc | 22 +++++++++---------- solr/solr-ref-guide/src/de-duplication.adoc | 10 ++++----- .../src/distributed-requests.adoc | 2 +- solr/solr-ref-guide/src/docvalues.adoc | 2 +- solr/solr-ref-guide/src/graph.adoc | 3 +-- .../src/indexing-nested-documents.adoc | 4 ++-- solr/solr-ref-guide/src/json-request-api.adoc | 4 +++- .../src/jwt-authentication-plugin.adoc | 6 ++--- .../src/making-and-restoring-backups.adoc | 8 +++---- solr/solr-ref-guide/src/other-parsers.adoc | 8 +++---- .../src/pagination-of-results.adoc | 2 +- solr/solr-ref-guide/src/reindexing.adoc | 2 +- solr/solr-ref-guide/src/resource-loading.adoc | 6 ++--- .../src/searching-nested-documents.adoc | 6 ++--- .../src/solr-upgrade-notes.adoc | 6 ++--- .../src/updating-parts-of-documents.adoc | 10 +++++---- 21 files changed, 65 insertions(+), 59 deletions(-) diff --git a/solr/solr-ref-guide/src/basic-authentication-plugin.adoc b/solr/solr-ref-guide/src/basic-authentication-plugin.adoc index 37d274360fdd..65573ebef08d 100644 --- a/solr/solr-ref-guide/src/basic-authentication-plugin.adoc +++ b/solr/solr-ref-guide/src/basic-authentication-plugin.adoc @@ -213,7 +213,7 @@ QueryResponse rsp = req.process(solrClient); While this is method is simple, it can often be inconvenient to ensure the credentials are provided everywhere they're needed. It also doesn't work with the many `SolrClient` methods which don't consume `SolrRequest` objects. 
-=== Per-Client credentials +=== Per-Client Credentials Http2SolrClient supports setting the credentials at the client level when building it. This will ensure all requests issued with this particular client get the Basic Authentication headers set. [source,java] diff --git a/solr/solr-ref-guide/src/cluster-node-management.adoc b/solr/solr-ref-guide/src/cluster-node-management.adoc index 7a13227bb648..baa1ea8ae03f 100644 --- a/solr/solr-ref-guide/src/cluster-node-management.adoc +++ b/solr/solr-ref-guide/src/cluster-node-management.adoc @@ -42,7 +42,7 @@ the percentage of active replicas (`active`): `RED`:: No active replicas *OR* there's no shard leader. -The collection health state is reported as the worst state of any shard, e.g. for a +The collection health state is reported as the worst state of any shard, e.g., for a collection with all shards GREEN except for one YELLOW the collection health will be reported as YELLOW. diff --git a/solr/solr-ref-guide/src/collapse-and-expand-results.adoc b/solr/solr-ref-guide/src/collapse-and-expand-results.adoc index 379136a8e3b0..6f32487a42c4 100644 --- a/solr/solr-ref-guide/src/collapse-and-expand-results.adoc +++ b/solr/solr-ref-guide/src/collapse-and-expand-results.adoc @@ -138,19 +138,19 @@ fq={!collapse cost=1000 field=group_field} === Block Collapsing -When collapsing on the `\_root_` field, using `nullPolicy=expand` or `nullPolicy=ignore`, the Collapsing Query Parser can take advantage of the fact that all docs with identical field values are adjacent to eachother in the index in a single <>. This allows the collapsing logic to be much faster and more memory efficient. +When collapsing on the `\_root_` field, using `nullPolicy=expand` or `nullPolicy=ignore`, the Collapsing Query Parser can take advantage of the fact that all docs with identical field values are adjacent to each other in the index in a single <>. This allows the collapsing logic to be much faster and more memory efficient. 
The default collapsing logic must keep track of all group head documents -- for all groups encountered so far -- until it has evaluated all documents, because each document it considers may become the new group head of any group. When collapsing on the `\_root_` field however, the logic knows that as it scans over the index, it will never encounter any new documents in a group that it has previously processed. -This more efficient logic can also be used with other `collapseField` values, via the `hint=block` local param. This can be useful when you have deeply nested documents and you'd like to collapse on a field that does not contain identical values for all documents with a common `\_root_` but is a unique & identical value for sets of contiguious documents under a common `\_root_`. For example: searching for "grand child" documents and collapsing on a field that is unique per "child document" +This more efficient logic can also be used with other `collapseField` values, via the `hint=block` local param. This can be useful when you have deeply nested documents and you'd like to collapse on a field that does not contain identical values for all documents with a common `\_root_` but is a unique and identical value for sets of contiguous documents under a common `\_root_`. For example: searching for "grand child" documents and collapsing on a field that is unique per "child document" [CAUTION] ==== -Specifing `hint=block` when collapsing on a field that is not unique per contiguious block of documents is not supported and may fail in unexpected ways; including the possibility of silently returning incorrect results. +Specifing `hint=block` when collapsing on a field that is not unique per contiguous block of documents is not supported and may fail in unexpected ways; including the possibility of silently returning incorrect results. 
-The implementation does not offer any safeguards against missuse on an unsupported field, since doing so would require the the same group level tracking as the non-Block collapsing implementation -- defeating the purpose of this optimization. +The implementation does not offer any safeguards against misuse on an unsupported field, since doing so would require the the same group level tracking as the non-Block collapsing implementation -- defeating the purpose of this optimization. ==== == Expand Component @@ -210,4 +210,7 @@ Overrides the main query (`q`), determines which documents to include in the mai Overrides main filter queries (`fq`), determines which documents to include in the main group. The default is to use the main filter queries. `expand.nullGroup`:: -Indicates if an expanded group can be returned containing documents with no value in the expanded field. This option only _enables_ support for returning a "null" expanded group: As with all expanded groups, it will only exist if the main group includes corresponding documents for it to expand (Via `collapse` using either `nullPolicy=collapse` or `nullPolicy=expand`; Or via `expand.q`) _and_ documents are found that belong in this expanded group. The default value is `false` +Indicates if an expanded group can be returned containing documents with no value in the expanded field. +This option only _enables_ support for returning a "null" expanded group. +As with all expanded groups, it will only exist if the main group includes corresponding documents for it to expand (via `collapse` using either `nullPolicy=collapse` or `nullPolicy=expand`; or via `expand.q`) _and_ documents are found that belong in this expanded group. +The default value is `false`. 
diff --git a/solr/solr-ref-guide/src/collection-management.adoc b/solr/solr-ref-guide/src/collection-management.adoc index a042dc5b56ce..1b2b55ee35c4 100644 --- a/solr/solr-ref-guide/src/collection-management.adoc +++ b/solr/solr-ref-guide/src/collection-management.adoc @@ -95,7 +95,7 @@ If this parameter is specified, the router will look at the value of the field i Please note that <> or retrieval by document ID would also require the parameter `\_route_` (or `shard.keys`) to avoid a distributed search. `perReplicaState`:: -If `true` the states of individual replicas will be maintained as individual child of the `state.json`. default is `false` +If `true` the states of individual replicas will be maintained as individual child of the `state.json`. The default is `false`. `property._name_=_value_`:: Set core property _name_ to _value_. See the section <> for details on supported properties and values. diff --git a/solr/solr-ref-guide/src/config-api.adoc b/solr/solr-ref-guide/src/config-api.adoc index 0661af52da0e..e9a74e3c7249 100644 --- a/solr/solr-ref-guide/src/config-api.adoc +++ b/solr/solr-ref-guide/src/config-api.adoc @@ -60,7 +60,7 @@ http://localhost:8983/api/collections/techproducts/config The response will be the Solr configuration resulting from merging settings in `configoverlay.json` with those in `solrconfig.xml`. -It's possible to restrict the returned config to a top-level section, such as, `query`, `requestHandler` or `updateHandler`. To do this, append the name of the section to the `config` endpoint. For example, to retrieve configuration for all request handlers: +It's possible to restrict the returned configuration to a top-level section, such as, `query`, `requestHandler` or `updateHandler`. To do this, append the name of the section to the `config` endpoint. 
For example, to retrieve configuration for all request handlers: [.dynamic-tabs] -- @@ -902,4 +902,4 @@ Any component can register a listener using: `SolrCore#addConfListener(Runnable listener)` -to get notified for config changes. This is not very useful if the files modified result in core reloads (i.e., `configoverlay.xml` or the schema). Components can use this to reload the files they are interested in. +to get notified for configuration changes. This is not very useful if the files modified result in core reloads (i.e., `configoverlay.xml` or the schema). Components can use this to reload the files they are interested in. diff --git a/solr/solr-ref-guide/src/configsets-api.adoc b/solr/solr-ref-guide/src/configsets-api.adoc index 5ccff8e4ee5e..ac492b1764fd 100644 --- a/solr/solr-ref-guide/src/configsets-api.adoc +++ b/solr/solr-ref-guide/src/configsets-api.adoc @@ -101,7 +101,7 @@ The configset to be created when the upload is complete. This parameter is requi `overwrite`:: If set to `true`, Solr will overwrite an existing configset with the same name (if false, the request will fail). -If `filePath` is provided, then this option specifies whether the specified file within the configSet should be overwritten if it already exists. +If `filePath` is provided, then this option specifies whether the specified file within the configset should be overwritten if it already exists. Default is `false` when using the v1 API, but `true` when using the v2 API. `cleanup`:: @@ -109,11 +109,11 @@ When overwriting an existing configset (`overwrite=true`), this parameter tells This parameter cannot be set to true when `filePath` is used. `filePath`:: -This parameter allows the uploading of a single, non-zipped file to the given path under the configSet in ZooKeeper. -This functionality respects the `overwrite` parameter, so a request will fail if the given file path already exists in the configSet and overwrite is set to `false`. 
+This parameter allows the uploading of a single, non-zipped file to the given path under the configset in ZooKeeper. +This functionality respects the `overwrite` parameter, so a request will fail if the given file path already exists in the configset and overwrite is set to `false`. The `cleanup` parameter cannot be set to true when `filePath` is used. -If uploading an entire configSet, the body of the request should be a zip file that contains the configset. The zip file must be created from within the `conf` directory (i.e., `solrconfig.xml` must be the top level entry in the zip file). +If uploading an entire configset, the body of the request should be a zip file that contains the configset. The zip file must be created from within the `conf` directory (i.e., `solrconfig.xml` must be the top level entry in the zip file). Here is an example on how to create the zip file named "myconfig.zip" and upload it as a configset named "myConfigSet": @@ -154,8 +154,8 @@ $ curl -X PUT --header "Content-Type:application/octet-stream" --data-binary @my "http://localhost:8983/api/cluster/configs/myConfigSet" ---- -With this REST API, the default behavior is to overwrite the configSet if it already exists. -This behavior can be disabled by providing the URL param `overwrite=false`, in which case the request will fail if the configSet already exists. +With this API, the default behavior is to overwrite the configset if it already exists. +This behavior can be disabled with the parameter `overwrite=false`, in which case the request will fail if the configset already exists. ==== -- @@ -168,7 +168,7 @@ Here is an example on how to upload a single file to a configset named "myConfig [.tab-label]*V1 API* With the v1 API, the `upload` command must be capitalized as `UPLOAD`. 
-The filename to upload is provided via the `filePath` URL param: +The filename to upload is provided via the `filePath` parameter: [source,bash] ---- @@ -183,7 +183,7 @@ curl -X POST --header "Content-Type:application/octet-stream" [.tab-label]*V2 API* With the v2 API, the name of the configset and file are both provided in the URL. -They can be substituted in `/cluster/configs/{config_name}/{file_name}`. +They can be substituted in `/cluster/configs/__config_name__/__file_name__`. The filename may be nested and include `/` characters. [source,bash] @@ -193,8 +193,8 @@ curl -X PUT --header "Content-Type:application/octet-stream" "http://localhost:8983/api/cluster/configs/myConfigSet/solrconfig.xml" ---- -With this REST API, the default behavior is to overwrite the file if it already exists within the configSet. -This behavior can be disabled by providing the URL param `overwrite=false`, in which case the request will fail if the file already exists within the configSet. +With this API, the default behavior is to overwrite the file if it already exists within the configset. +This behavior can be disabled with the parameter `overwrite=false`, in which case the request will fail if the file already exists within the configset. 
==== -- @@ -248,7 +248,7 @@ curl -X POST -H 'Content-type: application/json' -d '{ http://localhost:8983/api/cluster/configs?omitHeader=true ---- -With the v2 API, ConfigSet properties can also be provided via the `properties` map: +With the v2 API, configset properties can also be provided via the `properties` map: [source,bash] ---- diff --git a/solr/solr-ref-guide/src/de-duplication.adoc b/solr/solr-ref-guide/src/de-duplication.adoc index c7918bf8341d..133cb78a5392 100644 --- a/solr/solr-ref-guide/src/de-duplication.adoc +++ b/solr/solr-ref-guide/src/de-duplication.adoc @@ -56,7 +56,7 @@ The `SignatureUpdateProcessorFactory` has to be registered in `solrconfig.xml` a The `SignatureUpdateProcessorFactory` takes several properties: -signatureClass:: +`signatureClass`:: A Signature implementation for generating a signature hash. The default is `org.apache.solr.update.processor.Lookup3Signature`. + The full classpath of the implementation must be specified. The available options are described above, the associated classpaths to use are: @@ -65,16 +65,16 @@ The full classpath of the implementation must be specified. The available option * `org.apache.solr.update.processor.MD5Signature` * `org.apache.solr.update.process.TextProfileSignature` -fields:: +`fields`:: The fields to use to generate the signature hash in a comma separated list. By default, all fields on the document will be used. -signatureField:: +`signatureField`:: The name of the field used to hold the fingerprint/signature. The field should be defined in `schema.xml`. The default is `signatureField`. -enabled:: +`enabled`:: Set to *false* to disable de-duplication processing. The default is *true*. -overwriteDupes:: +`overwriteDupes`:: If *true*, the default, when a document exists that already matches this signature, it will be overwritten. If you are using `overwriteDupes=true` the `signatureField` must be `indexed="true"` in your Schema. 
.Using `SignatureUpdateProcessorFactory` in SolrCloud diff --git a/solr/solr-ref-guide/src/distributed-requests.adoc b/solr/solr-ref-guide/src/distributed-requests.adoc index e916537bb48b..b6c4da9b230c 100644 --- a/solr/solr-ref-guide/src/distributed-requests.adoc +++ b/solr/solr-ref-guide/src/distributed-requests.adoc @@ -195,7 +195,7 @@ Query will be routed to nodes with same defined system properties as the current Examples: -* Prefer stable routing (keyed to client "sessionId" param) among otherwise equivalent replicas: +* Prefer stable routing (keyed to client "sessionId" parameter) among otherwise equivalent replicas: `shards.preference=replica.base:stable:hash:sessionId&sessionId=abc123` * Prefer PULL replicas: diff --git a/solr/solr-ref-guide/src/docvalues.adoc b/solr/solr-ref-guide/src/docvalues.adoc index 6aa5c3b035a2..2f05d08d0e94 100644 --- a/solr/solr-ref-guide/src/docvalues.adoc +++ b/solr/solr-ref-guide/src/docvalues.adoc @@ -79,7 +79,7 @@ If `docValues="true"` for a field, then DocValues will automatically be used any Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will be also returned along with other stored fields when all fields (or pattern matching globs) are specified to be returned (e.g., "`fl=*`") for search queries depending on the effective value of the `useDocValuesAsStored` parameter for each field. For schema versions >= 1.6, the implicit default is `useDocValuesAsStored="true"`. See <> & <> for more details. -When `useDocValuesAsStored="false"`, non-stored DocValues fields can still be explicitly requested by name in the <>, but will not match glob patterns (`"*"`). Note that returning DocValues along with "regular" stored fields at query time has performance implications that stored fields may not because DocValues are column-oriented and may therefore incur additional cost to retrieve for each returned document. 
Also note that while returning non-stored fields from DocValues, the values of a multi-valued field are returned in sorted order rather than insertion order and may have duplicates removed, see above. If you require the multi-valued fields to be returned in the original insertion order, then make your multi-valued field as stored (such a change requires reindexing). +When `useDocValuesAsStored="false"`, non-stored DocValues fields can still be explicitly requested by name in the <>, but will not match glob patterns (`"*"`). Note that returning DocValues along with "regular" stored fields at query time has performance implications that stored fields may not because DocValues are column-oriented and may therefore incur additional cost to retrieve for each returned document. Also note that while returning non-stored fields from DocValues, the values of a multi-valued field are returned in sorted order rather than insertion order and may have duplicates removed, see above. If you require the multi-valued fields to be returned in the original insertion order, then make your multi-valued field as stored (such a change requires reindexing). In cases where the query is returning _only_ docValues fields performance may improve since returning stored fields requires disk reads and decompression whereas returning docValues fields in the fl list only requires memory access. diff --git a/solr/solr-ref-guide/src/graph.adoc b/solr/solr-ref-guide/src/graph.adoc index 4dfea59f7d44..ef690312276a 100644 --- a/solr/solr-ref-guide/src/graph.adoc +++ b/solr/solr-ref-guide/src/graph.adoc @@ -289,7 +289,7 @@ one product. A basket with just butter and one other item more strongly recommen The `maxDocFreq` parameter can be used to limit the graph "walk" to only include baskets that appear in the index a certain number of times. Since each occurrence of a basket ID in the index is a link to a product, -limiting the document frequency of the basket ID will limit the out-degree of the basket. 
The `maxDocFreq` param is +limiting the document frequency of the basket ID will limit the out-degree of the basket. The `maxDocFreq` parameter is applied per shard. If there is a single shard or documents are co-located by basket ID then the `maxDocFreq` will be an exact count. Otherwise, it will return baskets with a max size of numShards * maxDocFreq. @@ -739,4 +739,3 @@ This score give us a good indication of where to begin our *root cause analysis* } } ---- - diff --git a/solr/solr-ref-guide/src/indexing-nested-documents.adoc b/solr/solr-ref-guide/src/indexing-nested-documents.adoc index 084dc01b9f40..bc859724f8ca 100644 --- a/solr/solr-ref-guide/src/indexing-nested-documents.adoc +++ b/solr/solr-ref-guide/src/indexing-nested-documents.adoc @@ -38,8 +38,8 @@ Nested documents may be indexed via either the XML or JSON data syntax, and is a [CAUTION] ==== -.Re-Indexing Considerations -With the exception of in-place updates, Solr must internally re-index an entire nested document tree +.Reindexing Considerations +With the exception of in-place updates, Solr must internally reindex an entire nested document tree if there are updates to it. For some applications this may result in a lot of extra indexing overhead that may not be worth the performance gains at query time versus other modeling approaches. diff --git a/solr/solr-ref-guide/src/json-request-api.adoc b/solr/solr-ref-guide/src/json-request-api.adoc index 7fa758c9b32c..811bba98fde5 100644 --- a/solr/solr-ref-guide/src/json-request-api.adoc +++ b/solr/solr-ref-guide/src/json-request-api.adoc @@ -19,7 +19,9 @@ // specific language governing permissions and limitations // under the License. -Solr supports an alternate request API which accepts requests composed in part or entirely of JSON objects. This alternate API can be preferable in some situations, where its increased readability and flexibility make it easier to use than the entirely query-param driven alternative. 
There is also some functionality which can only be accessed through this JSON request API, such as much of the analytics capabilities of <> +Solr supports an alternate request API which accepts requests composed in part or entirely of JSON objects. +This alternate API can be preferable in some situations, where its increased readability and flexibility make it easier to use than the entirely query-parameter driven alternative. +There is also some functionality which can only be accessed through this JSON request API, such as much of the analytics capabilities of <> == Building JSON Requests The core of the JSON Request API is its ability to specify request parameters as JSON in the request body, as shown by the example below: diff --git a/solr/solr-ref-guide/src/jwt-authentication-plugin.adoc b/solr/solr-ref-guide/src/jwt-authentication-plugin.adoc index 9742924de83c..4ae7c3f688f1 100644 --- a/solr/solr-ref-guide/src/jwt-authentication-plugin.adoc +++ b/solr/solr-ref-guide/src/jwt-authentication-plugin.adoc @@ -162,10 +162,10 @@ Let's comment on this config: <12> Configure the audience claim. A token's 'aud' claim must match 'aud' for one of the configured issuers. <13> This issuer is auto configured through discovery, so 'iss' and JWK settings are not required -=== Using non SSL URLs +=== Using non-SSL URLs In production environments you should always use SSL protected HTTPS connections, otherwise you open yourself up to attacks. -However, in development, it may be useful to use regular http urls, and bypass the -security check that Solr performs. To support this you can set the environment variable `solr.auth.jwt.allowOutboundHttp=true`. +However, in development, it may be useful to use regular HTTP URLs, and bypass the security check that Solr performs. +To support this you can set the environment variable `-Dsolr.auth.jwt.allowOutboundHttp=true` at startup. 
== Editing JWT Authentication Plugin Configuration diff --git a/solr/solr-ref-guide/src/making-and-restoring-backups.adoc b/solr/solr-ref-guide/src/making-and-restoring-backups.adoc index ff9b055686fb..4a156f18060d 100644 --- a/solr/solr-ref-guide/src/making-and-restoring-backups.adoc +++ b/solr/solr-ref-guide/src/making-and-restoring-backups.adoc @@ -224,7 +224,7 @@ Request ID to track this action which will be processed asynchronously. == Backup/Restore Storage Repositories Solr provides a repository abstraction to allow users to backup and restore their data to a variety of different storage systems. -For example, a Solr cluster running on a local filesystem (e.g. EXT3) can store backup data on the same disk, on a remote network-mounted drive, in HDFS, or even in some popular "cloud storage" providers, depending on the 'repository' implementation chosen. +For example, a Solr cluster running on a local filesystem (e.g., EXT3) can store backup data on the same disk, on a remote network-mounted drive, in HDFS, or even in some popular "cloud storage" providers, depending on the 'repository' implementation chosen. Solr offers three different repository implementations out of the box (`LocalFileSystemRepository`, `HdfsBackupRepository`, and `GCSBackupRepository`), and allows users to create plugins for their own storage systems as needed. Users can define any number of repositories in their `solr.xml` file. @@ -331,16 +331,16 @@ GCSBackupRepository accepts the following advanced client-configuration options: `gcsWriteBufferSizeBytes`:: The buffer size, in bytes, to use when sending data to GCS. -16777216 bytes (i.e. 16 MB) is used by default if not specified. +`16777216` bytes (i.e., 16 MB) is used by default if not specified. `gcsReadBufferSizeBytes`:: The buffer size, in bytes, to use when copying data from GCS. -2097152 bytes (i.e. 2 MB) is used by default if not specified. +`2097152` bytes (i.e., 2 MB) is used by default if not specified. 
`gcsClientHttpConnectTimeoutMillis`:: The connection timeout, in milliseconds, for all HTTP requests made by the GCS client. "0" may be used to request an infinite timeout. -A negative integer, or not specifying a value at all, will result in a value of 20000 (or 20 seconds). +A negative integer, or not specifying a value at all, will result in a value of `20000` (or 20 seconds). `gcsClientHttpReadTimeoutMillis`:: The read timeout, in milliseconds, for reading data on an established connection. diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc index f4cfb40cad41..7441ec5f5dbc 100644 --- a/solr/solr-ref-guide/src/other-parsers.adoc +++ b/solr/solr-ref-guide/src/other-parsers.adoc @@ -173,7 +173,7 @@ When subordinate clause (``) is omitted, it's parsed as a _segment [#block-mask] === Block Masks: The `of` and `which` local params -The purpose of the "Block Mask" query specified as either an `of` or `which` param (depending on the parser used) is to identy the set of all documents in the index which should be treated as "parents" _(or their ancestors)_ and which documents should be treated as "children". This is important because in the "on disk" index, the relationships are flattened into "blocks" of documents, so the `of` / `which` params are needed to serve as a "mask" against the flat document blocks to identify the boundaries of every hierarchical relationship. +The purpose of the "Block Mask" query specified as either an `of` or `which` parameter (depending on the parser used) is to identy the set of all documents in the index which should be treated as "parents" _(or their ancestors)_ and which documents should be treated as "children". This is important because in the "on disk" index, the relationships are flattened into "blocks" of documents, so the `of` / `which` params are needed to serve as a "mask" against the flat document blocks to identify the boundaries of every hierarchical relationship. 
In the example queries above, we were able to use a very simple Block Mask of `doc_type:parent` because our data is very simple: every document is either a `parent` or a `child` So this query string easily distinguishes _all_ of our documents. @@ -184,7 +184,7 @@ A common mistake is to try and use a `which` parameter that is more restrictive q={!parent which="title:join"}comments:support ---- -This type of query will frequently not work the way you might expect. Since the `which` param only identifies _some_ of the "parent" documents, the resulting query can match "parent" documents it should not, because it will mistakenly identify all documents which do _not_ match the `which="title:join"` Block Mask as children of the next "parent" document in the index (that does match this Mask). +This type of query will frequently not work the way you might expect. Since the `which` parameter only identifies _some_ of the "parent" documents, the resulting query can match "parent" documents it should not, because it will mistakenly identify all documents which do _not_ match the `which="title:join"` Block Mask as children of the next "parent" document in the index (that does match this Mask). A similar problematic situation can arise when mixing parent/child documents with "simple" documents that have no children _and do not match the query used to identify 'parent' documents_. For example, if we add the following document to our existing parent/child example documents: @@ -648,7 +648,7 @@ The hash range query parser uses a special cache to improve the speedup of the q == Join Query Parser The Join query parser allows users to run queries that normalize relationships between documents. -Solr runs a subquery of the user's choosing (the `v` param), identifies all the values that matching documents have in a field of interest (the `from` param), and then returns documents where those values are contained in a second field of interest (the `to` param). 
+Solr runs a subquery of the user's choosing (the `v` parameter), identifies all the values that matching documents have in a field of interest (the `from` parameter), and then returns documents where those values are contained in a second field of interest (the `to` parameter). In practice, these semantics are much like "inner queries" in a SQL engine. As an example, consider the Solr query below: @@ -1118,7 +1118,7 @@ The non-unary operators (everything but `NOT`) support both infix `(a AND b AND `SwitchQParser` is a `QParserPlugin` that acts like a "switch" or "case" statement. -The primary input string is trimmed and then prefixed with `case.` for use as a key to lookup a "switch case" in the parser's local params. If a matching local param is found the resulting param value will then be parsed as a subquery, and returned as the parse result. +The primary input string is trimmed and then prefixed with `case.` for use as a key to lookup a "switch case" in the parser's local params. If a matching local param is found the resulting parameter value will then be parsed as a subquery, and returned as the parse result. The `case` local param can be optionally be specified as a switch case to match missing (or blank) input strings. The `default` local param can optionally be specified as a default case to use if the input string does not match any other switch case local params. If default is not specified, then any input which does not match a switch case local param will result in a syntax error. 
diff --git a/solr/solr-ref-guide/src/pagination-of-results.adoc b/solr/solr-ref-guide/src/pagination-of-results.adoc index abfbdd9cdaa1..f76eca9a9ea7 100644 --- a/solr/solr-ref-guide/src/pagination-of-results.adoc +++ b/solr/solr-ref-guide/src/pagination-of-results.adoc @@ -107,7 +107,7 @@ When the `responseHeader` no longer includes `"partialResults": true`, and `curs If `id` is your uniqueKey field, then sort parameters like `id asc` and `name asc, id desc` would both work fine, but `name asc` by itself would not . Sorts including <> based functions that involve calculations relative to `NOW` will cause confusing results, since every document will get a new sort value on every subsequent request. This can easily result in cursors that never end, and constantly return the same documents over and over – even if the documents are never updated. + -In this situation, choose & re-use a fixed value for the <> in all of your cursor requests. +In this situation, choose & re-use a fixed value for the <> in all of your cursor requests. Cursor mark values are computed based on the sort values of each document in the result, which means multiple documents with identical sort values will produce identical Cursor mark values if one of them is the last document on a page of results. In that situation, the subsequent request using that `cursorMark` would not know which of the documents with the identical mark values should be skipped. Requiring that the uniqueKey field be used as a clause in the sort criteria guarantees that a deterministic ordering will be returned, and that every `cursorMark` value will identify a unique point in the sequence of documents. 
diff --git a/solr/solr-ref-guide/src/reindexing.adoc b/solr/solr-ref-guide/src/reindexing.adoc index e2b2b2ebafa1..8c5e491c8481 100644 --- a/solr/solr-ref-guide/src/reindexing.adoc +++ b/solr/solr-ref-guide/src/reindexing.adoc @@ -83,7 +83,7 @@ Lucene works hard to insure one major version back-compatability, thus Solr 8x f If you have *not* changed your schema as part of an upgrade from one minor release to another (such as, from 7.x to a later 7.x release), you can often skip reindexing your documents. However, when upgrading to a major release, you should plan to reindex your documents. [NOTE] -You must always re-index your corpus when upgrading an index produced with a Solr version more than X-1 old. For instance, if you're upgrading to Solr 8x, an index ever used by Solr 6x must be deleted and re-ingested as outlined below. A marker is written identifying the version of Lucene used to ingest the first document. That marker is preserved in the index forever unless the index is entirely deleted. If Lucene finds a marker more than X-1 major versions old, it will refuse to open the index. +You must always reindex your corpus when upgrading an index produced with a Solr version more than X-1 old. For instance, if you're upgrading to Solr 8x, an index ever used by Solr 6x must be deleted and re-ingested as outlined below. A marker is written identifying the version of Lucene used to ingest the first document. That marker is preserved in the index forever unless the index is entirely deleted. If Lucene finds a marker more than X-1 major versions old, it will refuse to open the index. 
== Reindexing Strategies diff --git a/solr/solr-ref-guide/src/resource-loading.adoc b/solr/solr-ref-guide/src/resource-loading.adoc index e0e457b7e477..98af2269ee2d 100644 --- a/solr/solr-ref-guide/src/resource-loading.adoc +++ b/solr/solr-ref-guide/src/resource-loading.adoc @@ -20,7 +20,7 @@ Solr components can be configured using *resources*: data stored in external files that may be referred to in a location-independent fashion. Examples of resources include: files needed by schema components, e.g., a stopword list for <>; and machine-learned models for <>. -_Resources are typically resolved from the configSet_ but there are other options too. +_Resources are typically resolved from the configset_ but there are other options too. Solr's resources are generally only loaded initially when the Solr collection or Solr core is loaded. After you update a resource, you'll typically need to _reload_ the affected collections (SolrCloud) or the cores (standalone Solr). @@ -31,7 +31,7 @@ Restarting all affected Solr nodes also works. <> are the directories containing solrconfig.xml, the schema, and resources referenced by them. In SolrCloud they are in ZooKeeper whereas in standalone they are on the file system. -In either mode, configSets might be shared or might be dedicated to a configSet. +In either mode, configsets might be shared or might be dedicated to a configset. Prefer to put resources here. == Resources in Other Places @@ -41,4 +41,4 @@ This choice may make sense if the resource is too large for a configset in ZooKe However it's up to you to somehow ensure that all nodes in your cluster have access to these resources. Finally, and this is very unusual, resources can also be packaged inside `.jar` files from which they will be referenced. -That might make sense for default resources wherein a plugin user can override it via placing the same-named file in a configSet. 
+That might make sense for default resources wherein a plugin user can override it via placing the same-named file in a configset. diff --git a/solr/solr-ref-guide/src/searching-nested-documents.adoc b/solr/solr-ref-guide/src/searching-nested-documents.adoc index 9190fb2cd6ac..abe387c56185 100644 --- a/solr/solr-ref-guide/src/searching-nested-documents.adoc +++ b/solr/solr-ref-guide/src/searching-nested-documents.adoc @@ -145,7 +145,7 @@ $ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' - In this example we've used `\*:* -\_nest_path_:*` as our <> to indicate we want to consider all documents which don't have a nest path -- i.e., all "root" level document -- as the set of possible parents. -By changing the `of` param to match ancestors at specific `\_nest_path_` levels, we can narrow down the list of children we return. +By changing the `of` parameter to match ancestors at specific `\_nest_path_` levels, we can narrow down the list of children we return. In the query below, we search for all descendants of `skus` (using an `of` parameter that identifies all documents that do _not_ have a `\_nest_path_` with the prefix `/skus/*`) with a `price_i` less then `50`: [source,curl] @@ -171,7 +171,7 @@ Note that in the above example, the `/` characters in the `\_nest_path_` were "d * One level of `\` escaping is necessary to prevent the `/` from being interpreted as a {lucene-javadocs}/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches[Regex Query] * An additional level of "escaping the escape character" is necessary because the `of` local parameter is a quoted string; so we need a second `\` to ensure the first `\` is preserved and passed as is to the query parser. 
-(You can see that only a single level of of `\` escaping is needed in the body of the query string -- to prevent the Regex syntax -- because it's not a quoted string local param) +(You can see that only a single level of `\` escaping is needed in the body of the query string -- to prevent the Regex syntax -- because it's not a quoted string local param). You may find it more convenient to use <> in conjunction with <> that do not treat `/` as a special character to express the same query in a more verbose form: @@ -235,7 +235,7 @@ $ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' - In this example we've used `\*:* -\_nest_path_:*` as our <> to indicate we want to consider all documents which don't have a nest path -- i.e., all "root" level document -- as the set of possible parents. -By changing the `which` param to match ancestors at specific `\_nest_path_` levels, we can change the type of ancestors we return. In the query below, we search for `skus` (using an `which` param that identifies all documents that do _not_ have a `\_nest_path_` with the prefix `/skus/*`) that are the ancestors of `manuals` with exactly `1` page: +By changing the `which` parameter to match ancestors at specific `\_nest_path_` levels, we can change the type of ancestors we return. In the query below, we search for `skus` (using a `which` parameter that identifies all documents that do _not_ have a `\_nest_path_` with the prefix `/skus/*`) that are the ancestors of `manuals` with exactly `1` page: [source,curl] ---- diff --git a/solr/solr-ref-guide/src/solr-upgrade-notes.adoc b/solr/solr-ref-guide/src/solr-upgrade-notes.adoc index f3c5a2ee534e..38b50defad60 100644 --- a/solr/solr-ref-guide/src/solr-upgrade-notes.adoc +++ b/solr/solr-ref-guide/src/solr-upgrade-notes.adoc @@ -71,7 +71,7 @@ The Solr community is working to add support for S3 buckets in the near future. 
*Nested Docs* -Child Doc Transformer's `childFilter` param no longer applies query syntax +Child Doc Transformer's `childFilter` parameter no longer applies query syntax escaping because it's inconsistent with the rest of Solr and was limiting. This refers to `[child childFilter='field:value']`. There was no escaping here prior to 8.0 either. @@ -151,8 +151,8 @@ base_url removed from stored state* As of Solr 8.8.0, the `base_url` property was removed from the stored state for replicas (SOLR-12182). If you're able to upgrade SolrJ to 8.8.x for all of your client applications, then you can set `-Dsolr.storeBaseUrl=false` (introduced in Solr 8.8.1) to better align the stored state -in Zookeeper with future versions of Solr. However, if you are not able to upgrade SolrJ to 8.8.x for all client applications, -then leave the default `-Dsolr.storeBaseUrl=true` so that Solr will continue to store the `base_url` in Zookeeper. +in ZooKeeper with future versions of Solr. However, if you are not able to upgrade SolrJ to 8.8.x for all client applications, +then leave the default `-Dsolr.storeBaseUrl=true` so that Solr will continue to store the `base_url` in ZooKeeper. You may also see some NPE in collection state updates during a rolling upgrade to 8.8.0 from a previous version of Solr. After upgrading all nodes in your cluster to 8.8.0, collections should fully recover. Trigger another rolling restart if there are any replicas that do not recover after the upgrade to re-elect leaders. diff --git a/solr/solr-ref-guide/src/updating-parts-of-documents.adoc b/solr/solr-ref-guide/src/updating-parts-of-documents.adoc index 094e1192db57..c82c21fc25fd 100644 --- a/solr/solr-ref-guide/src/updating-parts-of-documents.adoc +++ b/solr/solr-ref-guide/src/updating-parts-of-documents.adoc @@ -129,10 +129,10 @@ Solr offers two solutions to address this: Furthermore, you _should_ (sometimes _must_) specify the Root document's ID in the `\_root_` field of this partial update. 
This is how Solr understands that you are updating a child -document, and not a Root document. Without it, Solr only guesses that the `\_route_` param is +document, and not a Root document. Without it, Solr only guesses that the `\_route_` parameter is equivalent, but it may be absent or not equivalent (e.g., when using the `implicit` router). -All of the examples below use `id` prefixes, so no `\_route_` param will be necessary for these examples. +All of the examples below use `id` prefixes, so no `\_route_` parameter will be necessary for these examples. ==== For the upcoming examples, we'll assume an index containing the same documents covered in <>: @@ -229,9 +229,11 @@ Increments a numeric value by a specific amount. Must be specified as a single n [TIP] ==== -.Preventing Atomic Updates that can't be done In-Place +.Preventing Atomic Updates That Can't Be Done In-Place -Since it can be tricky to ensure that all of the neccessary conditions are satisfied to ensure that an udpate can be don In-Place, Solr supports a request parameter option named `update.partial.requireInPlace`. When set to `true`, and Atomic Update that can not be don In-Place will fail. Users can specify this option when they would prefer that an update request "fail fast" if it can't be done In-Place. +Since it can be tricky to ensure that all of the necessary conditions are satisfied to ensure that an update can be done In-Place, Solr supports a request parameter option named `update.partial.requireInPlace`. +When set to `true`, an atomic update that cannot be done In-Place will fail. +Users can specify this option when they would prefer that an update request "fail fast" if it can't be done In-Place. ====