From 7fc493cbcb9d539fee8c8d9f7d7162aac2decdce Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 24 Nov 2024 15:58:00 +0200
Subject: [PATCH 01/10] related and edits

---
 .../data-ingestion/ingest-from-query.md | 55 ++++++++++++-------
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
index c3af2923e3..2a189d7cb9 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -3,7 +3,7 @@ title: Kusto query ingestion (set, append, replace)
 description: Learn how to use the .set, .append, .set-or-append, and .set-or-replace commands to ingest data from a query.
 ms.reviewer: orspodek
 ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 11/24/2024
 ---

 # Ingest from query (.set, .append, .set-or-append, .set-or-replace)
@@ -25,7 +25,7 @@ To cancel an ingest from query command, see [`cancel operation`](../cancel-opera

 ## Permissions

-To perform different actions on a table, specific permissions are required:
+To perform different actions on a table, you need specific permissions:

 * To add rows to an existing table using the `.append` command, you need a minimum of Table Ingestor permissions.
 * To create a new table using the various `.set` commands, you need a minimum of Database User permissions.
@@ -60,11 +60,11 @@ For more information on permissions, see [Kusto role-based access control](../..
 |--|--|--|
 |`distributed` | `bool` | If `true`, the command ingests from all nodes executing the query in parallel. Default is `false`. See [performance tips](#performance-tips).|
 |`creationTime` | `string` | The datetime value, formatted as an ISO8601 string, to use at the creation time of the ingested data extents. If unspecified, `now()` is used. When specified, make sure the `Lookback` property in the target table's effective [Extents merge policy](../merge-policy.md) is aligned with the specified value.|
-|`extend_schema` | `bool` | If `true`, the command may extend the schema of the table. Default is `false`. This option applies only to `.append`, `.set-or-append`, and `set-or-replace` commands. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
-|`recreate_schema` | `bool` | If `true`, the command may recreate the schema of the table. Default is `false`. This option applies only to the `.set-or-replace` command. This option takes precedence over the `extend_schema` property if both are set. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
+|`extend_schema` | `bool` | If `true`, the command might extend the schema of the table. Default is `false`. This option applies only to `.append`, `.set-or-append`, and `set-or-replace` commands. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
+|`recreate_schema` | `bool` | If `true`, the command might recreate the schema of the table. Default is `false`. This option applies only to the `.set-or-replace` command. This option takes precedence over the `extend_schema` property if both are set. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
 |`folder` | `string` | The folder to assign to the table. If the table already exists, this property overwrites the table's folder.|
 |`ingestIfNotExists` | `string` | If specified, ingestion fails if the table already has data tagged with an `ingest-by:` tag with the same value. For more information, see [ingest-by: tags](../extent-tags.md).|
-|`policy_ingestiontime` | `bool` | If `true`, the [Ingestion Time Policy](../show-table-ingestion-time-policy-command.md) will be enabled on the table. The default is `true`.|
+|`policy_ingestiontime` | `bool` | If `true`, the [Ingestion Time Policy](../show-table-ingestion-time-policy-command.md) is enabled on the table. The default is `true`.|
 |`tags` | `string` | A JSON string that represents a list of [tags](../extent-tags.md) to associate with the created extent. |
 |`docstring` | `string` | A description used to document the table.|
 |`persistDetails` |A Boolean value that, if specified, indicates that the command should persist the detailed results for retrieval by the [.show operation details](../show-operations.md) command. Defaults to `false`. |`with (persistDetails=true)`|
@@ -73,10 +73,10 @@ For more information on permissions, see [Kusto role-based access control](../..

 * `.set-or-replace` preserves the schema unless one of `extend_schema` or `recreate_schema` ingestion properties is set to `true`.
 * `.set-or-append` and `.append` commands preserve the schema unless the `extend_schema` ingestion property is set to `true`.
-* Matching the result set schema to that of the target table is based on the column types. There's no matching of column names. Make sure that the query result schema columns are in the same order as the table, else data will be ingested into the wrong columns.
+* Matching the result set schema to that of the target table is based on the column types. There's no matching of column names. Make sure that the query result schema columns are in the same order as the table, otherwise data is ingested into the wrong columns.

 > [!CAUTION]
-> If the schema is modified, it happens in a separate transaction before the actual data ingestion. This means the schema may be modified even when there is a failure to ingest the data.
+> If the schema is modified, it happens in a separate transaction before the actual data ingestion. This means the schema might be modified even when there is a failure to ingest the data.

 ## Character limitation

@@ -88,9 +88,15 @@ For example, in the following query, the `search` operator generates a column `$
 .set Texas <| search State has 'Texas' | project-rename tableName=$table
 ```

+## Returns
+
+Returns information on the extents created because of the `.set` or `.append` command.
+
 ## Examples

-Create a new table called :::no-loc text="RecentErrors"::: in the database that has the same schema as :::no-loc text="LogsTable"::: and holds all the error records of the last hour.
+### Create and update table from query source
+
+The following query creates the :::no-loc text=`RecentErrors`::: table with the same schema as :::no-loc text=`LogsTable`:::. It updates :::no-loc text=`RecentErrors`::: with all error logs from :::no-loc text=`LogsTable`::: over the last hour.

 ```kusto
 .set RecentErrors <|
@@ -98,7 +104,9 @@ Create a new table called :::no-loc text="RecentErrors"::: in the database that
 | where Level == "Error" and Timestamp > now() - time(1h)
 ```

-Create a new table called "OldExtents" in the database that has a single column, "ExtentId", and holds the extent IDs of all extents in the database that were created more than 30 days ago. The database has an existing table named "MyExtents". Since the dataset is expected to be bigger than 1 GB (more than ~1 million rows) use the *distributed* flag
+### Create and update table from query source using the *distributed* flag
+
+The following example creates a new table called `OldExtents` in the database asynchronously. The dataset is expected to be bigger than one GB (more than ~1 million rows) so the *distributed* flag is used. It updates `OldExtents` with data with `ExtentId` entries from the `MyExtents` table that were created more than 30 days ago.

 ```kusto
 .set async OldExtents with(distributed=true) <|
@@ -107,8 +115,9 @@ Create a new table called "OldExtents" in the database that has a single column,
 | project ExtentId
 ```

-Append data to an existing table called "OldExtents" in the current database that has a single column, "ExtentId", and holds the extent IDs of all extents in the database that have been created more than 30 days earlier.
-Mark the new extent with tags `tagA` and `tagB`, based on an existing table named "MyExtents".
+### Append data to table
+
+The following example filters `ExtentId` entries in the `MyExtents` table that were created more than 30 days ago and appends the entries to the `OldExtents` table with associated tags.

 ```kusto
 .append OldExtents with(tags='["TagA","TagB"]') <|
@@ -117,7 +126,9 @@ Mark the new extent with tags `tagA` and `tagB`, based on an existing table name
 | project ExtentId
 ```

-Append data to the "OldExtents" table in the current database, or create the table if it doesn't already exist. Tag the new extent with `ingest-by:myTag`. Do so only if the table doesn't already contain an extent tagged with `ingest-by:myTag`, based on an existing table named "MyExtents".
+### Create or append a table with possibly existing tagged data
+
+The following example either appends to or creates the `OldExtents` table asynchronously. It filters `ExtentId` entries in the `MyExtents` table that were created more than 30 days ago and specifies the tags to append to the new extents with `ingest-by:myTag`. The `ingestIfNotExists` parameter ensures that the ingestion only occurs if the data doesn't already exist in the table with the specified tag.

 ```kusto
 .set-or-append async OldExtents with(tags='["ingest-by:myTag"]', ingestIfNotExists='["myTag"]') <|
@@ -126,7 +137,9 @@ Append data to the "OldExtents" table in the current database, or create the tab
 | project ExtentId
 ```

-Replace the data in the "OldExtents" table in the current database, or create the table if it doesn't already exist. Tag the new extent with `ingest-by:myTag`.
+### Create table or replace data with associated data
+
+The following query replaces the data in the `OldExtents` table, or creates the table if it doesn't already exist, with `ExtentId` entries in the `MyExtents` table that were created more than 30 days ago. Tag the new extent with `ingest-by:myTag` if the data doesn't already exist in the table with the specified tag.

 ```kusto
 .set-or-replace async OldExtents with(tags='["ingest-by:myTag"]', ingestIfNotExists='["myTag"]') <|
@@ -135,7 +148,9 @@ Replace the data in the "OldExtents" table in the current database, or create th
 | project ExtentId
 ```

-Append data to the "OldExtents" table in the current database, while setting the extents creation time to a specific datetime in the past.
+### Append data with associated data
+
+The following example appends data to the `OldExtents` table asynchronously, using `ExtentId` entries from the `MyExtents` table that were created more than 30 days ago. It sets a specific creation time for the new extents.

 ```kusto
 .append async OldExtents with(creationTime='2017-02-13T11:09:36.7992775Z') <|
@@ -144,12 +159,14 @@ Append data to the "OldExtents" table in the current database, while setting the
 | project ExtentId
 ```

-**Return output**
-
-Returns information on the extents created because of the `.set` or `.append` command.
-
-**Example output**
+**Sample output**

 |ExtentId |OriginalSize |ExtentSize |CompressedSize |IndexSize |RowCount |
 |--|--|--|--|--|--|
 |23a05ed6-376d-4119-b1fc-6493bcb05563 |1291 |5882 |1568 |4314 |10 |
+
+## Related content
+
+* [Data formats supported for ingestion](../../ingestion-supported-formats.md)
+* [Inline ingestion](ingest-inline.md)
+* [Ingest from storage](ingest-from-storage.md)
\ No newline at end of file

From 32dc7ba2cd5af911209d51bd66622c40f5cecd1c Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 24 Nov 2024 16:28:26 +0200
Subject: [PATCH 02/10] edits

---
 .../data-ingestion/ingest-from-query.md | 24 ++++++++++++-------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
index 2a189d7cb9..aaeba7c2cb 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -13,15 +13,21 @@ These commands execute a query or a management command and ingest the results of

 |Command |If table exists |If table doesn't exist |
 |-----------------|------------------------------------|------------------------------------------|
-|`.set` |The command fails |The table is created and data is ingested|
-|`.append` |Data is appended to the table |The command fails |
-|`.set-or-append` |Data is appended to the table |The table is created and data is ingested|
-|`.set-or-replace`|Data replaces the data in the table|The table is created and data is ingested|
+|`.set` |The command fails. |The table is created and data is ingested.|
+|`.append` |Data is appended to the table. |The command fails.|
+|`.set-or-append` |Data is appended to the table. |The table is created and data is ingested.|
+|`.set-or-replace`|Data replaces the data in the table.|The table is created and data is ingested.|

 To cancel an ingest from query command, see [`cancel operation`](../cancel-operation-command.md).

+::: moniker range="azure-data-explorer"
 > [!NOTE]
 > Ingest from query is a [direct ingestion](/azure/data-explorer/ingest-data-overview#direct-ingestion-with-management-commands). As such, it does not include automatic retries. Automatic retries are available when ingesting through the data management service. Use the [ingestion overview](/azure/data-explorer/ingest-data-overview) document to decide which is the most suitable ingestion option for your scenario.
+::: moniker-end
+::: moniker range="microsoft-fabric"
+> [!NOTE]
+> Ingest from query is a [direct ingestion](/azure/data-explorer/ingest-data-overview#direct-ingestion-with-management-commands). As such, it does not include automatic retries. Automatic retries are available when ingesting through the data management service.
+::: moniker-end

 ## Permissions

@@ -50,22 +56,22 @@ For more information on permissions, see [Kusto role-based access control](../..

 ## Performance tips

-* Set the `distributed` property to `true` if the amount of data produced by the query is large, exceeds 1 GB, and doesn't require serialization. Then, multiple nodes can produce output in parallel. Don't use this flag when query results are small, since it might needlessly generate many small data shards.
+* Set the `distributed` property to `true` if the amount of data produced by the query is large, exceeds one gigabyte (GB), and doesn't require serialization. Then, multiple nodes can produce output in parallel. Don't use this flag when query results are small, since it might needlessly generate many small data shards.
 * Data ingestion is a resource-intensive operation that might affect concurrent activities on the database, including running queries. Avoid running too many ingestion commands at the same time.
-* Limit the data for ingestion to less than 1 GB per ingestion operation. If necessary, use multiple ingestion commands.
+* Limit the data for ingestion to less than one GB per ingestion operation. If necessary, use multiple ingestion commands.

 ## Supported ingestion properties

 |Property|Type|Description|
 |--|--|--|
 |`distributed` | `bool` | If `true`, the command ingests from all nodes executing the query in parallel. Default is `false`. See [performance tips](#performance-tips).|
-|`creationTime` | `string` | The datetime value, formatted as an ISO8601 string, to use at the creation time of the ingested data extents. If unspecified, `now()` is used. When specified, make sure the `Lookback` property in the target table's effective [Extents merge policy](../merge-policy.md) is aligned with the specified value.|
+|`creationTime` | `string` | The `datetime` value, formatted as an ISO8601 `string`, to use at the creation time of the ingested data extents. If unspecified, `now()` is used. When specified, make sure the `Lookback` property in the target table's effective [Extents merge policy](../merge-policy.md) is aligned with the specified value.|
 |`extend_schema` | `bool` | If `true`, the command might extend the schema of the table. Default is `false`. This option applies only to `.append`, `.set-or-append`, and `set-or-replace` commands. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
 |`recreate_schema` | `bool` | If `true`, the command might recreate the schema of the table. Default is `false`. This option applies only to the `.set-or-replace` command. This option takes precedence over the `extend_schema` property if both are set. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
 |`folder` | `string` | The folder to assign to the table. If the table already exists, this property overwrites the table's folder.|
 |`ingestIfNotExists` | `string` | If specified, ingestion fails if the table already has data tagged with an `ingest-by:` tag with the same value. For more information, see [ingest-by: tags](../extent-tags.md).|
 |`policy_ingestiontime` | `bool` | If `true`, the [Ingestion Time Policy](../show-table-ingestion-time-policy-command.md) is enabled on the table. The default is `true`.|
-|`tags` | `string` | A JSON string that represents a list of [tags](../extent-tags.md) to associate with the created extent. |
+|`tags` | `string` | A JSON `string` that represents a list of [tags](../extent-tags.md) to associate with the created extent. |
 |`docstring` | `string` | A description used to document the table.|
 |`persistDetails` |A Boolean value that, if specified, indicates that the command should persist the detailed results for retrieval by the [.show operation details](../show-operations.md) command. Defaults to `false`. |`with (persistDetails=true)`|

@@ -106,7 +112,7 @@ The following query creates the :::no-loc text=`RecentErrors`::: table with the

 ### Create and update table from query source using the *distributed* flag

-The following example creates a new table called `OldExtents` in the database asynchronously. The dataset is expected to be bigger than one GB (more than ~1 million rows) so the *distributed* flag is used. It updates `OldExtents` with data with `ExtentId` entries from the `MyExtents` table that were created more than 30 days ago.
+The following example creates a new table called `OldExtents` in the database asynchronously. The dataset is expected to be bigger than one GB (more than ~one million rows) so the *distributed* flag is used. It updates `OldExtents` with data with `ExtentId` entries from the `MyExtents` table that were created more than 30 days ago.

 ```kusto
 .set async OldExtents with(distributed=true) <|

From 52455b0bb7fae3004a44bf03f4fa8fda666e57c1 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Sun, 24 Nov 2024 17:44:43 +0200
Subject: [PATCH 03/10] edits

---
 .../kusto/management/data-ingestion/ingest-from-query.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
index aaeba7c2cb..05c4fe8367 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -102,7 +102,7 @@ Returns information on the extents created because of the `.set` or `.append` co

 ### Create and update table from query source

-The following query creates the :::no-loc text=`RecentErrors`::: table with the same schema as :::no-loc text=`LogsTable`:::. It updates :::no-loc text=`RecentErrors`::: with all error logs from :::no-loc text=`LogsTable`::: over the last hour.
+The following query creates the :::no-loc text="`RecentErrors`"::: table with the same schema as :::no-loc text="`LogsTable`":::. It updates :::no-loc text="`RecentErrors`"::: with all error logs from :::no-loc text="`LogsTable`"::: over the last hour.

 ```kusto
 .set RecentErrors <|
@@ -112,7 +112,7 @@ The following query creates the :::no-loc text=`RecentErrors`::: table with the

 ### Create and update table from query source using the *distributed* flag

-The following example creates a new table called `OldExtents` in the database asynchronously. The dataset is expected to be bigger than one GB (more than ~one million rows) so the *distributed* flag is used. It updates `OldExtents` with data with `ExtentId` entries from the `MyExtents` table that were created more than 30 days ago.
+The following example creates a new table called `OldExtents` in the database, asynchronously. The dataset is expected to be bigger than one GB (more than ~one million rows) so the *distributed* flag is used. It updates `OldExtents` with `ExtentId` entries from the `MyExtents` table that were created more than 30 days ago.

 ```kusto
 .set async OldExtents with(distributed=true) <|
@@ -167,6 +167,8 @@ The following example appends data to the `OldExtents` table asynchronously, usi

 **Sample output**

+The following is a sample of the type of output you may see from your queries.
+
 |ExtentId |OriginalSize |ExtentSize |CompressedSize |IndexSize |RowCount |
 |--|--|--|--|--|--|
 |23a05ed6-376d-4119-b1fc-6493bcb05563 |1291 |5882 |1568 |4314 |10 |

From c0383aa9c5c095ac8c6966eaf170e47c3571e3c4 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 25 Nov 2024 13:42:53 +0200
Subject: [PATCH 04/10] MI and edits

---
 .../data-ingestion/ingest-from-storage.md | 32 +++++++++++--------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md b/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md
index d97974ff11..78c41326b3 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md
@@ -3,7 +3,7 @@ title: Kusto.ingest into command (pull data from storage)
 description: This article describes The .ingest into command (pull data from storage).
 ms.reviewer: orspodek
 ms.topic: reference
-ms.date: 08/11/2024
+ms.date: 11/25/2024
 ---

 # Ingest from storage
@@ -12,7 +12,7 @@ ms.date: 08/11/2024
 The `.ingest into` command ingests data into a table by "pulling" the data from one or more cloud storage files. For example, the command
-can retrieve 1000 CSV-formatted blobs from Azure Blob Storage, parse
+can retrieve 1,000 CSV-formatted blobs from Azure Blob Storage, parse
 them, and ingest them together into a single target table.
 Data is appended to the table without affecting existing records, and without modifying the table's schema.
@@ -44,7 +44,7 @@ You must have at least [Table Ingestor](../../access-control/role-based-access-c

 ## Authentication and authorization

-Each storage connection string indicates the authorization method to use for access to the storage. Depending on the authorization method, the principal may need to be granted permissions on the external storage to perform the ingestion.
+Each storage connection string indicates the authorization method to use for access to the storage. Depending on the authorization method, the principal might need to be granted permissions on the external storage to perform the ingestion.

 The following table lists the supported authentication methods and the permissions needed for ingesting data from external storage.

@@ -58,17 +58,15 @@ The following table lists the supported authentication methods and the permissio

 ## Returns

-The result of the command is a table with as many records
-as there are data shards ("extents") generated by the command.
-If no data shards have been generated, a single record is returned
-with an empty (zero-valued) extent ID.
+The result of the command is a table with as many records as there are data shards ("extents") generated by the command.
+If no data shards were generated, a single record is returned with an empty (zero-valued) extent ID.

-|Name |Type |Description |
-|-----------|----------|---------------------------------------------------------------------------|
+|Name |Type |Description |
+|-----------|----------|-----------------------------------------------------------|
 |ExtentId |`guid` |The unique identifier for the data shard that was generated by the command.|
-|ItemLoaded |`string` |One or more storage files that are related to this record. |
+|ItemLoaded |`string` |One or more storage files that are related to this record.|
 |Duration |`timespan`|How long it took to perform ingestion. |
-|HasErrors |`bool` |Whether this record represents an ingestion failure or not. |
+|HasErrors |`bool` |Whether or not this record represents an ingestion failure.|
 |OperationId|`guid` |A unique ID representing the operation. Can be used with the `.show operation` command.|

 >[!NOTE]
@@ -78,7 +76,7 @@ with an empty (zero-valued) extent ID.

 ### Azure Blob Storage with shared access signature

-The following example instructs your database to read two blobs from Azure Blob Storage as CSV files, and ingest their contents into table `T`. The `...` represents an Azure Storage shared access signature (SAS) which gives read access to each blob. Note also the use of obfuscated strings (the `h` in front of the string values) to ensure that the SAS is never recorded.
+The following example instructs your database to read two blobs from Azure Blob Storage as CSV files, and ingest their contents into table `T`. The `...` represents an Azure Storage shared access signature (SAS) which gives read access to each blob. Obfuscated strings (the `h` in front of the string values) are used to ensure that the SAS is never recorded.

 ```kusto
 .ingest into table T (
@@ -89,7 +87,7 @@ The following example instructs your database to read two blobs from Azure Blob

 ### Azure Blob Storage with managed identity

-The following example shows how to read a CSV file from Azure Blob Storage and ingest its contents into table `T` using managed identity authentication. For additional information on managed identity authentication method, see [Managed Identity Authentication Overview](../../api/connection-strings/storage-connection-strings.md#managed-identity).
+The following example shows how to read a CSV file from Azure Blob Storage and ingest its contents into table `T` using managed identity authentication. Authentication uses the managed identity ID (object ID) set for the Azure Blob Storage in Azure. For more information, see [Create a managed identity for storage containers](/azure/ai-services/language-service/native-document-support/managed-identities).

 ```kusto
 .ingest into table T ('https://StorageAccount.blob.core.windows.net/Container/file.csv;managed_identity=802bada6-4d21-44b2-9d15-e66b29e4d63e')
@@ -137,3 +135,11 @@ The following example ingests a single file from Amazon S3 using a [preSigned UR
 .ingest into table T ('https://bucketname.s3.us-east-1.amazonaws.com/file.csv?<>')
 with (format='csv')
 ```
+
+## Related content
+
+* [Data formats supported for ingestion](../../ingestion-supported-formats.md)
+* [Inline ingestion](ingest-inline.md)
+* [Ingest from query (.set, .append, .set-or-append, .set-or-replace)](ingest-from-query.md)
+* [.show ingestion failures command](../ingestion-failures.md)
+* [.show ingestion mapping](../show-ingestion-mapping-command.md)

From 8d2aeffbc79227417471f6c6503c0f56a11df3e5 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 25 Nov 2024 16:41:01 +0200
Subject: [PATCH 05/10] edits

---
 .../kusto/management/data-ingestion/ingest-from-storage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md b/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md
index 78c41326b3..b68750e1e0 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-storage.md
@@ -87,7 +87,7 @@ The following example instructs your database to read two blobs from Azure Blob

 ### Azure Blob Storage with managed identity

-The following example shows how to read a CSV file from Azure Blob Storage and ingest its contents into table `T` using managed identity authentication. Authentication uses the managed identity ID (object ID) set for the Azure Blob Storage in Azure. For more information, see [Create a managed identity for storage containers](/azure/ai-services/language-service/native-document-support/managed-identities).
+The following example shows how to read a CSV file from Azure Blob Storage and ingest its contents into table `T` using managed identity authentication. Authentication uses the managed identity ID (object ID) assigned to the Azure Blob Storage in Azure. For more information, see [Create a managed identity for storage containers](/azure/ai-services/language-service/native-document-support/managed-identities).

 ```kusto
 .ingest into table T ('https://StorageAccount.blob.core.windows.net/Container/file.csv;managed_identity=802bada6-4d21-44b2-9d15-e66b29e4d63e')

From ab1593ba888458ce45111d281d401fc85256c98d Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 25 Nov 2024 17:45:26 +0200
Subject: [PATCH 06/10] edits

---
 .../kusto/management/data-ingestion/ingest-from-query.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
index 05c4fe8367..d36159d74c 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -102,7 +102,7 @@ Returns information on the extents created because of the `.set` or `.append` co

 ### Create and update table from query source

-The following query creates the :::no-loc text="`RecentErrors`"::: table with the same schema as :::no-loc text="`LogsTable`":::. It updates :::no-loc text="`RecentErrors`"::: with all error logs from :::no-loc text="`LogsTable`"::: over the last hour.
+The following query creates the :::no-loc text="\`RecentErrors\`"::: table with the same schema as :::no-loc text="\`LogsTable\`":::. It updates :::no-loc text="\`RecentErrors\`"::: with all error logs from :::no-loc text="\`LogsTable\`"::: over the last hour.

 ```kusto
 .set RecentErrors <|

From af7b4881bc9f73af76af764bd9f5ff4bfa4ab65b Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 25 Nov 2024 17:53:27 +0200
Subject: [PATCH 07/10] edits

---
 .../kusto/management/data-ingestion/ingest-from-query.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
index d36159d74c..37ae790100 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -102,7 +102,7 @@ Returns information on the extents created because of the `.set` or `.append` co

 ### Create and update table from query source

-The following query creates the :::no-loc text="\`RecentErrors\`"::: table with the same schema as :::no-loc text="\`LogsTable\`":::. It updates :::no-loc text="\`RecentErrors\`"::: with all error logs from :::no-loc text="\`LogsTable\`"::: over the last hour.
+The following query creates the `:::no-loc text="RecentErrors":::` table with the same schema as `:::no-loc text="LogsTable":::`. It updates `:::no-loc text="RecentErrors":::` with all error logs from `:::no-loc text="LogsTable":::` over the last hour.

 ```kusto
 .set RecentErrors <|

From 6b5bd05c391a555125a72a4aa340dbbc07483fe5 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 25 Nov 2024 18:03:57 +0200
Subject: [PATCH 08/10] edits

---
 .../kusto/management/data-ingestion/ingest-from-query.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
index 37ae790100..48eebab060 100644
--- a/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
+++ b/data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -102,7 +102,7 @@ Returns information on the extents created because of the `.set` or `.append` co

 ### Create and update table from query source

-The following query creates the `:::no-loc text="RecentErrors":::` table with the same schema as `:::no-loc text="LogsTable":::`. It updates `:::no-loc text="RecentErrors":::` with all error logs from `:::no-loc text="LogsTable":::` over the last hour.
+The following query creates the *:::no-loc text="RecentErrors":::* table with the same schema as *:::no-loc text="LogsTable":::*. It updates *:::no-loc text="RecentErrors":::* with all error logs from *:::no-loc text="LogsTable":::* over the last hour.

 ```kusto
 .set RecentErrors <|

From 0e44fff9567802a58f0ec9d8c5855e2e22ef61a1 Mon Sep 17 00:00:00 2001
From: Meira Josephy <144697924+mjosephym@users.noreply.github.com>
Date: Mon, 25 Nov 2024 20:01:44 +0200
Subject: [PATCH 09/10] negate, edits

---
 data-explorer/kusto/query/regex.md | 101 +++++++++++++++--------------
 1 file changed, 51 insertions(+), 50 deletions(-)

diff --git a/data-explorer/kusto/query/regex.md b/data-explorer/kusto/query/regex.md
index 77e99bb6bb..c760f8c692 100644
--- a/data-explorer/kusto/query/regex.md
+++ b/data-explorer/kusto/query/regex.md
@@ -3,7 +3,7 @@ title: Regex syntax
 description: Learn about the regular expression syntax supported by Kusto Query Language (KQL).
 ms.reviewer: alexans
 ms.topic: reference
-ms.date: 09/01/2024
+ms.date: 11/25/2024
 ---

 # Regex syntax
@@ -21,33 +21,34 @@ The following sections document the regular expression syntax supported by Kusto

 ### Match one character

-| Pattern | Description |
+| Pattern | Description |
 |-------------|-----------------------------------------------------------------|
-| `.` | Any character except new line (includes new line with s flag) |
-| `[0-9]` | Any ASCII digit |
-| `\d` | Digit (`\p{Nd}`) |
-| `\D` | Not a digit |
-| `\pX` | Unicode character class identified by a one-letter name |
-| `\p{Greek}` | Unicode character class (general category or script) |
-| `\PX` | Negated Unicode character class identified by a one-letter name |
-| `\P{Greek}` | Negated Unicode character class (general category or script) |
+| `.` | Any character except new line (includes new line with s flag).|
+| `[0-9]` | Any ASCII digit.|
+| `[^0-9]` | Any character that isn't an ASCII digit. |
+| `\d` | Digit (`\p{Nd}`). |
+| `\D` | Not a digit.|
+| `\pX` | Unicode character class identified by a one-letter name.|
+| `\p{Greek}` | Unicode character class (general category or script).|
+| `\PX` | Negated Unicode character class identified by a one-letter name.|
+| `\P{Greek}` | Negated Unicode character class (general category or script). |

 ### Character classes

 | Pattern | Description |
 |----------------|-------------------------------------------------------------------------|
 | `[xyz]` | Character class matching either x, y or z (union). |
-| `[^xyz]` | Character class matching any character except x, y and z. |
+| `[^xyz]` | Character class matching any character except x, y, and z. |
 | `[a-z]` | Character class matching any character in range a-z. |
-| `[[:alpha:]]` | ASCII character class ([A-Za-z]) |
-| `[[:^alpha:]]` | Negated ASCII character class ([^A-Za-z]) |
-| `[x[^xyz]]` | Nested/grouping character class (matching any character except y and z) |
-| `[a-y&&xyz]` | Intersection (matching x or y) |
-| `[0-9&&[^4]]` | Subtraction using intersection and negation (matching 0-9 except 4) |
-| `[0-9--4]` | Direct subtraction (matching 0-9 except 4) |
-| `[a-g~~b-h]` | Symmetric difference (matching `a` and `h` only) |
-| `[\[\]]` | Escape in character classes (matching [ or ]) |
-| `[a&&b]` | Empty character class matching nothing |
+| `[[:alpha:]]` | ASCII character class ([A-Za-z]). |
+| `[[:^alpha:]]` | Negated ASCII character class ([^A-Za-z]). |
+| `[x[^xyz]]` | Nested/grouping character class (matching any character except y and z). |
+| `[a-y&&xyz]` | Intersection (matching x or y). |
+| `[0-9&&[^4]]` | Subtraction using intersection and negation (matching 0-9 except 4). |
+| `[0-9--4]` | Direct subtraction (matching 0-9 except 4). |
+| `[a-g~~b-h]` | Symmetric difference (matching `a` and `h` only). |
+| `[\[\]]` | Escape in character classes (matching [ or ]). |
+| `[a&&b]` | Empty character class matching nothing.
| > [!NOTE] > Any named character class may appear inside a bracketed `[...]` character class. For example, `[\p{Greek}[:digit:]]` matches any ASCII digit or any codepoint in the `Greek` script. `[\p{Greek}&&\pL]` matches Greek letters. @@ -87,27 +88,27 @@ Precedence in character classes is from most binding to least binding: | Pattern | Description | |------------------ |---------------------------------------------------------------------------| -| `^` | Beginning of a haystack (or start-of-line with multi-line mode) | -| `$` | End of a haystack (or end-of-line with multi-line mode) | -| `\A` | Only the beginning of a haystack (even with multi-line mode enabled) | -| `\z` | Only the end of a haystack (even with multi-line mode enabled) | -| `\b` | Unicode word boundary (`\w` on one side and `\W`, `\A`, or `\z` on other) | -| `\B` | Not a Unicode word boundary | -| `\b{start}`, `\<` | Unicode start-of-word boundary (`\W\|\A` on the left, `\w` on the right) | -| `\b{end}`, `\>` | Unicode end-of-word boundary (`\w` on the left, `\W\|\z` on the right) | -| `\b{start-half}` | Half of a Unicode start-of-word boundary (`\W\|\A` on the left) | -| `\b{end-half}` | Half of a Unicode end-of-word boundary (`\W\|\z` on the right) | +| `^` | Beginning of a haystack or start-of-line with multi-line mode. | +| `$` | End of a haystack or end-of-line with multi-line mode. | +| `\A` | Only the beginning of a haystack, even with multi-line mode enabled. | +| `\z` | Only the end of a haystack, even with multi-line mode enabled. | +| `\b` | Unicode word boundary with `\w` on one side and `\W`, `\A`, or `\z` on the other. | +| `\B` | Not a Unicode word boundary. | +| `\b{start}`, `\<` | Unicode start-of-word boundary with `\W\|\A` on the left and `\w` on the right. | +| `\b{end}`, `\>` | Unicode end-of-word boundary with `\w` on the left and `\W\|\z` on the right. 
| +| `\b{start-half}` | Half of a Unicode start-of-word boundary with `\W\|\A` on the left. | +| `\b{end-half}` | Half of a Unicode end-of-word boundary with `\W\|\z` on the right. | ### Grouping and flags | Pattern | Description | |------------------|-------------------------------------------------------------------| -| `(exp)` | Numbered capture group (indexed by opening parenthesis) | -| `(?P<name>exp)` | Named (also numbered) capture group (names must be alpha-numeric) | -| `(?<name>exp)` | Named (also numbered) capture group (names must be alpha-numeric) | -| `(?:exp)` | Non-capturing group | -| `(?flags)` | Set flags within current group | -| `(?flags:exp)` | Set flags for exp (non-capturing) | +| `(exp)` | Numbered capture group (indexed by opening parenthesis). | +| `(?P<name>exp)` | Named capture group (names must be alpha-numeric). | +| `(?<name>exp)` | Named capture group (names must be alpha-numeric). | +| `(?:exp)` | Non-capturing group. | +| `(?flags)` | Set flags within current group. | +| `(?flags:exp)` | Set flags for exp (non-capturing). | Capture group names can contain only alpha-numeric Unicode codepoints, dots `.`, underscores `_`, and square brackets`[` and `]`. Names must start with either an `_` or an alphabetic codepoint. Alphabetic codepoints correspond to the `Alphabetic` Unicode property, while numeric codepoints correspond to the union of the `Decimal_Number`, `Letter_Number` and `Other_Number` general categories. @@ -115,17 +116,17 @@ Flags are single characters. For example, `(?x)` sets the flag `x` and `(?-x)` c -| Flag | Description | +| Flag | Description | |---------|-------------------------------------------------------------------------------| -| `i` | Case-insensitive: letters match both upper and lower case | -| `m` | Multi-line mode: `^` and `$` match begin/end of line | -| `s` | Allow dot (.). 
to match `\n` | -| `R` | Enables CRLF mode: when multi-line mode is enabled, `\r\n` is used | -| `U` | Swap the meaning of `x*` and `x*?` | -| `u` | Unicode support (enabled by default) | -| `x` | Verbose mode, ignores whitespace and allow line comments (starting with `#`) | +| `i` | Case-insensitive: letters match both upper and lower case. | +| `m` | Multi-line mode: `^` and `$` match begin/end of line. | +| `s` | Allow dot (`.`) to match `\n`. | +| `R` | Enables CRLF mode: when multi-line mode is enabled, `\r\n` is used. | +| `U` | Swap the meaning of `x*` and `x*?`. | +| `u` | Unicode support (enabled by default).| +| `x` | Verbose mode: ignores whitespace and allows line comments (starting with `#`). | -Note that in verbose mode, whitespace is ignored everywhere, including within character classes. To insert whitespace, use its escaped form or a hex literal. For example, `\ ` or `\x20` for an ASCII space. +In verbose mode, whitespace is ignored everywhere, including within character classes. To insert whitespace, use its escaped form or a hex literal. For example, `\ ` or `\x20` for an ASCII space. > [!NOTE] > @@ -204,13 +205,13 @@ These classes are based on the definitions provided in [UTS#18](https://www.unic This section provides some guidance on speed and resource usage of regex expressions. -### Unicode can impact memory usage and search speed +### Unicode can affect memory usage and search speed -KQL regex provides first class support for Unicode. In many cases, the extra memory required to support Unicode is negligible and won't typically impact search speed. +KQL regex provides first-class support for Unicode. In many cases, the extra memory required to support Unicode is negligible and doesn't typically affect search speed. 
-The following are some examples of Unicode character classes that may impact memory usage and search speed: +The following are some examples of Unicode character classes that can affect memory usage and search speed: -* **Memory usage**: The impact of Unicode primarily arises from the use of Unicode character classes. Unicode character classes tend to be larger in size. For example, the `\w` character class matches around 140,000 distinct codepoints by default. This requires additional memory and can slow down regex compilation. If your requirements can be satisfied by ASCII, it is recommended to use ASCII classes instead of Unicode classes. The ASCII-only version of `\w` can be expressed in multiple ways, all of which are equivalent. +* **Memory usage**: The effect of Unicode primarily arises from the use of Unicode character classes. Unicode character classes tend to be larger in size. For example, the `\w` character class matches around 140,000 distinct codepoints by default. This requires more memory and can slow down regex compilation. If ASCII satisfies your requirements, use ASCII classes instead of Unicode classes. The ASCII-only version of `\w` can be expressed in multiple ways, all of which are equivalent. ``` [0-9A-Za-z_] @@ -219,7 +220,7 @@ The following are some examples of Unicode character classes that may impact mem [\w&&\p{ascii}] ``` -* **Search speed**: Unicode tends to be handled pretty well, even when using large Unicode character classes. However, some of the faster internal regex engines cannot handle a Unicode aware word boundary assertion. So if you don't need Unicode-aware word boundary assertions, you might consider using `(?-u:\b)` instead of `\b`. The `(?-u:\b)` uses an ASCII-only definition of a word character, which can improve search speed. +* **Search speed**: Unicode tends to be handled well, even when using large Unicode character classes. 
However, some of the faster internal regex engines can't handle a Unicode-aware word boundary assertion. So if you don't need Unicode-aware word boundary assertions, you might consider using `(?-u:\b)` instead of `\b`. The `(?-u:\b)` assertion uses an ASCII-only definition of a word character, which can improve search speed. ### Literals can accelerate searches From 1ed98c8dbc15750f42c9afd4a1e56fe3d4e4dc02 Mon Sep 17 00:00:00 2001 From: Meira Josephy <144697924+mjosephym@users.noreply.github.com> Date: Wed, 27 Nov 2024 11:00:00 +0200 Subject: [PATCH 10/10] monitor, edits --- .../query/visualization-stackedareachart.md | 48 +++++++++++-------- 1 file changed, 29 insertions(+), 19 deletions(-) diff --git a/data-explorer/kusto/query/visualization-stackedareachart.md b/data-explorer/kusto/query/visualization-stackedareachart.md index 6dc8cf7764..6830d19cf9 100644 --- a/data-explorer/kusto/query/visualization-stackedareachart.md +++ b/data-explorer/kusto/query/visualization-stackedareachart.md @@ -3,12 +3,12 @@ title: Stacked area chart visualization description: This article describes the stacked area chart visualization. ms.reviewer: alexans ms.topic: reference -ms.date: 08/11/2024 +ms.date: 11/27/2024 monikerRange: "microsoft-fabric || azure-data-explorer" --- # Stacked area chart -> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] +> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] The stacked area chart visual shows a continuous relationship. 
This visual is similar to the [Area chart](visualization-areachart.md), but shows the area under each element of a series. The first column of the query should be numeric and is used as the x-axis. Other numeric columns are the y-axes. Unlike line charts, area charts also visually represent volume. Area charts are ideal for indicating the change among different datasets. @@ -24,31 +24,33 @@ The stacked area chart visual shows a continuous relationship. This visual is si ## Supported parameters | Name | Type | Required | Description | -| -- | -- | -- | -- | -| *T* | `string` | :heavy_check_mark: | Input table name.| -| *propertyName*, *propertyValue* | `string` | | A comma-separated list of key-value property pairs. See [supported properties](#supported-properties).| +|--|--|--|--| +| *T* | `string` | :heavy_check_mark: | Input table name. | +| *propertyName*, *propertyValue* | `string` | | A comma-separated list of key-value property pairs. See [supported properties](#supported-properties). | ### Supported properties All properties are optional. -|*PropertyName*|*PropertyValue* | -|--------------|----------------------------------------------------------------------------------| -|`accumulate` |Whether the value of each measure gets added to all its predecessors. (`true` or `false`)| -|`legend` |Whether to display a legend or not (`visible` or `hidden`). | -|`series` |Comma-delimited list of columns whose combined per-record values define the series that record belongs to.| -|`ymin` |The minimum value to be displayed on Y-axis. | -|`ymax` |The maximum value to be displayed on Y-axis. | -|`title` |The title of the visualization (of type `string`). | -|`xaxis` |How to scale the x-axis (`linear` or `log`). | -|`xcolumn` |Which column in the result is used for the x-axis. | -|`xtitle` |The title of the x-axis (of type `string`). | -|`yaxis` |How to scale the y-axis (`linear` or `log`). 
| +| `ycolumns` | Comma-delimited list of columns that consist of the values provided per value of the x column. | +| `ytitle` | The title of the y-axis (of type `string`). | ## Example +The following query summarizes data from the `nyc_taxi` table by number of passengers and visualizes the data in a stacked area chart. The x-axis shows the pickup time in two-day intervals, and the stacked areas represent different passenger counts. + :::moniker range="azure-data-explorer" > [!div class="nextstepaction"] > Run the query @@ -60,4 +62,12 @@ nyc_taxi | render stackedareachart with (xcolumn=pickup_datetime, series=passenger_count) ``` +**Output** + :::image type="content" source="media/visualization-stacked-areachart/stacked-area-chart.png" alt-text="Screenshot of stacked area chart visual output." 
lightbox="media/visualization-stacked-areachart/stacked-area-chart.png"::: + +## Related content + +* [render operator](render-operator.md) +* [bin()](bin-function.md) +* [summarize operator](summarize-operator.md) \ No newline at end of file