Commit

Merge pull request #2401 from MicrosoftDocs/main638628590923458725syn…
…c_temp

For protected branch, push strategy should use PR and merge to target branch method to work around git push error
learn-build-service-prod[bot] authored Sep 25, 2024
2 parents 7636ff8 + d7c25fe commit 3bbb53f
Showing 2 changed files with 26 additions and 29 deletions.
27 changes: 13 additions & 14 deletions data-explorer/kusto/functions-library/time-weighted-val-fl.md
@@ -14,7 +14,7 @@ The function `time_weighted_val_fl()` is a [user-defined function (UDF)](../quer

## Syntax

`T | invoke time_weighted_avg_fl(`*t_col*`,` *y_col*`,` *key_col*`,` *stime*`,` *etime*`,` *dt*`,` *keep_orig*`)`
`T | invoke time_weighted_avg_fl(`*t_col*`,` *y_col*`,` *key_col*`,` *stime*`,` *etime*`,` *dt*`)`

[!INCLUDE [syntax-conventions-note](../includes/syntax-conventions-note.md)]

@@ -28,7 +28,6 @@ The function `time_weighted_val_fl()` is a [user-defined function (UDF)](../quer
| *stime* | `datetime` | :heavy_check_mark: | The start time of the aggregation window.|
| *etime* | `datetime` | :heavy_check_mark: | The end time of the aggregation window.|
| *dt* | `timespan` | :heavy_check_mark: | The aggregation time bin.|
| *keep_orig* | `bool` | | Keep original samples. Default is false.|

## Function definition

@@ -42,15 +41,15 @@ Define the function using the following [let statement](../query/let-statement.m
> A [let statement](../query/let-statement.md) can't run on its own. It must be followed by a [tabular expression statement](../query/tabular-expression-statements.md). To run a working example of `time_weighted_avg_fl()`, see [Example](#example).
```kusto
let time_weighted_val_fl=(tbl:(*), t_col:string, y_col:string, key_col:string, stime:datetime, etime:datetime, dt:timespan, keep_orig:bool=false)
let time_weighted_val_fl=(tbl:(*), t_col:string, y_col:string, key_col:string, stime:datetime, etime:datetime, dt:timespan)
{
let tbl_ex = tbl | extend _ts = column_ifexists(t_col, datetime(null)), _val = column_ifexists(y_col, 0.0), _key = column_ifexists(key_col, '');
let gridTimes = range _ts from stime to etime step dt | extend _val=real(null), grid=1, dummy=1;
let keys = materialize(tbl_ex | summarize by _key | extend dummy=1);
gridTimes
| join kind=fullouter keys on dummy
| project-away dummy, dummy1
| union tbl_ex
| union (tbl_ex | extend grid=0)
| where _ts between (stime..etime)
| partition hint.strategy=native by _key (
order by _ts desc, _val nulls last
@@ -64,8 +63,8 @@ let time_weighted_val_fl=(tbl:(*), t_col:string, y_col:string, key_col:string, s
| extend _twa_val=iff(dt0+dt1 == 0, _val, ((val0*dt1)+(val1*dt0))/(dt0+dt1))
| scan with ( // fill forward null twa values
step s: true => _twa_val=iff(isnull(_twa_val), s._twa_val, _twa_val);)
| where grid == 0 or (grid == 1 and _ts != prev(_ts))
)
| where grid == 1 or (keep_orig and _ts != next(_ts))
| project _ts, _key, _twa_val, orig_val=iff(grid == 1, 0, 1)
| order by _key asc, _ts asc
};
@@ -81,15 +80,15 @@ Define the stored function once using the following [`.create function`](../mana
```kusto
.create-or-alter function with (folder = "Packages\\Series", docstring = "Linear interpolation of metric value by time weighted average")
time_weighted_val_fl(tbl:(*), t_col:string, y_col:string, key_col:string, stime:datetime, etime:datetime, dt:timespan, keep_orig:bool=false)
time_weighted_val_fl(tbl:(*), t_col:string, y_col:string, key_col:string, stime:datetime, etime:datetime, dt:timespan)
{
let tbl_ex = tbl | extend _ts = column_ifexists(t_col, datetime(null)), _val = column_ifexists(y_col, 0.0), _key = column_ifexists(key_col, '');
let gridTimes = range _ts from stime to etime step dt | extend _val=real(null), grid=1, dummy=1;
let keys = materialize(tbl_ex | summarize by _key | extend dummy=1);
gridTimes
| join kind=fullouter keys on dummy
| project-away dummy, dummy1
| union tbl_ex
| union (tbl_ex | extend grid=0)
| where _ts between (stime..etime)
| partition hint.strategy=native by _key (
order by _ts desc, _val nulls last
@@ -103,8 +102,8 @@ time_weighted_val_fl(tbl:(*), t_col:string, y_col:string, key_col:string, stime:
| extend _twa_val=iff(dt0+dt1 == 0, _val, ((val0*dt1)+(val1*dt0))/(dt0+dt1))
| scan with ( // fill forward null twa values
step s: true => _twa_val=iff(isnull(_twa_val), s._twa_val, _twa_val);)
| where grid == 0 or (grid == 1 and _ts != prev(_ts))
)
| where grid == 1 or (keep_orig and _ts != next(_ts))
| project _ts, _key, _twa_val, orig_val=iff(grid == 1, 0, 1)
| order by _key asc, _ts asc
}
@@ -121,15 +120,15 @@ The following example uses the [invoke operator](../query/invoke-operator.md) to
To use a query-defined function, invoke it after the embedded function definition.

```kusto
let time_weighted_val_fl=(tbl:(*), t_col:string, y_col:string, key_col:string, stime:datetime, etime:datetime, dt:timespan, keep_orig:bool=false)
let time_weighted_val_fl=(tbl:(*), t_col:string, y_col:string, key_col:string, stime:datetime, etime:datetime, dt:timespan)
{
let tbl_ex = tbl | extend _ts = column_ifexists(t_col, datetime(null)), _val = column_ifexists(y_col, 0.0), _key = column_ifexists(key_col, '');
let gridTimes = range _ts from stime to etime step dt | extend _val=real(null), grid=1, dummy=1;
let keys = materialize(tbl_ex | summarize by _key | extend dummy=1);
gridTimes
| join kind=fullouter keys on dummy
| project-away dummy, dummy1
| union tbl_ex
| union (tbl_ex | extend grid=0)
| where _ts between (stime..etime)
| partition hint.strategy=native by _key (
order by _ts desc, _val nulls last
@@ -143,8 +142,8 @@ let time_weighted_val_fl=(tbl:(*), t_col:string, y_col:string, key_col:string, s
| extend _twa_val=iff(dt0+dt1 == 0, _val, ((val0*dt1)+(val1*dt0))/(dt0+dt1))
| scan with ( // fill forward null twa values
step s: true => _twa_val=iff(isnull(_twa_val), s._twa_val, _twa_val);)
| where grid == 0 or (grid == 1 and _ts != prev(_ts))
)
| where grid == 1 or (keep_orig and _ts != next(_ts))
| project _ts, _key, _twa_val, orig_val=iff(grid == 1, 0, 1)
| order by _key asc, _ts asc
};
@@ -162,7 +161,7 @@ let stime=toscalar(minmax | project mint);
let etime=toscalar(minmax | project maxt);
let dt = 1h;
tbl
| invoke time_weighted_val_fl('ts', 'val', 'key', stime, etime, dt, keep_orig=true)
| invoke time_weighted_val_fl('ts', 'val', 'key', stime, etime, dt)
| project-rename val = _twa_val
| order by _key asc, _ts asc
```
@@ -187,7 +186,7 @@ let stime=toscalar(minmax | project mint);
let etime=toscalar(minmax | project maxt);
let dt = 1h;
tbl
| invoke time_weighted_val_fl('ts', 'val', 'key', stime, etime, dt, keep_orig=true)
| invoke time_weighted_val_fl('ts', 'val', 'key', stime, etime, dt)
| project-rename val = _twa_val
| order by _key asc, _ts asc
```
@@ -206,4 +205,4 @@ tbl
| 2021-04-26 00:30:00.0000000 | Device2 | 400 | 1 |
| 2021-04-26 01:00:00.0000000 | Device2 | 450 | 0 |
| 2021-04-26 01:30:00.0000000 | Device2 | 500 | 1 |
| 2021-04-26 01:45:00.0000000 | Device2 | 300 | 1 |
| 2021-04-26 01:45:00.0000000 | Device2 | 300 | 1 |
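
The interpolated row in the output above (Device2 at 01:00, `orig_val` = 0) can be checked by hand against the `_twa_val` expression in the function body. The following is a minimal sketch rather than anything taken from the commit. It assumes that `val0`/`dt0` hold the value of the preceding sample and the minutes elapsed since it, and that `val1`/`dt1` hold the value of the following sample and the minutes until it; those definitions fall outside the hunks shown here. The literals are the two neighboring Device2 samples from the table, 400 at 00:30 and 500 at 01:30.

```kusto
// Hypothetical standalone check; literals come from the Device2 rows in the output table.
print val0 = 400.0, dt0 = 30.0,   // assumed: preceding sample (00:30) and minutes since it
      val1 = 500.0, dt1 = 30.0    // assumed: following sample (01:30) and minutes until it
// Same weighting as the function body: each neighbor is weighted by the distance to the other one.
| extend _twa_val = ((val0 * dt1) + (val1 * dt0)) / (dt0 + dt1)
```

With these inputs the expression is (400 × 30 + 500 × 30) / 60 = 450, which agrees with the 01:00 row. Because both gaps are 30 minutes, the result is the midpoint regardless of which neighbor is treated as preceding or following.
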
28 changes: 13 additions & 15 deletions data-explorer/kusto/management/data-ingestion/ingest-from-query.md
@@ -20,7 +20,8 @@ These commands execute a query or a management command and ingest the results of

To cancel an ingest from query command, see [`cancel operation`](../cancel-operation-command.md).

[!INCLUDE [direct-ingestion-note](../../includes/direct-ingestion-note.md)]
> [!NOTE]
> Ingest from query is a [direct ingestion](/azure/data-explorer/ingest-data-overview#direct-ingestion-with-management-commands). As such, it does not include automatic retries. Automatic retries are available when ingesting through the data management service. Use the [ingestion overview](/azure/data-explorer/ingest-data-overview) document to decide which is the most suitable ingestion option for your scenario.
## Permissions

@@ -42,18 +43,22 @@ For more information on permissions, see [Kusto role-based access control](../..

|Name|Type|Required|Description|
|--|--|--|--|
| *async* | `string` | | If specified, the command will return and continue ingestion in the background. Use the returned `OperationId` with the `.show operations` command to retrieve the ingestion completion status and results. |
| *async* | `string` | | If specified, the command returns immediately and continues ingestion in the background. Use the returned `OperationId` with the `.show operations` command to retrieve the ingestion completion status and results. |
| *tableName* | `string` | :heavy_check_mark: | The name of the table to ingest data into. The *tableName* is always related to the database in context. |
| *propertyName*, *propertyValue* | `string` | | One or more [supported ingestion properties](#supported-ingestion-properties) used to control the ingestion process. |
| *queryOrCommand* | `string` | :heavy_check_mark: | The text of a query or a management command whose results are used as data to ingest.|
| *queryOrCommand* | `string` | :heavy_check_mark: | The text of a query or a management command whose results are used as data to ingest. Only `.show` management commands are supported.|

> [!NOTE]
> Only `.show` management commands are supported.
## Performance tips

* Set the `distributed` property to `true` if the amount of data produced by the query is large, exceeds 1 GB, and doesn't require serialization. Then, multiple nodes can produce output in parallel. Don't use this flag when query results are small, since it might needlessly generate many small data shards.
* Data ingestion is a resource-intensive operation that might affect concurrent activities on the database, including running queries. Avoid running too many ingestion commands at the same time.
* Limit the data for ingestion to less than 1 GB per ingestion operation. If necessary, use multiple ingestion commands.

## Supported ingestion properties

|Property|Type|Description|
|--|--|--|
|`distributed` | `bool` | If `true`, the command ingests from all nodes executing the query in parallel. Default is `false`. See [performance tips](#performance-tips).|
|`creationTime` | `string` | The datetime value, formatted as an ISO8601 string, to use at the creation time of the ingested data extents. If unspecified, `now()` is used. When specified, make sure the `Lookback` property in the target table's effective [Extents merge policy](../merge-policy.md) is aligned with the specified value.|
|`extend_schema` | `bool` | If `true`, the command may extend the schema of the table. Default is `false`. This option applies only to `.append`, `.set-or-append`, and `set-or-replace` commands. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
|`recreate_schema` | `bool` | If `true`, the command may recreate the schema of the table. Default is `false`. This option applies only to the `.set-or-replace` command. This option takes precedence over the `extend_schema` property if both are set. This option requires at least [Table Admin](../../access-control/role-based-access-control.md) permissions.|
@@ -62,7 +67,6 @@ For more information on permissions, see [Kusto role-based access control](../..
|`policy_ingestiontime` | `bool` | If `true`, the [Ingestion Time Policy](../show-table-ingestion-time-policy-command.md) will be enabled on the table. The default is `true`.|
|`tags` | `string` | A JSON string that represents a list of [tags](../extent-tags.md) to associate with the created extent. |
|`docstring` | `string` | A description used to document the table.|
|`distributed` | `bool` | If `true`, the command ingests from all nodes executing the query in parallel. Default is `false`. See [performance tips](#performance-tips).|
|`persistDetails` |A Boolean value that, if specified, indicates that the command should persist the detailed results for retrieval by the [.show operation details](../show-operations.md) command. Defaults to `false`. |`with (persistDetails=true)`|

## Schema considerations
@@ -74,15 +78,9 @@ For more information on permissions, see [Kusto role-based access control](../..
> [!CAUTION]
> If the schema is modified, it happens in a separate transaction before the actual data ingestion. This means the schema may be modified even when there is a failure to ingest the data.
## Performance tips

* Data ingestion is a resource-intensive operation that might affect concurrent activities on the database, including running queries. Avoid running too many ingestion commands at the same time.
* Limit the data for ingestion to less than 1 GB per ingestion operation. If necessary, use multiple ingestion commands.
* Set the `distributed` flag to `true` if the amount of data being produced by the query is large, exceeds 1 GB, and doesn't require serialization. Then, multiple nodes can produce output in parallel. Don't use this flag when query results are small, since it might needlessly generate many small data shards.

## Character limitation

The command will fail if the query generates an entity name with the `$` character. The [entity names](../../query/schema-entities/entity-names.md) must comply with the naming rules, so the `$` character must be removed for the ingest command to succeed.
The command fails if the query generates an entity name with the `$` character. The [entity names](../../query/schema-entities/entity-names.md) must comply with the naming rules, so the `$` character must be removed for the ingest command to succeed.

For example, in the following query, the `search` operator generates a column `$table`. To store the query results, use [project-rename](../../query/project-rename-operator.md) to rename the column.

@@ -100,7 +98,7 @@ Create a new table called :::no-loc text="RecentErrors"::: in the database that
| where Level == "Error" and Timestamp > now() - time(1h)
```

Create a new table called "OldExtents" in the database that has a single column, "ExtentId", and holds the extent IDs of all extents in the database that has been created more than 30 days earlier. The database has an existing table named "MyExtents". Since the dataset is expected to be bigger than 1 GB (more than ~1 million rows) use the *distributed* flag
Create a new table called "OldExtents" in the database that has a single column, "ExtentId", and holds the extent IDs of all extents in the database that were created more than 30 days ago. The database has an existing table named "MyExtents". Since the dataset is expected to be bigger than 1 GB (more than ~1 million rows) use the *distributed* flag

```kusto
.set async OldExtents with(distributed=true) <|
@@ -137,7 +135,7 @@ Replace the data in the "OldExtents" table in the current database, or create th
| project ExtentId
```

Append data to the "OldExtents" table in the current database, while setting the created extent(s) creation time to a specific datetime in the past.
Append data to the "OldExtents" table in the current database, while setting the extents creation time to a specific datetime in the past.

```kusto
.append async OldExtents with(creationTime='2017-02-13T11:09:36.7992775Z') <|