Repo sync for protected branch #2474

Merged
merged 10 commits into from
Dec 5, 2024
41 changes: 8 additions & 33 deletions data-explorer/fluent-bit.md
@@ -3,53 +3,28 @@ title: Ingest data with Fluent Bit into Azure Data Explorer
description: Learn how to ingest (load) data into Azure Data Explorer from Fluent Bit.
ms.reviewer: ramacg
ms.topic: how-to
ms.date: 06/27/2024
ms.date: 12/02/2024
---

# Ingest data with Fluent Bit into Azure Data Explorer

[Fluent Bit](https://github.com/fluent/fluent-bit/tree/master) is an open-source agent that collects logs, metrics, and traces from various sources. It allows you to filter, modify, and aggregate event data before sending it to storage. Azure Data Explorer is a fast and highly scalable data exploration service for log and telemetry data. This article guides you through the process of using Fluent Bit to send data to Azure Data Explorer.

In this article, you'll learn how to:

> [!div class="checklist"]
>
> * [Create a table to store your logs](#create-a-table-to-store-your-logs)
> * [Register a Microsoft Entra app with permissions to ingest data](#register-a-microsoft-entra-app-with-permissions-to-ingest-data)
> * [Configure Fluent Bit to send logs to your table](#configure-fluent-bit-to-send-logs-to-your-table)
> * [Verify that data has landed in your table](#verify-that-data-has-landed-in-your-table)
[!INCLUDE [fluent-bit](includes/cross-repo/fluent-bit.md)]

For a complete list of data connectors, see [Data connectors overview](integrate-overview.md).

## Prerequisites

* [Fluent Bit](https://docs.fluentbit.io/manual/installation/getting-started-with-fluent-bit).
* An Azure Data Explorer cluster and database. [Create a cluster and database](create-cluster-and-database.md).

You can use any of the available [Query tools](integrate-query-overview.md) for your query environment.

[!INCLUDE [fluent-bit](includes/cross-repo/fluent-bit.md)]

## Configure Fluent Bit to send logs to your table

To configure Fluent Bit to send logs to your Azure Data Explorer table, create a [classic mode](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file) or [YAML mode](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/yaml/configuration-file) configuration file with the following output properties:

| Field | Description |
| --------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Name | `azure_kusto` |
| Match | A pattern to match against the tags of incoming records. It's case-sensitive and supports the star (`*`) character as a wildcard. |
| Tenant_Id | **Directory (tenant) ID** from [Register a Microsoft Entra app with permissions to ingest data](#register-a-microsoft-entra-app-with-permissions-to-ingest-data). |
| Client_Id | **Application (client) ID** from [Register a Microsoft Entra app with permissions to ingest data](#register-a-microsoft-entra-app-with-permissions-to-ingest-data). |
| Client_Secret | The client secret key value from [Register a Microsoft Entra app with permissions to ingest data](#register-a-microsoft-entra-app-with-permissions-to-ingest-data). |
| Ingestion_Endpoint | Use the **Data Ingestion URI** found in the [Azure portal](https://ms.portal.azure.com/) under your cluster overview. |
| Database_Name | The name of the database that contains your logs table. |
| Table_Name | The name of the table from [Create a table to store your logs](#create-a-table-to-store-your-logs). |
| Ingestion_Mapping_Reference | The name of the ingestion mapping from [Create a table](#create-a-table-to-store-your-logs). If you didn't create an ingestion mapping, remove the property from the configuration file. |

To see an example configuration file, select the relevant tab:
* A query environment. For more information, see [Query integrations overview](integrate-query-overview.md). <a id=ingestion-uri></a>
* Your Kusto cluster URI for the *Ingestion_endpoint* value in the format *https://ingest-\<cluster>.\<region>.kusto.windows.net*. For more information, see [Add a cluster connection](add-cluster-connection.md#add-a-cluster-connection).
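The ingestion URI format in the prerequisite above can be sketched as a small helper. This is an illustrative sketch only: the function names are hypothetical, and the character set allowed for the cluster and region parts is an assumption, not something the article specifies.

```python
import re

# Format quoted above: https://ingest-<cluster>.<region>.kusto.windows.net
# Assumption: cluster and region use only lowercase letters, digits, and hyphens.
_INGEST_URI = re.compile(
    r"^https://ingest-[a-z0-9-]+\.[a-z0-9-]+\.kusto\.windows\.net/?$"
)


def build_ingestion_endpoint(cluster: str, region: str) -> str:
    """Return the ingestion URI for a cluster name and region (hypothetical helper)."""
    return f"https://ingest-{cluster}.{region}.kusto.windows.net"


def is_ingestion_endpoint(uri: str) -> bool:
    """Check that a URI has the ingest- prefix and the expected host suffix."""
    return _INGEST_URI.match(uri) is not None


# Hypothetical cluster name and region, for illustration only.
endpoint = build_ingestion_endpoint("mycluster", "westeurope")
```

A check like this can catch the common mistake of pasting the query URI (without the `ingest-` prefix) into the *Ingestion_endpoint* value.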

[!INCLUDE [fluent-bit-2](includes/cross-repo/fluent-bit-2.md)]

<!--[!INCLUDE [fluent-bit-3](includes/cross-repo/fluent-bit-3.md)]-->

## Related content

* [Data integrations overview](integrate-data-overview.md)
* [Kusto Query Language (KQL) overview](/kusto/query/)
* [Write queries](/kusto/query/tutorials/learn-common-operators?view=azure-data-explorer&preserve-view=true)
137 changes: 115 additions & 22 deletions data-explorer/includes/cross-repo/fluent-bit-2.md
@@ -1,7 +1,94 @@
---
ms.topic: include
ms.date: 06/27/2024
ms.date: 12/03/2024
---
## Create a Microsoft Entra service principal

The Microsoft Entra service principal can be created through the [Azure portal](/azure/active-directory/develop/howto-create-service-principal-portal) or programmatically, as in the following example.

This service principal is the identity used by the connector to write data to your table in Kusto. You grant permissions for this service principal to access Kusto resources.

[!INCLUDE [entra-service-principal](../entra-service-principal.md)]

## Create a target table

Fluent Bit forwards logs in JSON format with three properties: `log` ([dynamic](/azure/data-explorer/kusto/query/scalar-data-types/dynamic)), `tag` ([string](/azure/data-explorer/kusto/query/scalar-data-types/string)), and `timestamp` ([datetime](/azure/data-explorer/kusto/query/scalar-data-types/datetime)).
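As a rough illustration of that record shape, the following sketch builds one record with the three properties and serializes it to JSON. The values are made up, and the ISO 8601 timestamp encoding is an assumption; the exact encoding Fluent Bit emits isn't shown in this article.

```python
import json
from datetime import datetime, timezone

# Hypothetical record with the three properties described above.
record = {
    # log: arbitrary JSON content, stored in a dynamic column
    "log": {"message": "service started", "level": "info"},
    # tag: the Fluent Bit tag, stored in a string column
    "tag": "kube.example",
    # timestamp: assumed ISO 8601 here, stored in a datetime column
    "timestamp": datetime(2024, 12, 3, 10, 0, 0, tzinfo=timezone.utc).isoformat(),
}

# One JSON document per record.
line = json.dumps(record)
```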

You can create a table with columns for each of these properties. Alternatively, if you have structured logs, you can create a table with log properties mapped to custom columns. To learn more, select the relevant tab.

### [Default schema](#tab/default)

To create a table for incoming logs from Fluent Bit:

1. Browse to your query environment.
1. Select the database where you'd like to create the table.
1. Run the following [`.create table` command](/azure/data-explorer/kusto/management/create-table-command):

```kusto
.create table FluentBitLogs (log:dynamic, tag:string, timestamp:datetime)
```

The incoming JSON properties are automatically mapped into the correct column.

### [Custom schema](#tab/custom)

To create a table for incoming structured logs from Fluent Bit:

1. Browse to your query environment.
1. Select the database where you'd like to create the table.
1. Run the [`.create table` command](/azure/data-explorer/kusto/management/create-table-command). For example, if your logs contain three fields named `myString`, `myInteger`, and `myDynamic`, you can create a table with the following schema:

```kusto
.create table FluentBitLogs (myString:string, myInteger:int, myDynamic: dynamic, timestamp:datetime)
```

1. Create a [JSON mapping](/azure/data-explorer/kusto/management/mappings) to map log properties to the appropriate columns. The following command creates a mapping based on the example in the previous step:

````kusto
.create-or-alter table FluentBitLogs ingestion json mapping "LogMapping"
```[
{"column" : "myString", "datatype" : "string", "Properties":{"Path":"$.log.myString"}},
{"column" : "myInteger", "datatype" : "int", "Properties":{"Path":"$.log.myInteger"}},
{"column" : "myDynamic", "datatype" : "dynamic", "Properties":{"Path":"$.log.myDynamic"}},
{"column" : "timestamp", "datatype" : "datetime", "Properties":{"Path":"$.timestamp"}}
]```
````
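To see what such a mapping does, the following sketch resolves each mapping path against a sample record and produces one row keyed by column name. It is a simplified local model, not how Kusto implements ingestion mappings: the `resolve` and `to_row` helpers and the sample record are hypothetical, only plain property access is handled (not full JSONPath), and each column is assumed to read the log property of the same name, so `myDynamic` reads `$.log.myDynamic`.

```python
# Mapping entries: each column reads the property of the same name.
mapping = [
    {"column": "myString", "datatype": "string", "Properties": {"Path": "$.log.myString"}},
    {"column": "myInteger", "datatype": "int", "Properties": {"Path": "$.log.myInteger"}},
    {"column": "myDynamic", "datatype": "dynamic", "Properties": {"Path": "$.log.myDynamic"}},
    {"column": "timestamp", "datatype": "datetime", "Properties": {"Path": "$.timestamp"}},
]


def resolve(path, record):
    """Resolve a simple dotted path such as "$.log.myString"; None if missing."""
    if path.startswith("$."):
        path = path[2:]
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_row(record, mapping):
    """Build a column-name-to-value row from a record, following the mapping paths."""
    return {m["column"]: resolve(m["Properties"]["Path"], record) for m in mapping}


# Hypothetical structured log record.
record = {
    "log": {"myString": "hello", "myInteger": 42, "myDynamic": {"a": 1}},
    "tag": "app",
    "timestamp": "2024-12-03T10:00:00Z",
}
row = to_row(record, mapping)
```

A path that points at the wrong property doesn't fail in this model. It fills the column with another property's value, or with `None` when the property is missing, so it's worth checking each path against a real record.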

---

## Grant permissions to the service principal

Grant the service principal from [Create a Microsoft Entra service principal](#create-a-microsoft-entra-service-principal) the [database ingestor](/azure/data-explorer/kusto/access-control/role-based-access-control) role so that it can ingest data into the database. For more information, see [Examples](/azure/data-explorer/kusto/management/manage-database-security-roles). In the following command, replace *DatabaseName* with the name of the target database, *ApplicationID* with the `AppId` value you saved when creating the Microsoft Entra service principal, and *TenantID* with the tenant ID of that service principal.

```kusto
.add database <DatabaseName> ingestors ('aadapp=<ApplicationID>;<TenantID>')
```
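If you script the setup, the command can be assembled from the three values. The helper below is a hypothetical convenience and only formats the string; it doesn't validate the IDs or run the command, and the sample values are placeholders.

```python
def add_ingestor_command(database: str, application_id: str, tenant_id: str) -> str:
    """Format the .add database ... ingestors command shown above."""
    return (
        f".add database {database} ingestors "
        f"('aadapp={application_id};{tenant_id}')"
    )


# Placeholder values, for illustration only.
command = add_ingestor_command("MyDatabase", "my-app-id", "my-tenant-id")
```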

## Configure Fluent Bit to send logs to your table

To configure Fluent Bit to send logs to your table in Kusto, create a [classic mode](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file) or [YAML mode](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/yaml/configuration-file) configuration file with the following output properties:

| Field | Description | Required | Default |
|--|--|--|--|
| Name | The pipeline name. | | `azure_kusto`|
| tenant_id | The tenant ID from [Create a Microsoft Entra service principal](#create-a-microsoft-entra-service-principal). | :heavy_check_mark: | |
| client_id | The application ID from [Create a Microsoft Entra service principal](#create-a-microsoft-entra-service-principal). | :heavy_check_mark: | |
| client_secret | The client secret key value (password) from [Create a Microsoft Entra service principal](#create-a-microsoft-entra-service-principal). | :heavy_check_mark: | |
| ingestion_endpoint | Enter the value as described for [Ingestion_Endpoint](#ingestion-uri). | :heavy_check_mark: | |
| database_name | The name of the database that contains your logs table. | :heavy_check_mark: | |
| table_name | The name of the table from [Create a target table](#create-a-target-table). | :heavy_check_mark: | |
| ingestion_mapping_reference | The name of the ingestion mapping from [Create a target table](#create-a-target-table). If you didn't create an ingestion mapping, remove the property from the configuration file. | | |
| log_key | Key name of the log content. For instance, `log`. | | `log` |
| tag_key | The key name of the tag. Ignored if `include_tag_key` is false. | | `tag` |
| include_time_key | If enabled, a timestamp is appended to the output. Uses the `time_key` property. | | `true` |
| time_key | The key name for the timestamp in the log records. Ignored if `include_time_key` is false. | | `timestamp` |
| ingestion_endpoint_connect_timeout | The connection timeout of various Kusto endpoints in seconds. | | `60s` |
| compression_enabled | Sends compressed HTTP payload (gzip) to Kusto, if enabled. | | `true` |
| ingestion_resources_refresh_interval | The ingestion resources refresh interval of Kusto endpoint in seconds. | | `3600` |
| workers | The number of [workers](https://docs.fluentbit.io/manual/administration/multithreading#outputs) to perform flush operations for this output. | | `0` |
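The properties in the table can be rendered into a classic-mode `[OUTPUT]` section with a few lines of code. This is a minimal sketch, assuming a hypothetical `render_output_section` helper and four-space indentation; it omits any property whose value is unset, which covers the case where you didn't create an ingestion mapping.

```python
def render_output_section(settings: dict) -> str:
    """Render a classic-mode [OUTPUT] section, skipping unset (None or empty) values."""
    lines = ["[OUTPUT]"]
    for key, value in settings.items():
        if value is None or value == "":
            continue  # leave the property out of the file entirely
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"    {key} {value}")
    return "\n".join(lines)


settings = {
    "name": "azure_kusto",
    "match": "*",
    "tenant_id": "<TenantId>",
    "client_id": "<ClientId>",
    "client_secret": "<AppSecret>",
    "ingestion_endpoint": "<IngestionEndpoint>",
    "database_name": "<DatabaseName>",
    "table_name": "<TableName>",
    # No ingestion mapping was created, so the property is omitted.
    "ingestion_mapping_reference": None,
    "compression_enabled": True,
}

output_section = render_output_section(settings)
```

Keeping the secret out of the generated file, for example by substituting it from an environment variable at deployment time, is a common design choice, but that is outside what this sketch shows.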

To see an example configuration file, select the relevant tab:

### [Classic mode](#tab/classic)

```txt
@@ -23,14 +110,18 @@ ms.date: 06/27/2024
Refresh_Interval 10

[OUTPUT]
Name azure_kusto
Match *
Tenant_Id azure-tenant-id
Client_Id azure-client-id
Client_Secret azure-client-secret
Ingestion_Endpoint azure-data-explorer-ingestion-endpoint
Database_Name azure-data-explorer-database-name
Table_Name azure-data-explorer-table-name
match *
name azure_kusto
tenant_id <TenantId>
client_id <ClientId>
client_secret <AppSecret>
ingestion_endpoint <IngestionEndpoint>
database_name <DatabaseName>
table_name <TableName>
ingestion_mapping_reference <MappingName>
ingestion_endpoint_connect_timeout <IngestionEndpointConnectTimeout>
compression_enabled <CompressionEnabled>
ingestion_resources_refresh_interval <IngestionResourcesRefreshInterval>
```

### [YAML mode](#tab/yaml)
@@ -69,23 +160,25 @@ config:

outputs: |
[OUTPUT]
Name azure_kusto
Match *
Tenant_Id azure-tenant-id
Client_Id azure-client-id
Client_Secret azure-client-secret
Ingestion_Endpoint azure-data-explorer-ingestion-endpoint
Database_Name azure-data-explorer-database-name
Table_Name azure-data-explorer-table-name
match *
name azure_kusto
tenant_id <TenantId>
client_id <ClientId>
client_secret <AppSecret>
ingestion_endpoint <IngestionEndpoint>
database_name <DatabaseName>
table_name <TableName>
ingestion_mapping_reference <MappingName>
ingestion_endpoint_connect_timeout <IngestionEndpointConnectTimeout>
compression_enabled <CompressionEnabled>
ingestion_resources_refresh_interval <IngestionResourcesRefreshInterval>
```

---

## Verify that data has landed in your table
## Confirm data ingestion

Once the configuration is complete, logs should arrive in your table.

1. To verify that logs are ingested, run the following query:
1. After data arrives in the table, confirm the transfer by checking the row count:

```Kusto
FluentBitLogs
@@ -97,4 +190,4 @@ Once the configuration is complete, logs should arrive in your table.
```Kusto
FluentBitLogs
| take 100
```
```
106 changes: 106 additions & 0 deletions data-explorer/includes/cross-repo/fluent-bit-3.md
@@ -0,0 +1,106 @@
---
ms.topic: include
ms.date: 12/01/2024
---
### [Classic mode](#tab/classic)

```txt
[SERVICE]
Daemon Off
Flush 1
Log_Level trace
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
Health_Check On

[INPUT]
Name tail
Path /var/log/containers/*.log
Tag kube.*
Mem_Buf_Limit 1MB
Skip_Long_Lines On
Refresh_Interval 10

[OUTPUT]
match *
name azure_kusto
tenant_id <app_tenant_id>
client_id <app_client_id>
client_secret <app_secret>
ingestion_endpoint <ingestion_endpoint>
database_name <database_name>
table_name <table_name>
ingestion_mapping_reference <mapping_name>
ingestion_endpoint_connect_timeout <ingestion_endpoint_connect_timeout>
compression_enabled <compression_enabled>
ingestion_resources_refresh_interval <ingestion_resources_refresh_interval>
```

### [YAML mode](#tab/yaml)

```yaml
config:
service: |
[SERVICE]
Daemon Off
Flush 1
Log_Level trace
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
Health_Check On

inputs: |
[INPUT]
Name tail
Path /var/log/containers/*.log
multiline.parser docker, cri
Tag kube.*
Mem_Buf_Limit 1MB
Skip_Long_Lines On
Refresh_Interval 10

filters: |
[FILTER]
Name kubernetes
Match kube.*
Merge_Log On
Merge_Log_key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude Off


outputs: |
[OUTPUT]
match *
name azure_kusto
tenant_id <app_tenant_id>
client_id <app_client_id>
client_secret <app_secret>
ingestion_endpoint <ingestion_endpoint>
database_name <database_name>
table_name <table_name>
ingestion_mapping_reference <mapping_name>
ingestion_endpoint_connect_timeout <ingestion_endpoint_connect_timeout>
compression_enabled <compression_enabled>
ingestion_resources_refresh_interval <ingestion_resources_refresh_interval>
```

---

## Confirm data ingestion

1. After data arrives in the table, confirm the transfer by checking the row count:

```Kusto
FluentBitLogs
| count
```

1. To view a sample of log data, run the following query:

```Kusto
FluentBitLogs
| take 100
```