Commit
GitHub Action Website Snapshot committed Nov 6, 2024
1 parent ea723d2, commit 3038e13
Showing 240 changed files with 13,473 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Client Libraries", | ||
"position": 4 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Java", | ||
"position": 1 | ||
} |
118 changes: 118 additions & 0 deletions
versioned_docs/version-1.24.1/client/java/configuration.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
--- | ||
sidebar_position: 2 | ||
title: Configuration | ||
--- | ||
|
||
We recommend configuring the client with an `openlineage.yml` file that contains all the | ||
details of how to connect to your OpenLineage backend. | ||
|
||
See [example configurations](#transports). | ||
|
||
You can make this file available to the client in three ways, listed here in order of precedence: | ||
|
||
1. Set an `OPENLINEAGE_CONFIG` environment variable to a file path: `OPENLINEAGE_CONFIG=path/to/openlineage.yml`. | ||
2. Place an `openlineage.yml` in the user's current working directory. | ||
3. Place an `openlineage.yml` under `.openlineage/` in the user's home directory (`~/.openlineage/openlineage.yml`). | ||
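
The lookup order above can be sketched in plain Java. This is only an illustration of the documented precedence, not the client's actual code: the `ConfigLocator` class, its `locate` method, and the choice to fall through when `OPENLINEAGE_CONFIG` points at a missing file are all assumptions made for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper that mirrors the documented lookup order. */
final class ConfigLocator {

  /** Returns the first candidate accepted by {@code exists}, in precedence order. */
  static Optional<Path> locate(
      Map<String, String> env, Path workingDir, Path homeDir, Predicate<Path> exists) {
    // 1. Explicit path from the OPENLINEAGE_CONFIG environment variable.
    String fromEnv = env.get("OPENLINEAGE_CONFIG");
    if (fromEnv != null && !fromEnv.isEmpty()) {
      Path candidate = Paths.get(fromEnv);
      if (exists.test(candidate)) {
        return Optional.of(candidate);
      }
      // Assumption: a missing file falls through to the next location.
    }
    // 2. openlineage.yml in the current working directory.
    Path inWorkingDir = workingDir.resolve("openlineage.yml");
    if (exists.test(inWorkingDir)) {
      return Optional.of(inWorkingDir);
    }
    // 3. ~/.openlineage/openlineage.yml in the user's home directory.
    Path inHome = homeDir.resolve(".openlineage").resolve("openlineage.yml");
    if (exists.test(inHome)) {
      return Optional.of(inHome);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Optional<Path> found =
        locate(
            System.getenv(),
            Paths.get(System.getProperty("user.dir")),
            Paths.get(System.getProperty("user.home")),
            Files::isRegularFile);
    System.out.println(found.map(Path::toString).orElse("no openlineage.yml found"));
  }
}
```

Passing the existence check in as a `Predicate` keeps the precedence logic testable without touching the file system.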
|
||
## Environment Variables | ||
|
||
The following environment variables are available: | ||
|
||
| Name | Description | Since | | ||
|----------------------|-----------------------------------------------------------------------------|-------| | ||
| OPENLINEAGE_CONFIG | The path to the YAML configuration file. Example: `path/to/openlineage.yml` | | | ||
| OPENLINEAGE_DISABLED | When `true`, OpenLineage will not emit events. | 0.9.0 | | ||
|
||
You can also configure the client with dynamic environment variables. | ||
|
||
import DynamicEnvVars from './partials/java_dynamic_env_vars.md'; | ||
|
||
<DynamicEnvVars/> | ||
|
||
## Facets Configuration | ||
|
||
In the YAML configuration file, you can also disable facets to filter them out of OpenLineage events. | ||
|
||
*YAML Configuration* | ||
|
||
```yaml | ||
transport: | ||
type: console | ||
facets: | ||
spark_unknown: | ||
disabled: true | ||
spark: | ||
logicalPlan: | ||
disabled: true | ||
``` | ||
### Deprecated syntax | ||
The following syntax is deprecated and will be removed soon: | ||
```yaml | ||
transport: | ||
type: console | ||
facets: | ||
disabled: | ||
- spark_unknown | ||
- spark.logicalPlan | ||
``` | ||
This syntax is deprecated because some integrations disable certain facets by default. When the list was used to disable | ||
an additional facet without repeating those defaults, the default-disabled facets were unintentionally enabled. | ||
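
The difference between the two styles can be illustrated with a small, self-contained Java sketch. It assumes that a user-supplied list replaces the integration's default list wholesale, while per-facet flags are merged key by key. The `FacetDisableDemo` class and the `my_custom_facet` name are hypothetical, and this is not the client's actual merge logic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the two configuration styles. */
final class FacetDisableDemo {

  /** Deprecated style (assumed semantics): the user's list replaces the defaults. */
  static Set<String> resolveList(List<String> defaults, List<String> userList) {
    return new TreeSet<>(userList != null ? userList : defaults);
  }

  /** Current style (assumed semantics): per-facet flags are merged over the defaults. */
  static Set<String> resolveFlags(Map<String, Boolean> defaults, Map<String, Boolean> userFlags) {
    Map<String, Boolean> merged = new HashMap<>(defaults);
    merged.putAll(userFlags);
    Set<String> disabled = new TreeSet<>();
    for (Map.Entry<String, Boolean> entry : merged.entrySet()) {
      if (entry.getValue()) {
        disabled.add(entry.getKey());
      }
    }
    return disabled;
  }

  public static void main(String[] args) {
    // Assumed integration defaults, taken from the facet names used above.
    List<String> defaultList = Arrays.asList("spark_unknown", "spark.logicalPlan");
    Map<String, Boolean> defaultFlags = new HashMap<>();
    defaultFlags.put("spark_unknown", true);
    defaultFlags.put("spark.logicalPlan", true);

    // The user only wants to disable one more (hypothetical) facet.
    Map<String, Boolean> userFlags = new HashMap<>();
    userFlags.put("my_custom_facet", true);

    System.out.println("list style:  " + resolveList(defaultList, Arrays.asList("my_custom_facet")));
    System.out.println("flags style: " + resolveFlags(defaultFlags, userFlags));
  }
}
```

Under these assumptions, the list style loses the two default entries, whereas the per-facet style keeps them disabled alongside the new one.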
## Transports | ||
import Transports from './partials/java_transport.md'; | ||
<Transports/> | ||
### Error Handling via Transport | ||
```java | ||
// Connect to http://localhost:5000 | ||
OpenLineageClient client = OpenLineageClient.builder() | ||
.transport( | ||
HttpTransport.builder() | ||
.uri("http://localhost:5000") | ||
.apiKey("f38d2189-c603-4b46-bdea-e573a3b5a7d5") | ||
.build()) | ||
.registerErrorHandler(new EmitErrorHandler() { | ||
@Override | ||
public void handleError(Throwable throwable) { | ||
// Handle emit error here | ||
} | ||
}).build(); | ||
``` | ||
|
||
### Defining Your Own Transport | ||
|
||
```java | ||
OpenLineageClient client = OpenLineageClient.builder() | ||
.transport( | ||
new MyTransport() { | ||
@Override | ||
public void emit(OpenLineage.RunEvent runEvent) { | ||
// Add emit logic here | ||
} | ||
}).build(); | ||
``` | ||
|
||
## Circuit Breakers | ||
|
||
import CircuitBreakers from './partials/java_circuit_breaker.md'; | ||
|
||
<CircuitBreakers/> | ||
|
||
## Metrics | ||
|
||
import Metrics from './partials/java_metrics.md'; | ||
|
||
<Metrics/> | ||
|
||
## Dataset Namespace Resolver | ||
|
||
import DatasetNamespaceResolver from './partials/java_namespace_resolver.md'; | ||
|
||
<DatasetNamespaceResolver/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
--- | ||
sidebar_position: 5 | ||
--- | ||
|
||
# Java | ||
|
||
## Overview | ||
|
||
The OpenLineage Java client is an SDK for the Java programming language that you can use to generate and emit OpenLineage events to OpenLineage backends. | ||
The core data structures currently offered by the client are the `RunEvent`, `RunState`, `Run`, `Job`, `Dataset`, | ||
and `Transport` classes, along with various `Facets` that can be attached to runs, jobs, and datasets. | ||
|
||
The library provides several [transport classes](#transports) that deliver lineage events to target endpoints (e.g., HTTP). | ||
|
||
You can also use the Java client to create your own custom integrations. | ||
|
||
## Installation | ||
|
||
The Java client is provided as a library that can be imported into your Java project using either Maven or Gradle. | ||
|
||
Maven: | ||
|
||
```xml | ||
<dependency> | ||
<groupId>io.openlineage</groupId> | ||
<artifactId>openlineage-java</artifactId> | ||
<version>${OPENLINEAGE_VERSION}</version> | ||
</dependency> | ||
``` | ||
|
||
or Gradle: | ||
|
||
```groovy | ||
implementation("io.openlineage:openlineage-java:${OPENLINEAGE_VERSION}") | ||
``` | ||
|
||
For more information on the available versions of `openlineage-java`, | ||
please refer to the [Maven repository](https://search.maven.org/artifact/io.openlineage/openlineage-java). | ||
|
107 changes: 107 additions & 0 deletions
versioned_docs/version-1.24.1/client/java/partials/java_circuit_breaker.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
import Tabs from '@theme/Tabs'; | ||
import TabItem from '@theme/TabItem'; | ||
|
||
:::info | ||
This feature is available in OpenLineage versions >= 1.9.0. | ||
::: | ||
|
||
To prevent over-instrumentation, the OpenLineage integration provides a circuit breaker mechanism | ||
that stops OpenLineage from creating, serializing, and sending OpenLineage events. | ||
|
||
### Simple Memory Circuit Breaker | ||
|
||
A simple circuit breaker that works based only on the free memory within the JVM. The configuration should | ||
contain a free memory threshold (as a percentage). The default value is `20%`. The circuit breaker | ||
closes on the first call if free memory is low. The `circuitCheckIntervalInMillis` parameter | ||
configures how often the circuit breaker is checked. It defaults to `1000ms` when there is no entry in the config. | ||
`timeoutInSeconds` is optional. If set, OpenLineage code execution is terminated when the timeout | ||
is reached (added in version 1.13). | ||
|
||
<Tabs groupId="integrations"> | ||
<TabItem value="yaml" label="Yaml Config"> | ||
|
||
```yaml | ||
circuitBreaker: | ||
type: simpleMemory | ||
memoryThreshold: 20 | ||
circuitCheckIntervalInMillis: 1000 | ||
timeoutInSeconds: 90 | ||
``` | ||
</TabItem> | ||
<TabItem value="spark" label="Spark Config"> | ||
| Parameter | Definition | Example | | ||
--------------------------------------|----------------------------------------------------------------|-------------- | ||
| spark.openlineage.circuitBreaker.type | Circuit breaker type selected | simpleMemory | | ||
| spark.openlineage.circuitBreaker.memoryThreshold | Memory threshold | 20 | | ||
| spark.openlineage.circuitBreaker.circuitCheckIntervalInMillis | Frequency of checking circuit breaker | 1000 | | ||
| spark.openlineage.circuitBreaker.timeoutInSeconds | Optional timeout for OpenLineage execution (Since version 1.13)| 90 | | ||
</TabItem> | ||
<TabItem value="flink" label="Flink Config"> | ||
| Parameter | Definition | Example | | ||
--------------------------------------|---------------------------------------------|------------- | ||
| openlineage.circuitBreaker.type | Circuit breaker type selected | simpleMemory | | ||
| openlineage.circuitBreaker.memoryThreshold | Memory threshold | 20 | | ||
| openlineage.circuitBreaker.circuitCheckIntervalInMillis | Frequency of checking circuit breaker | 1000 | | ||
| openlineage.circuitBreaker.timeoutInSeconds | Optional timeout for OpenLineage execution (Since version 1.13) | 90 | | ||
</TabItem> | ||
</Tabs> | ||
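
The free-memory check described for the simple memory circuit breaker can be sketched with the standard `Runtime` API. This is a minimal illustration under stated assumptions, not the library's implementation: the `SimpleMemoryBreakerSketch` class is hypothetical, and the formula for the free-memory percentage (unused heap plus heap not yet allocated, relative to the maximum heap) is an assumption.

```java
/** Hypothetical sketch of a free-memory-based circuit breaker. */
final class SimpleMemoryBreakerSketch {

  private final int memoryThresholdPercent;

  SimpleMemoryBreakerSketch(int memoryThresholdPercent) {
    this.memoryThresholdPercent = memoryThresholdPercent;
  }

  /** Assumed formula: memory still available to the JVM as a share of the maximum heap. */
  static double freeMemoryPercent(long freeBytes, long totalBytes, long maxBytes) {
    long available = freeBytes + (maxBytes - totalBytes);
    return 100.0 * available / maxBytes;
  }

  /** "Closed" follows the wording above: OpenLineage stops emitting events. */
  boolean isClosed(long freeBytes, long totalBytes, long maxBytes) {
    return freeMemoryPercent(freeBytes, totalBytes, maxBytes) < memoryThresholdPercent;
  }

  /** Convenience overload that samples the running JVM. */
  boolean isClosed() {
    Runtime runtime = Runtime.getRuntime();
    return isClosed(runtime.freeMemory(), runtime.totalMemory(), runtime.maxMemory());
  }

  public static void main(String[] args) {
    SimpleMemoryBreakerSketch breaker = new SimpleMemoryBreakerSketch(20);
    System.out.println("circuit breaker closed: " + breaker.isClosed());
  }
}
```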
### Java Runtime Circuit Breaker | ||
A more complex circuit breaker. The amount of free memory can be low as long as | ||
the amount of time spent on garbage collection is acceptable. `JavaRuntimeCircuitBreaker` closes | ||
when free memory drops below the threshold and the time spent on garbage collection exceeds | ||
the given threshold (`10%` by default). The circuit breaker is always open when checked for the first time, | ||
because the share of time spent on GC is measured since the previous circuit breaker call. | ||
The `circuitCheckIntervalInMillis` parameter | ||
configures how often the circuit breaker is checked. | ||
It defaults to `1000ms` when there is no entry in the config. | ||
`timeoutInSeconds` is optional. If set, OpenLineage code execution is terminated when the timeout | ||
is reached (added in version 1.13). | ||
|
||
<Tabs groupId="integrations"> | ||
<TabItem value="yaml" label="Yaml Config"> | ||
|
||
```yaml | ||
circuitBreaker: | ||
type: javaRuntime | ||
memoryThreshold: 20 | ||
gcCpuThreshold: 10 | ||
circuitCheckIntervalInMillis: 1000 | ||
timeoutInSeconds: 90 | ||
``` | ||
</TabItem> | ||
<TabItem value="spark" label="Spark Config"> | ||
|
||
| Parameter | Definition | Example | | ||
--------------------------------------|---------------------------------------|------------- | ||
| spark.openlineage.circuitBreaker.type | Circuit breaker type selected | javaRuntime | | ||
| spark.openlineage.circuitBreaker.memoryThreshold | Memory threshold | 20 | | ||
| spark.openlineage.circuitBreaker.gcCpuThreshold | Garbage Collection CPU threshold | 10 | | ||
| spark.openlineage.circuitBreaker.circuitCheckIntervalInMillis | Frequency of checking circuit breaker | 1000 | | ||
| spark.openlineage.circuitBreaker.timeoutInSeconds | Optional timeout for OpenLineage execution (Since version 1.13)| 90 | | ||
|
||
|
||
</TabItem> | ||
<TabItem value="flink" label="Flink Config"> | ||
|
||
| Parameter | Definition | Example | | ||
--------------------------------------|---------------------------------------|------------- | ||
| openlineage.circuitBreaker.type | Circuit breaker type selected | javaRuntime | | ||
| openlineage.circuitBreaker.memoryThreshold | Memory threshold | 20 | | ||
| openlineage.circuitBreaker.gcCpuThreshold | Garbage Collection CPU threshold | 10 | | ||
| openlineage.circuitBreaker.circuitCheckIntervalInMillis | Frequency of checking circuit breaker | 1000 | | ||
| openlineage.circuitBreaker.timeoutInSeconds | Optional timeout for OpenLineage execution (Since version 1.13) | 90 | | ||
|
||
|
||
</TabItem> | ||
</Tabs> | ||
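
The combined memory and GC check can be sketched with the standard `java.lang.management` API. This is a rough illustration, not the real `JavaRuntimeCircuitBreaker`: the `GcAwareBreakerSketch` class is hypothetical, and measuring GC share as accumulated collection time divided by wall-clock time since the previous check is an assumption.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical sketch of a circuit breaker that considers both memory and GC time. */
final class GcAwareBreakerSketch {

  private final int memoryThresholdPercent;
  private final int gcCpuThresholdPercent;
  private long lastCheckMillis = -1; // -1 marks "never checked"
  private long lastGcMillis;

  GcAwareBreakerSketch(int memoryThresholdPercent, int gcCpuThresholdPercent) {
    this.memoryThresholdPercent = memoryThresholdPercent;
    this.gcCpuThresholdPercent = gcCpuThresholdPercent;
  }

  /** Closed only when free memory is low AND the GC share since the last check is high. */
  boolean isClosed(long nowMillis, long totalGcMillis, double freeMemoryPercent) {
    if (lastCheckMillis < 0) {
      // First check: there is no previous sample, so the breaker stays open.
      lastCheckMillis = nowMillis;
      lastGcMillis = totalGcMillis;
      return false;
    }
    long elapsed = nowMillis - lastCheckMillis;
    long gcDelta = totalGcMillis - lastGcMillis;
    lastCheckMillis = nowMillis;
    lastGcMillis = totalGcMillis;
    double gcPercent = elapsed > 0 ? 100.0 * gcDelta / elapsed : 0.0;
    return freeMemoryPercent < memoryThresholdPercent && gcPercent > gcCpuThresholdPercent;
  }

  /** Convenience overload that samples the running JVM. */
  boolean isClosed() {
    long gcMillis = 0;
    for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
      long time = bean.getCollectionTime(); // -1 when the collector does not report it
      if (time > 0) {
        gcMillis += time;
      }
    }
    Runtime runtime = Runtime.getRuntime();
    long available = runtime.freeMemory() + (runtime.maxMemory() - runtime.totalMemory());
    double freePercent = 100.0 * available / runtime.maxMemory();
    return isClosed(System.currentTimeMillis(), gcMillis, freePercent);
  }

  public static void main(String[] args) {
    GcAwareBreakerSketch breaker = new GcAwareBreakerSketch(20, 10);
    System.out.println("first check closed:  " + breaker.isClosed());
    System.out.println("second check closed: " + breaker.isClosed());
  }
}
```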
|
||
### Custom Circuit Breaker | ||
|
||
The list of available circuit breakers can be extended with a custom one, loaded via `ServiceLoader`, | ||
by providing your own implementation of `io.openlineage.client.circuitBreaker.CircuitBreakerBuilder`. |
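
The `ServiceLoader` discovery mechanism itself can be sketched with a stand-in interface. `HypotheticalBreakerBuilder` and its `type()` method are placeholders invented for this example and are not the signature of the real `CircuitBreakerBuilder`; consult the library source for the actual methods to implement.

```java
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical sketch of looking up a custom builder discovered by ServiceLoader. */
final class BreakerDiscoverySketch {

  /** Stand-in for the real builder interface; its single method is an assumption. */
  interface HypotheticalBreakerBuilder {
    String type();
  }

  /** Returns the first builder whose type matches the configured `type` value. */
  static Optional<HypotheticalBreakerBuilder> find(
      Iterable<HypotheticalBreakerBuilder> builders, String type) {
    for (HypotheticalBreakerBuilder builder : builders) {
      if (builder.type().equals(type)) {
        return Optional.of(builder);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // ServiceLoader finds providers listed in a META-INF/services/<interface binary name>
    // file on the classpath; none is registered here, so nothing is found.
    ServiceLoader<HypotheticalBreakerBuilder> loader =
        ServiceLoader.load(HypotheticalBreakerBuilder.class);
    System.out.println(
        find(loader, "myBreaker").isPresent() ? "found" : "no provider registered");
  }
}
```

For a real provider, the implementing class name goes into a `META-INF/services` file named after the fully qualified interface, packaged in the JAR that is added to the classpath.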