workflow example
BerndDoser committed Oct 14, 2024
1 parent 7e64b1d commit 684efcd
Showing 3 changed files with 26 additions and 9 deletions.
12 changes: 12 additions & 0 deletions flyte.qmd
@@ -65,6 +65,18 @@ workflow_outputs = typing.NamedTuple(
@task
def generate_processed_corpus() -> List[List[str]]:
@task
def train_word2vec_model(training_data: List[List[str]], hyperparams: Word2VecModelHyperparams) -> model_file:
@task
def train_lda_model(corpus: List[List[str]], hyperparams: LDAModelHyperparams) -> Dict[int, List[str]]:
@task
def word_similarities(model_ser: FlyteFile[MODELSER_NLP], word: str) -> Dict[str, float]:
@task
def word_movers_distance(model_ser: FlyteFile[MODELSER_NLP]) -> float:
@workflow
def nlp_workflow(target_word: str = "computer") -> [Dict[str, float], float, Dict[int, List[str]]]:
    corpus = generate_processed_corpus()
4 changes: 4 additions & 0 deletions images/ml-workflow-example.svg
19 changes: 10 additions & 9 deletions workflows.qmd
@@ -67,19 +67,20 @@ K. L. Polsterer, B. Doser, A. Fehlner and S. Trujillo-Gomez [ADASS (2024)]().

## Requirements on Workflows Orchestration

- Define node requirements (e.g. CPU, memory, GPU)
- Define execution requirements (e.g. GPUs, CPUs, memory)
- Control runtime environment with containers
- Underlying data pipeline?
- Orchestration features
- Parallelization: Run independent tasks automatically in parallel
- Caching: Avoid recomputing successful tasks
- Nesting: Reuse workflows as tasks
- Looping: Repeat tasks based on conditions
- Scattering: Distribute data to multiple tasks
- Conditionals: Branching based on conditions

- Parallelization: Run independent tasks automatically in parallel
- Caching: Avoid recomputing successful tasks
- Nesting: Reuse workflows as tasks
- Looping: Repeat tasks based on conditions
- Scattering: Distribute data to multiple tasks
- Conditionals: Branching based on conditions
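
Four of the orchestration features listed above (scattering, parallelization, caching, and conditionals) can be illustrated with a minimal sketch in plain Python. This is not part of the commit and does not use the Flyte API; it relies only on the standard library, and the names `preprocess`, `summarize`, and `run_workflow` are hypothetical stand-ins for real tasks and workflows.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from typing import List, Tuple


@lru_cache(maxsize=None)
def preprocess(chunk: Tuple[int, ...]) -> Tuple[int, ...]:
    """Hypothetical task. Caching: a chunk seen before is not recomputed."""
    return tuple(x * 2 for x in chunk)


def summarize(chunk: Tuple[int, ...]) -> int:
    """Hypothetical task that reduces one processed chunk to a number."""
    return sum(chunk)


def run_workflow(data: List[int], n_chunks: int = 2) -> int:
    """Hypothetical workflow that wires the two tasks together."""
    # Scattering: split the input into roughly n_chunks pieces.
    # Tuples are used because lru_cache needs hashable arguments.
    size = max(1, len(data) // n_chunks)
    chunks = [tuple(data[i:i + size]) for i in range(0, len(data), size)]

    # Parallelization: the chunks are independent, so they can run concurrently.
    with ThreadPoolExecutor() as pool:
        processed = list(pool.map(preprocess, chunks))

    # Conditional: branch on an intermediate result.
    if not processed:
        return 0
    return sum(summarize(chunk) for chunk in processed)
```

Calling `run_workflow([1, 2, 3, 4])` twice computes each chunk only on the first call; the second call is served from the cache. An orchestrator offers the same ideas declaratively and across machines, whereas this sketch only runs threads inside a single process.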

## Machine Learning Workflow Example

![](images/flyte-ui_mnist-workflow.png)
![](images/ml-workflow-example.svg){fig-align="center"}


## Options to Generate a Workflow
