CMF fluent API + Ray runner [collecting feedback] #68
Introduction
This PR implements one possible version of what a CMF fluent API can look like. It tries to achieve the following goals:
[...] `log_dataset`).
Example
Assuming a user has developed four functions (`fetch`, `preprocess`, `train` and `test`), the following is an example of the CMF fluent API:
API methods
Fluent API methods are categorized into three buckets:
- `set_cmf_parameters`. The parameters set here control CMF initialization and do not include information about pipelines, steps and executions.
- `start_step` and `end_step`. These methods start a new pipeline step and end currently active pipeline steps. The `start_step` method returns an instance of the `Step` class that can be used as a Python context manager to automatically end steps.
- `log_dataset`, `log_dataset_with_version`, `log_model`, `log_execution_metrics`, `log_metric` and `log_validation_output`. These methods log input/output artifacts. When these methods accept an artifact URL, users can provide a string or a `Path` object, e.g.:
Ray runner
This PR also contains an example of how CMF pipelines can run on Ray clusters. This is possible because the fluent API can initialize CMF from environment variables: