Support for Argo backend #19

Nilabhra · 2022-11-14T07:11:33Z

Additions:

Support for executing pipelines as Argo Workflows on Kubernetes.
Pipelines would be represented as DAGs, making pipeline representations generic.
.dockerignore file to keep unneeded files from being copied into the container. Currently, everything in the project root gets copied over, including inputs and outputs.
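
As a rough illustration of the DAG representation mentioned above, here is a minimal sketch of turning a node-to-dependencies mapping into an Argo Workflow manifest. This is not PIRlib's actual implementation: the function name, the node names, the image and the container command are all hypothetical. Only the manifest fields (apiVersion, kind, entrypoint, templates, dag.tasks, dependencies) follow the Argo Workflows schema.

```python
import json
from typing import Dict, List


def build_workflow(name: str, image: str, deps: Dict[str, List[str]]) -> dict:
    """Build an Argo Workflow manifest (as a dict) from a node -> upstream-nodes mapping."""
    # One container template per pipeline node. The command is a placeholder.
    templates = [
        {
            "name": node,
            "container": {"image": image, "command": ["python", "-m", "my_pipeline", node]},
        }
        for node in deps
    ]
    # One DAG task per node; "dependencies" encodes the edges of the graph.
    tasks = []
    for node, upstream in deps.items():
        task = {"name": node, "template": node}
        if upstream:
            task["dependencies"] = list(upstream)
        tasks.append(task)
    # The DAG template is the workflow entrypoint (assumes no node is named "pipeline-dag").
    templates.append({"name": "pipeline-dag", "dag": {"tasks": tasks}})
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"{name}-"},
        "spec": {"entrypoint": "pipeline-dag", "templates": templates},
    }


# Hypothetical three-step pipeline: clean -> train -> evaluate.
workflow = build_workflow(
    "example",
    "someuser/somerepo:latest",
    {"clean": [], "train": ["clean"], "evaluate": ["train"]},
)
print(json.dumps(workflow, indent=2))
```

Because the dependencies are plain task names, the same mapping could in principle be fed to other backends, which is the sense in which a DAG keeps the representation generic.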

Changes:

Modification of the dockerize module to support tagging and pushing a Docker image to Docker Hub. Support for a local Docker registry still needs to be added later for local deployments.
Modification of the example.py file to not accept the transaction model file name directly. The assumption now is that the user would provide the path to the directory where the pre-trained model binary is present. This had to be done as Kubernetes doesn't allow mounting of files as volumes. This modification seemed reasonable to me since most saved models use a unique directory name with a generic name for the weights binary (such as model.bin or weights.pt).
I was unable to get the shell scripts running as they were. I decided to just use python, which seems to work for me and @nikhil.
The dockerize module avoids setting the entry point to conda run -n pircli. While this works perfectly for docker-compose, it fails to work for Argo. I decided to just set the PATH env var to the location of the python binary for the pircli environment. That worked both for docker and for Argo.
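
To make the model-directory change above concrete, here is a minimal sketch of how a step could locate the weights file inside a user-supplied directory. It is only an assumption about how this might be done, not the code in example.py: the function name and the list of candidate file names are hypothetical (model.bin and weights.pt are the two examples named above).

```python
import tempfile
from pathlib import Path
from typing import Iterable, Union

# Hypothetical list of generic weight-file names to look for.
DEFAULT_WEIGHT_NAMES = ("model.bin", "weights.pt")


def find_weights(
    model_dir: Union[str, Path], names: Iterable[str] = DEFAULT_WEIGHT_NAMES
) -> Path:
    """Return the first known weights file found directly inside model_dir."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        raise NotADirectoryError(f"{model_dir} is not a directory")
    for name in names:
        candidate = model_dir / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"No weights file found in {model_dir}")


# Demo with a throwaway directory standing in for a mounted model volume.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "weights.pt").write_bytes(b"")
    found_name = find_weights(tmp).name
print(found_name)
```

The directory is what gets mounted as a volume; the file lookup happens inside the container.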

Closes:

Add Argo Workflow backend #18

Comments:

I have tested both the Argo backend and the existing Docker backend and ensured they both work with the recent changes.

Commit history:

add: subparser for argoize
add: Nikhil's changes.
feat: generating workflow name using output file name.
feat: function to create Argo template from node.
chore: added docstring.
chore: added comments.
feat: setting the env vars before generation of the argo yaml.
feat: function to generate NFS volume specs for K8s.
feat: added support for mounting NFS volumes.
fix: using absolute path for the NFS mount paths.
fix: typo.
feat: function for generating Argo templates for PIRlib Graph objects.
chore: added comments.
feat: enabled creation of DAG tasks.
chore: added Argo specific refactoring.
chore: refactored argo task names.
fix: files cannot be mounted as volumes in Argo/K8s.
chore: removed redundant module.
chore: debugging code.
chore: optimized docker image generation.
add: dockerignore file to prevent copying of unnecessary files.
fix: missing volume mount.
chore: reverted to original formatting.
chore: removed spurious code.
chore: increased black linelength to 100.
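
For readers unfamiliar with the NFS-related commits above, the following is a minimal sketch of generating a Kubernetes NFS volume spec with an absolute path, together with the matching volume mount. It is not the code from this PR: the function names, the NFS_SERVER variable name, the server address and the paths are all hypothetical. Only the field names (nfs.server, nfs.path, mountPath) come from the Kubernetes volume schema.

```python
import os
from typing import Mapping, Tuple


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Required environment variable {name!r} is not defined")
    return value


def nfs_volume(name: str, server: str, path: str, mount_path: str) -> Tuple[dict, dict]:
    """Build a Kubernetes NFS volume and the matching container volumeMount."""
    # The exported path is made absolute before it is written into the spec.
    volume = {"name": name, "nfs": {"server": server, "path": os.path.abspath(path)}}
    mount = {"name": name, "mountPath": mount_path}
    return volume, mount


# Hypothetical values; NFS_SERVER is an assumed variable name.
env = {"NFS_SERVER": "nfs.example.internal"}
volume, mount = nfs_volume(
    "inputs", require_env("NFS_SERVER", env), "example/inputs", "/mnt/inputs"
)
print(volume)
print(mount)
```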

add: `wiki_parser` example from Forte

Revert "add: `wiki_parser` example from Forte"

zhanyuanucb · 2022-11-29T06:01:57Z

Can we include the requirements to run the new Argo example?

Also, we could include the generated YAML files from the new example, as we do for the docker and local backends.

If this makes the example dir look messy, we can organize the different examples into different folders.

pirlib/backends/argo_batch.py

pirlib/cli/dockerize.py

zhanyuanucb · 2022-11-29T06:08:25Z

example/run_argo.sh

+
+### Module 1: Docker_Packaging
+python $ROOTDIR/bin/pircli dockerize \
+    --auto $ROOTDIR \


TBH, --auto $ROOTDIR is a bit confusing.
Maybe something like this would be better?

$ROOTDIR \ --auto

Nilabhra · 2022-11-29T07:33:23Z

@zhanyuanucb

Can we include the requirements to run the new Argo example?

Sure, will add that.

Also, we could include the generated YAML files from the new example, as we do for the docker and local backends.
If this makes the example dir look messy, we can organize the different examples into different folders.

The issue with this is that it would expose specific information about my development setup such as:

My DockerHub name.
The name of the DockerHub repo.
The file locations of the inputs/outputs, e.g. /home/nilabhra/pirlib/example/inputs/train_dataset.

None of this is sensitive, so I have no issue exposing it if you are fine with it.

zhanyuanucb · 2022-11-29T23:57:42Z

@Nilabhra
For that personal information, you can use something like xxx or @@@ to mask it out.

Nilabhra · 2022-11-30T06:55:55Z

@zhanyuanucb

@Nilabhra
For that personal information, you can use something like xxx or @@@ to mask it out.

Sure, just be aware that the YAML files would not work out of the box unlike the YAML file for docker-compose.
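
The masking discussed above can be sketched as a simple text substitution over the generated YAML. This is only an illustration under stated assumptions: the function name, the sample YAML and the values being masked are hypothetical, and the masked files would still need real values filled in before they can run.

```python
from typing import Mapping


def mask_values(text: str, replacements: Mapping[str, str]) -> str:
    """Replace each setup-specific value in text with its placeholder."""
    # Longer values go first so a short value that is a substring of a
    # longer one cannot break the longer match.
    for value in sorted(replacements, key=len, reverse=True):
        text = text.replace(value, replacements[value])
    return text


# Hypothetical generated YAML containing a DockerHub repo and a home directory.
yaml_text = (
    "image: someuser/somerepo:latest\n"
    "path: /home/someuser/pirlib/example/inputs\n"
)
masked = mask_values(
    yaml_text,
    {"someuser/somerepo": "xxx/xxx", "/home/someuser": "/home/xxx"},
)
print(masked)
```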

Nilabhra · 2022-12-05T09:06:49Z

@zhanyuanucb Up for another round of review.

zhanyuanucb · 2022-12-06T08:46:41Z

@Nilabhra Could you put all the examples together in one .md file so that readers can see the toy example running on different backends? Maybe later the same .md file can be extended with the wiki_parser example, presented as an advanced example.

The rest looks good to me.

Nilabhra · 2022-12-06T09:04:33Z

@zhanyuanucb

Could you put all the examples together in one .md file so that readers can see the toy example running on different backends? Maybe later the same .md file can be extended with the wiki_parser example, presented as an advanced example.

I can copy-paste some of the content of the README.rst in the docs directory. Would that be fine? Or should I remove the Examples section from the existing README.rst too?

zhanyuanucb · 2022-12-06T16:33:30Z

@Nilabhra
Ah... my bad. I meant putting everything in the README.rst for now. Ideally, we can write documentation in .md and then use rst to organize the files, but this requires more work.

Nilabhra and others added 30 commits November 10, 2022 10:47

add: subparser for argoize

47eda1c

add: Nikhil's changes.

828e6d5

feat: generating workflow name using output file name.

5c72277

feat: function to create Argo template from node.

c394308

chore: added docstring.

09f1c1c

chore: added comments.

e42223c

feat: setting the env vars before generation of the argo yaml.

35cb1be

feat: function to generate NFS volume specs for K8s.

0b35916

feat: added support for mounting NFS volumes.

c1ff95a

fix: using absolute path for the NFS mount paths.

d579f24

fix: typo.

64396a3

feat: function for generating Argo templates for PIRlib Graph objects.

1a7c0a9

chore: added comments.

b901d3b

feat: enabled creation of DAG tasks.

6052ae4

chore: added Argo specific refactoring.

fcc6768

chore: refactored argo task names.

6316c87

fix: files cannot be mounted as volumes in Argo/K8s.

8a9b5fd

chore: removed redundant module.

95fd3dd

chore: debugging code.

d339da2

chore: optimized docker image generation.

1bc1cea

add: dockerignore file to prevent copying of unnecessary files.

99912ef

fix: missing volume mount.

d552227

chore: reverted to original formatting.

a1c26c9

chore: removed spurious code.

46b3ca1

chore: increased black linelength to 100.

ee19634

fix: type annotation.

2a60ba2

add: WIP files for one step execution of the wiki_parse pipeline.

49d1000

fix: type annotation.

01d33ff

add: WIP files for one step execution of the wiki_parse pipeline.

557d82b

Nilabhra added 3 commits November 24, 2022 09:18

Merge pull request #4 from Nilabhra/forte

66dea30

add: `wiki_parser` example from Forte

Revert "add: wiki_parser example from Forte"

ce78767

Merge pull request #5 from Nilabhra/revert-4-forte

793c048

Revert "add: `wiki_parser` example from Forte"

Nilabhra requested a review from zhanyuanucb November 28, 2022 05:52

Nilabhra self-assigned this Nov 28, 2022

zhanyuanucb reviewed Nov 29, 2022

Nilabhra added 3 commits November 29, 2022 11:54

feat: raising exception if required env vars are not defined.

f584c95

chore: refactored inp_name to inp_id.

6dbb27a

chore: reformatted shell command.

2f00467

Nilabhra requested a review from zhanyuanucb November 29, 2022 12:51

Nilabhra added 6 commits December 5, 2022 09:16

chore: reverted to inclusion of setting PYTHONPATH in the image.

333b4d2

feat: prepending env of python to .

c17156a

feat: changed base image to continuumio/miniconda3.

1ad74fe

add: generated YAML files.

535fa13

chore: incoporated PR feedback.

c4510c8

fix: conflicts.

d23ca89

Nilabhra added 2 commits December 5, 2022 13:33

add: Usage instructions.

2d65765

Merge branch 'argo-multistep'

679bbf1

Nilabhra added 2 commits December 7, 2022 09:44

chore: added Argo example to the main README.rst.

bc9d923

chore: updated documentation.

99386f6

zhanyuanucb approved these changes Dec 7, 2022

zhanyuanucb merged commit 7786414 into petuum:master Dec 7, 2022
