This repo explains how to integrate a federated learning framework into UniFed as a CoLink protocol. You can follow the steps below to add new frameworks under UniFed.
```bash
git clone git@github.com:CoLearn-Dev/colink-unifed-example.git
```
You should at least update the following 3 places:
- `setup.py`: Update the value of `FRAMEWORK_NAME`. All letters should be lowercase.
- Under `./src/unifed/frameworks/`: Rename the folder `example_framework` into `FRAMEWORK_NAME`.
- `colink.toml`:
  - Replace `unifed-example` with `unifed-<FRAMEWORK_NAME>`.
  - Replace all `colink-protocol-unifed-example` with `colink-protocol-unifed-<FRAMEWORK_NAME>`.
  - (Optional) Update the value of `description` to briefly describe your framework.
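If it helps, the renames above can also be scripted. The sketch below is not part of the repo and assumes a framework name of `myframework`; it does not set the `FRAMEWORK_NAME` value inside `setup.py` for you, so review the resulting diff before committing.

```python
# Hypothetical helper (not part of this repo): apply the renames described above
# for an assumed framework name. Review the resulting diff before committing.
from pathlib import Path

FRAMEWORK_NAME = "myframework"  # all letters lowercase

# Rename the example package folder under ./src/unifed/frameworks/.
pkg_dir = Path("src/unifed/frameworks/example_framework")
if pkg_dir.exists():
    pkg_dir.rename(pkg_dir.with_name(FRAMEWORK_NAME))

# Replace the example identifiers in colink.toml and setup.py.
# Note: you may still need to set FRAMEWORK_NAME in setup.py by hand if its
# current value is not covered by these plain string replacements.
for path in (Path("colink.toml"), Path("setup.py")):
    text = path.read_text()
    text = text.replace("colink-protocol-unifed-example",
                        f"colink-protocol-unifed-{FRAMEWORK_NAME}")
    text = text.replace("unifed-example", f"unifed-{FRAMEWORK_NAME}")
    path.write_text(text)
```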
Make sure you have Python 3.7+ installed.
```bash
pip install -e .
```
- You should look into the file `./src/unifed/frameworks/<FRAMEWORK_NAME>/protocol.py`. It's recommended that you first look through the code and comments to get a general idea of how it works. `workload_sim.py` provides an expectation of what the external workload should look like. You can use it as a reference to compare with the framework that you are working with.
- To try out the workload, in the root directory of the repo, run the following command (a mock workload sketch with the same calling convention is shown after this list):
  ```bash
  unifed-example-workload client 1 ./log/1o.txt ./log/1l.txt
  ```
- You should also package the external workload and specify its installation command in `colink.toml` under `install_script`.
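For reference, here is a minimal mock workload in the spirit of the command above. It is only a sketch, assuming the arguments are a role, a participant index, an output file, and a log file; confirm the actual argument meanings and file formats against `workload_sim.py`.

```python
# Hypothetical mock workload (a sketch only; workload_sim.py defines the real
# expectations). Assumed CLI shape: <role> <participant_index> <output> <log>.
import json
import sys


def main() -> None:
    role, participant_index, output_path, log_path = sys.argv[1:5]

    # Pretend to run one round of training/evaluation for this participant.
    result = {"role": role, "participant": int(participant_index), "loss": 0.0}

    # Write the output the protocol is expected to collect.
    with open(output_path, "w") as f:
        json.dump(result, f)

    # Write a human-readable log.
    with open(log_path, "w") as f:
        f.write(f"{role} {participant_index}: mock workload finished\n")


if __name__ == "__main__":
    main()
```

Invoked as `python mock_workload.py client 1 ./log/1o.txt ./log/1l.txt`, it only mimics the calling convention; the real workload should run your framework's training code.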
- The first step is to write a test configuration. You can look into `./test/configs/case_0.json` for an example. Note that the case you construct should mainly serve the purpose of correctness testing (e.g. 1~2 epochs with a small model is usually sufficient), so that the correctness testing can be reproduced from a single host.
- Next, read `./test/test_all_config.py` to understand how to run the test.
- To assert that no error occurs when running certain configuration cases, in the root directory of the repo, run:
  ```bash
  pytest  # note that you need to install pytest for this, via `pip install pytest`
  ```
- To check the output of running certain cases, change the case string `target_case = "test/configs/case_0.json"` in `test_all_config.py` (note that this only works when you install with the `-e` flag), then, in the root directory of the repo, run:
  ```bash
  python test/test_all_config.py
  ```
- You can mark the cases that are under development by adding `skip` to the name of the config (e.g. `skip_case_x.json`), so that pytest will skip those. You can still check the output of those cases by running `python test/test_all_config.py` directly.
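To illustrate the `skip` naming convention, the snippet below shows one way a runner could partition the config files; the actual selection logic is whatever `test_all_config.py` implements.

```python
# Hypothetical illustration of the skip-by-name convention; the real selection
# logic lives in test/test_all_config.py and may differ.
from pathlib import Path

all_configs = sorted(Path("test/configs").glob("*.json"))

# Configs whose file name starts with "skip" are under development and are
# excluded from the pytest run; everything else gets tested.
active = [p for p in all_configs if not p.name.startswith("skip")]
skipped = [p for p in all_configs if p.name.startswith("skip")]

print("testing:", [p.name for p in active])
print("skipping:", [p.name for p in skipped])
```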
### Recommended workflow

- fork the repo
- get familiar with the example
- update the metadata
- get familiar with the external framework and package it
- write one simple test case
- write the protocol to connect the interface
- test the simple case; use `test_all_config.py` for testing during dev
- iterate and add more test cases
### Hints