This might not be something we decide right now, but I'm creating an issue to keep track of the discussion.
In the API right now, we treat weight updates from the same client (whom we identify solely via their autogenerated websocket id) as completely independent. If a client sends 10 weight updates, each with numExamples=1, we assume they labeled 10 separate examples, used the same ModelFitConfig to modify the original weights based on those examples, and sent us the results. This should work and allow us to learn, but another option would be to have the client retrain the model each time (starting from the original weights) using all of the data they've labeled so far. In that case, they would send us successive updates with numExamples=1, then numExamples=2, and so on until numExamples=10, and when it comes time to average, we would only consider the latest update the client sent us.
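To make the contrast concrete, here's a minimal sketch of what the sequence of numExamples values would look like under each strategy (the function names are made up for this issue, not part of the actual API):

```typescript
// Strategy A (independent): each update covers exactly the one newly
// labeled example, so the client sends numExamples=1 every time.
function independentUpdateStream(totalLabeled: number): number[] {
  return new Array(totalLabeled).fill(1); // 1, 1, ..., 1
}

// Strategy B (superseding): each update retrains from the original weights
// on everything labeled so far, so numExamples grows with each send and
// each update supersedes the client's previous one.
function supersedingUpdateStream(totalLabeled: number): number[] {
  return Array.from({ length: totalLabeled }, (_, i) => i + 1); // 1, 2, ..., n
}
```

Under strategy B, only the last element of the stream matters at averaging time; under strategy A, all of them do.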
Here are some thoughts about the strengths of each strategy:
Good things about keeping same-client updates totally independent:
We don't have to store old labeled examples on the client side; the client can throw away data as soon as it's been used to compute a weight update.
Computing weight updates might be a little faster, since the batch size will always be 1.
If we save old examples and a client disconnects and reconnects, it's possible they will either (a) keep the same client ID but lose the examples, or (b) keep the old examples but get assigned a new client ID. If either of these things happens, and we have superseding logic, the server might throw away updates for certain examples (under a) or double count them (under b). Having more persistent IDs might help address this, but it's nice to not have to worry about it.
Good things about letting same-client updates supersede each other:
It might lead to faster learning. The results from a very basic experiment suggest that, when clients take multiple local SGD steps, we learn slightly faster from 10 clients with 3 samples each than we do with 33 clients with 1 sample each. It's not clear how those results will generalize / how significant they are, but it makes intuitive sense that having only a single example would preclude us from taking more than a couple SGD steps. Taking multiple steps on the client is really important for learning faster overall. (Note that we could also experiment with federated averaging using different numbers of local SGD steps based on the number of examples!)
Learning might be comparatively even faster because of privacy concerns. If it turns out that sending an exact weight update for just one example lets us reconstruct the original input on the server, then the client might need to add a lot of extra noise to the weight update. However, if the client ran SGD with multiple examples, maybe they could get away with adding less noise to the new weight update, making it even more useful. Of course, now multiple files on the server will contain information about the same examples, so maybe we introduce a new privacy issue!
In cases where users keep the client app open for a long time and actually care about improving the local version of their model right now, it would be nice to use an updated local model trained on all of their data points. Computing weight updates one example at a time wouldn't help with that, whereas recomputing the update over all the examples would give us the new model for free.
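For reference, the averaging step mentioned above would presumably weight each client's (latest) update by its numExamples. A minimal sketch, assuming flat weight vectors and a made-up function name:

```typescript
// Illustrative federated-averaging sketch: combine one update per client,
// weighting each by the number of examples it was computed from.
function federatedAverage(
  updates: { numExamples: number; weights: number[] }[]
): number[] {
  const total = updates.reduce((sum, u) => sum + u.numExamples, 0);
  const dim = updates[0].weights.length;
  const avg = new Array(dim).fill(0);
  for (const u of updates) {
    for (let i = 0; i < dim; i++) {
      avg[i] += (u.numExamples / total) * u.weights[i];
    }
  }
  return avg;
}
```

This is also where double counting would bite under superseding logic: if the same examples show up in two clients' updates, their effective weight in the average is inflated.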
Either way:
We should compute and upload weight updates as soon as we can on the client side, whether they're for a single example or for many, since the user might leave the page at any time.
To prevent race conditions, we should never overwrite records or files on the server. If we have multiple updates from the same client that supersede each other, we just read all of them and take the latest, which is inherently safe and fails gracefully (the worst that can happen is that we miss an update).
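The append-only convention above can be sketched like this (an in-memory stand-in for the server's record store; class and field names are hypothetical):

```typescript
// Every write appends a new record under a fresh (clientId, seq) pair, so
// concurrent writers never clobber each other. Readers resolve supersession
// by scanning everything and keeping the newest record per client.
class AppendOnlyUpdateStore {
  private records: { clientId: string; seq: number; payload: string }[] = [];

  // Writes never overwrite: each call appends a new record.
  append(clientId: string, seq: number, payload: string): void {
    this.records.push({ clientId, seq, payload });
  }

  // Read all records and take the highest-seq one per client. If a write is
  // lost or delayed, the worst case is using a slightly older update.
  latestPerClient(): Map<string, string> {
    const latest = new Map<string, { seq: number; payload: string }>();
    for (const r of this.records) {
      const prev = latest.get(r.clientId);
      if (!prev || r.seq > prev.seq) {
        latest.set(r.clientId, { seq: r.seq, payload: r.payload });
      }
    }
    return new Map([...latest].map(([id, v]) => [id, v.payload]));
  }
}
```
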
Sorry if this got super long, but hopefully this will be useful for reference later.
asross changed the title from "Handing multiple weight updates from same client (for same model)" to "Handing multiple weight updates from same client" on Jun 12, 2018.