-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Functionallity for snapshot suggestion #185
base: main
Are you sure you want to change the base?
Conversation
To test my implementation of snapshoting, I added a function to suggest that the server performs a snapshot without needing to change the threshold to trigger snapshot to be taken. I don't know if this is actually useful or not and I also don't know if we're supposed to add stuff to the legacy-interface. I haven't written any tests for this either. Please comment on this whether or not this is something to continue working on. |
Do you need this only for testing? In the v1.x version of this Raft library, this functionality is not needed (in particular, the So I'm slightly hesitant to add this functionality. See https://raft.readthedocs.io/core.html for details of the v1.x API, whose core part is already in place, but unfortunately still lacks the libuv integration part. All that being said, if you just need this functionality for testing, perhaps you could for example manually set the threshold to 0 and then submit a barrier entry, to force triggering a snapshot. Or something along those lines. |
Yes, it's just for testing to verify that the state produced by the state machine is correct after snapshoting. I'll see if I can implement it like you suggest, without changing my high level test, although I'm a little confused about these ABI versions :) |
Yeah, that's unfortunate indeed. In a nutshell:
You can find some of the motivation of the new v1.x design here: |
Hi I find there's a lot of resistance when using the UV implementation for multiple reasons especially as the application I'm integrating raft into already has IO and event stuff in place. I will instead try using the v1.x ABI so this PR could be closed. I now nevertheless have a working test suite which is helpful when implementing the new ABI which would require a complete rewrite. I can see already now that there are some documentation issues with the ABI v1.x, like incorrect information in comments above for instance for |
What event loop are you using? My short-term goal would be to use libuv to create an I/O backend for the v1.x API, but later on I'd like to build a io_uring-based backend.
If you could find issues or open PRs for that, it'd be great, thanks. There is definitely work to do in order to better document the v1.x API. If you could point to me the parts where you are having troubles, that would be a good way for me to know where more information is needed. |
I am using libevent but also an abstraction layer on top of that. A solution without UV could have been to have a protocol layer on top of v1.x with which you can exchange network data to/from other nodes and also disk I/O data. You would then only implement the sockets, disk I/O and event parts yourself. There could be functions like As a start maybe the network packet (en/de)coding could be moved out of the uv framework and afterwards the raft logic. It would then be pretty straight forward to make implementations with any event loop or even without event loop. What are your thoughts on that?
I will certainly do that. |
I'm don't entirely understand what you mean, but it seems I was thinking something similar. My plan was roughly to:
I don't think it's entirely as straightforward as it seems, at least if we want to make everything pull-driven (i.e. no callbacks) and non-blocking. I haven't thought about this hard enough yet though, so if you want to elaborate more on your ideas, I'm happy to hear them and perhaps get some inspiration.
Thanks! |
I was thinking about the "pull" method as well and how this could be done, I've thought a bit more on how to implement that. I think the pulling you suggest is very straight forward! It could also be quite efficient as it allows for tight loops when performing misc tasks without long call trees. Let's hear your thoughts on this :) Let's say we had the following layers:
Whenever Bridge is called from IO, the structure The new Bridge layer exposes these functions:
The IO layer knows nothing about the contents of network packets or file data. It simply exchanges data blobs with the Bridge layer and will create any network connections and open files as needed. On the other hand, the Brige layer does not know anything about network connections or file system. I don't think the Receiving/sending network data from/to other serversAs a thought flow, whenever data is received on a network connection, the function
The Bridge layer need not worry about any network connections. It simply informs the IO layer about which server ID data in the tasks is destined for. Incoming data is also distinguished using server ID. The IO layer hence needs to know which connections it has for which server ID. This also does not need to be actual connections if another means of transport is used. If there are problems with sending outgoing network data, the IO layer just calls acknowledge like if the data was sent. The situation will be detected by other means by the Bridge layer. As an optimization, there could be a helper function in the bridge to check for which servers the IO layer needs to communicate with (e.g. which server is part of the cluster or is the leader). Getting IO tasks from the Bridge layerThe structure Memory for the tasks is owned and handled by the Bridge layer and it may be re-used or arena-allocated. The IO layer calls Reading filesThe Bridge layer provides the file names and data blobs to write to or read from in the Reading file dataIf the Bridge wants to read metadata or other files from disk, it will make tasks for this describing the file names. The function If there are concerns about copying a lot of data in memory, data exchange could be implemented as fetch/put callbacks set in the tasks structure. Both methods could be used as appropriate. Data could also be composed of chunks and point to different locations in memory to avoid duplication. ----I think this could be a very simple interface to use and, but of course it requires to move a lot of logic around (which would be required regardless of solution). When it comes to UV, this would be a built-in IO layer. It would then contain only low level stuff so that the existing raft logic can be re-used by other implementations. |
@atlesn thanks for the detailed write-up. Right now I don't have enough time to really think deep about everything you wrote, I'll probably do it in the next few days. But it seems we're on the same page on several design ideas, and I believe what I'd like to do implement is in some way close to what you're suggesting. I will look at this again as time permits. |
No description provided.