P2P state exchange #2373
Comments
With P2P state exchange we wouldn't even need full storage when persisting blocks. We could fetch the key-value pairs we need via P2P while executing transaction scripts, if the speed is acceptable. I like this idea. I mentioned something similar before here: #1526 (comment).
Well, there are problems with this approach, so I'd rather do full state synchronization first.
That can easily be solved with a requirement to store "checkpoint" states every N (say, 50K) blocks, so that at current height X the node still stores the two previous checkpoints even if it deletes old state data otherwise. A new node joining the network would then try to synchronize with the latest checkpoint and would have at least N blocks' worth of time to do that. At the same time, from what I see in recent discussions, the state roots we have in Neo 3 are just not usable for this purpose, so we either need to adapt them for it, use some different approach, or not bother with state exchange at all.
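A Go sketch of the retention rule this implies; the interval value and all names are illustrative, not part of any implementation:

```go
package main

import "fmt"

// checkpointInterval is the hypothetical N from the comment above.
const checkpointInterval = 50000

// retainedCheckpoints returns the checkpoint heights a node at height h
// keeps under this scheme: the latest checkpoint plus the one before it,
// so a joining node always has at least N blocks of time to pull the
// older checkpoint before it gets dropped.
func retainedCheckpoints(h uint32) (latest, previous uint32) {
	latest = h / checkpointInterval * checkpointInterval
	if latest >= checkpointInterval {
		previous = latest - checkpointInterval
	}
	return
}

func main() {
	latest, previous := retainedCheckpoints(130000)
	fmt.Println(latest, previous) // 100000 50000
}
```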
I'll outline a scheme here that can work with in-header stateroots (and we'll implement it as an extension in NeoGo), although I'm not sure how to adapt it to SV-signed state data. This scheme implies that nodes always compute the stateroot hash for their data and always check it against block headers.

So we want to synchronize state over the P2P protocol. This means that we should receive enough data to process new blocks and run transactions. But actual node state is a moving target because it changes with every block, and a regular Neo node only keeps the latest state in its DB. It also has MPT state data, but it can delete old trees with the `KeepOnlyLatestState` option (thus keeping only the latest one). At the same time it can delete old (untraceable) blocks (`RemoveUntraceableBlocks`).

Implementing synchronization using the KV DB directly (requesting paths/ranges) can be more network-efficient, but it requires either keeping an old KV DB along with the new one (which is not space-efficient and complicates logic) or creating a snapshot like in #1286 (which adds some load on CNs and needs some external storage or P2P propagation mechanism). Therefore it's considered to be impractical.

We have the same data in the MPT though, and by design it can easily store multiple versions of data (corresponding to different blocks). At the same time it is self-proving against a known stateroot, so we'll use that as the basis for state exchange.

Synchronization can take some time, so consistent state data corresponding to some block has to be available on the network even though the network advances as new blocks get added. Some heights are declared special for this reason and known as state synchronization heights; nodes keep state data available in the form of MPTs for these heights for some time, so that other nodes can get it.

Protocol changes
The proposed approach tries to make minimal changes to the protocol as well as the node's logic. Two new P2P messages are added: one requesting MPT nodes by their hashes and one returning the corresponding serialized nodes (see the sketch below).
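A hedged Go sketch of what such a request/reply pair could look like; the `GetMPTData`/`MPTData` names and the field layout are assumptions, not the actual wire format:

```go
package payload

// Hash256 stands in for the 32-byte hash type used on the wire
// (util.Uint256 in NeoGo); defined here to keep the sketch self-contained.
type Hash256 [32]byte

// GetMPTData is an assumed shape for the request message: a node asks
// its peers for MPT nodes it doesn't yet have, identified by hash.
type GetMPTData struct {
	Hashes []Hash256
}

// MPTData is an assumed shape for the reply: serialized MPT nodes that
// the receiver can verify by hashing them and comparing against the
// hashes it requested (the trie is self-proving against a known stateroot).
type MPTData struct {
	Nodes [][]byte
}
```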
An additional protocol configuration option is also added, controlling the distance between state synchronization heights.

Operation
All nodes maintain a local MPT and calculate the local state root hash. Some keep all stateroots, while some (those with `KeepOnlyLatestState`) keep only the latest one.

A new node that wants to perform synchronization fetches all block headers first, receiving signed stateroot data with the headers. It then creates a list of currently unknown MPT hashes, adds the P1 (state synchronization height) stateroot hash to it and starts synchronizing state. On every iteration of this process it takes the currently unknown hashes and requests them from its peers with the new request message. If the node is interrupted during the process, it must walk the local on-disk trie, rebuild the list of missing nodes and then start the process again. It then (technically, this can be done in parallel to state sync) fetches the recent blocks it has to keep.
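A condensed Go sketch of this synchronization loop; the `Trie` and `Peers` interfaces and every name below are illustrative assumptions, not NeoGo's actual API:

```go
package statesync

import "errors"

// Hash256 stands in for the 32-byte MPT node hash type.
type Hash256 [32]byte

// Peers abstracts the P2P layer: request serialized MPT nodes by hash
// (mirroring the hypothetical request message sketched above).
type Peers interface {
	RequestMPTNodes(hashes []Hash256) ([][]byte, error)
}

// Trie abstracts local MPT storage.
type Trie interface {
	// PutNode verifies raw against its expected hash, stores it and
	// returns the hashes of children still missing locally.
	PutNode(h Hash256, raw []byte) ([]Hash256, error)
	// MissingNodes walks the on-disk trie and rebuilds the list of
	// unknown hashes (used to resume after an interruption).
	MissingNodes(root Hash256) ([]Hash256, error)
}

// Sync pulls the MPT for the given stateroot (the P1 one) from peers.
func Sync(root Hash256, t Trie, p Peers) error {
	// On a fresh start this is just {root}; after an interruption it's
	// whatever the on-disk walk finds missing.
	unknown, err := t.MissingNodes(root)
	if err != nil {
		return err
	}
	for len(unknown) > 0 {
		raws, err := p.RequestMPTNodes(unknown)
		if err != nil {
			return err
		}
		if len(raws) != len(unknown) { // sketch simplification: require full replies
			return errors.New("short reply from peers")
		}
		var next []Hash256
		for i, raw := range raws {
			children, err := t.PutNode(unknown[i], raw)
			if err != nil {
				return err
			}
			next = append(next, children...)
		}
		unknown = next
	}
	return nil
}
```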
Summary or problem description
The need for P2P state exchange has been noted several times already; I'll just quickly recap the most important points:

- old (more than `MaxTraceableBlocks`) blocks can be dropped, making block-by-block synchronization less available

In order to bootstrap a new full node we need to make its storage synchronized with other full nodes, and if we're not processing the whole chain this includes:

- complete current state data
- `MaxTraceableBlocks` number of last blocks (already possible, but hi, Lower MaxValidUntilBlockIncrement limit #2026)

The most problematic is the state data item, and that's what I'd like to discuss.
Do you have any solution you want to propose?
It's relatively easy to create a protocol for MPT traversal that will allow nodes to request/receive tree nodes and synchronize this way. There is also a proposal to do this exchange via full state snapshots (#1286), but MPT traversal seems to be simpler than that, if we suppose that every node has the MPT. Unfortunately, that's not the case for the C# implementation, where nodes can have some state but not have a state root hash calculated over it and verified with SV signatures.
This can be solved with another node capability ("MPTNode", for example) so that other nodes would know who can provide state data via P2P (the same is true if we go the snapshot route), but one problem still remains, and I don't see any clear solution for it yet.
The problem of state trust
neo-go has a `StateRootInHeader` option that is a protocol extension (hi, #1526) allowing it to put the state root hash into the block header. This means that any node receiving this header also receives a verified hash of the chain state as of the previous block, so if a new node processes headers from the genesis it reliably knows what state to request from its peers (be it MPT traversal or a snapshot) and can verify the data against this hash.

Absent this option, we have state validators that are defined as a role in a native contract. They sign state root messages defined as `[version, index, hash]`. These messages can only be verified if the node already has the proper state, that is, if its `RoleManagement` native contract's storage contains the keys of the state validators. But we're talking about nodes that want to synchronize state via P2P, so they don't have it yet.

And they can't reliably get this data. A node that doesn't process the chain from the genesis only has the protocol configuration and headers; neither the configuration nor the headers contain SV keys. So we can't have P2P state exchange until we solve this problem.
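To make the circular dependency concrete, a rough Go sketch; all names and the simplified field set are assumptions (the real message also carries a proper witness structure):

```go
package stateroot

// Hash256 and PublicKey stand in for the node's actual hash and key types.
type Hash256 [32]byte
type PublicKey [33]byte

// Message is a simplified [version, index, hash] state root message
// as signed by state validators.
type Message struct {
	Version byte
	Index   uint32  // height of the block this root corresponds to
	Root    Hash256 // MPT root hash of the state after that block
	Witness []byte  // SV multisignature over the fields above
}

// Verify shows the catch: checking the witness requires the current SV
// keys, which live in the RoleManagement contract's storage, i.e. in
// exactly the state a fresh node is trying to synchronize.
func Verify(m Message, svKeys []PublicKey) bool {
	if len(svKeys) == 0 {
		return false // no state yet, so no keys to check the witness against
	}
	// ... verify m.Witness against svKeys (multisig check omitted)
	return true
}
```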
There are several potential solutions to it:

- add `PrevHash` and `NextValidators` fields into state root messages, making them form a proper secondary chain that would need to be synchronized along with the header chain (further complicating configuration and synchronization; see the sketch below)

The choice is rather obvious for me, but I'm not sure any of these options are really welcome as solutions. So maybe there is something else we can do? Or maybe going one of these routes is fine? Maybe we don't need P2P state exchange at all? I'm really interested in opinions on this.
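A hedged Go sketch of the chained variant from the list above; field names beyond `PrevHash` and `NextValidators` are assumptions:

```go
package chained

// Hash256 and PublicKey stand in for the node's actual hash and key types.
type Hash256 [32]byte
type PublicKey [33]byte

// Message extends the plain [version, index, hash] state root message so
// that messages form a verifiable secondary chain: PrevHash links each
// message to the previous one and NextValidators pre-commits the keys
// expected to sign the next one, letting a fresh node verify the whole
// sequence forward the same way it verifies the header chain.
type Message struct {
	Version        byte
	Index          uint32
	Root           Hash256
	PrevHash       Hash256     // hash of the previous state root message
	NextValidators []PublicKey // SV keys for the next message's witness
	Witness        []byte
}
```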
Other considerations
Note that although this is related to #2346, it's not a substitute for it. Archiving is still important even if P2P state exchange is implemented.