Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RED-196 check trie hashes for near node down check #236

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from
Open

Conversation

magonjr
Copy link
Contributor

@magonjr magonjr commented Aug 12, 2024

No description provided.

Copy link

PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Key issues to review

Complexity and Error Handling
The new implementation in isDownCheck function is significantly complex and introduces multiple nested conditions and loops. This could lead to maintenance challenges and potential bugs. Additionally, the error handling for null checks and exceptions could be improved to prevent runtime errors.

Possible Bug
The variable nearNodeAsk is used to make a request using nearNode, but in the subsequent request for targetNodeAsk, nearNode is used instead of node. This seems like a potential bug where the wrong node might be queried.

@magonjr magonjr requested a review from thantsintoe August 12, 2024 15:51
Copy link
Contributor

@thantsintoe thantsintoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let us know how these changes are properly tested.

let hashTrieReq: HashTrieReq = {
radixList: ['0'],
}
let closestNodes = stateManager.getClosestNodes(hash, 5, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know if these 5 nodes will cover the radix 0 ?


if (nearNodeResponse == null) {
/* prettier-ignore */ nestedCountersInstance.countEvent('p2p', 'isDownCheck-down-near-node-fail', 1)
return 'down'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean "near node" is down or "target node" is down? If we cannot get response from near node, isn't that we report it as "lost" and try another near node?

src/p2p/Lost.ts Show resolved Hide resolved
src/p2p/Lost.ts Show resolved Hide resolved
}
//compare the hashes from the near node and the target node
for (let nodeHash of targetNodeResponse.nodeHashes) {
if (nodeHashesMap.get(nodeHash.radix) !== nodeHash.hash) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the target node is up and responsive but has out of sync data? In that case, we should not mark it as down as the node may repair the radix in next cycle.

if (config.p2p.useNearNodeForDownCheck) {
//using the 'get_trie_hashes' route to check if the node is up.
let obj = { counter: currentCycle, checker: Self.id, target: node.id, timestamp: shardusGetTime() }
let hash = crypto.hash(obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this hash used instead of a radix covered by target node?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants