Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential deadlock bug in peer.go due to a race condition #29857

Open
blakehartin opened this issue May 27, 2024 · 2 comments
Open

Potential deadlock bug in peer.go due to a race condition #29857

blakehartin opened this issue May 27, 2024 · 2 comments
Assignees
Labels

Comments

@blakehartin
Copy link

System information

Geth version: geth version
CL client & version: e.g. lighthouse/nimbus/[email protected]
OS & Version: Windows/Linux/OSX
Commit hash : (if develop)

https://github.com/ethereum/go-ethereum/blob/master/p2p/peer.go

Consider the following function in peer.go:

func (p *Peer) run()

The wg.wait() sync.WaitGroup waits on both readLoop and pingLoop to complete.

Consider the case where the loop in above function exits in a select-case other than "readErr". After this the execution pauses at p.wg.Wait(), waiting for readLoop to complete.

But inside readLoop, the sending will be waiting for receiver (but receiver has exited because of break LOOP)

errc <- err

In golang, this will cause the sender to get stuck forever till receiver picks it up and hence a deadlock.

To fix the issue, if the loop in run() exited anything other than readErr, then anything pending on the channel has to be received.

Otherwise, p.wg.Wait() will forever block.

Expected behaviour

Actual behaviour

Steps to reproduce the behaviour

Backtrace

[backtrace]

When submitting logs: please submit them as text and not screenshots.

@blakehartin
Copy link
Author

Looks like someone else reported same problem but was closed #17950

The same problem occurred on a non-Ethereum implementation and by adding logs, we were able to verify the problem.

@blakehartin
Copy link
Author

Based on below doc, in case the loop in run() exists in any other select case other than readErr, then wg.done() in readLoop will never get executed, because readLoop will be stuck on line errc <- err

https://go.dev/tour/concurrency/2
By default, sends and receives block until the other side is ready. This allows goroutines to synchronize without explicit locks or condition variables

`

@fjl fjl self-assigned this May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants