chain: use batch rpc client in mempool poller #888

Merged · 9 commits · Oct 12, 2023

Conversation

yyforyongyu (Collaborator, Author):

This PR replaces the normal client with a batch client created from rpcclient.NewBatch. All the getrawtransaction requests are now batched, with a default batch size of 10,000, and the batch will wait for 1 second before attempting the next one.

This change is made to put less pressure on bitcoind: when the mempool is full, loads of getrawtransaction calls are made, resulting in slow responses to other RPC calls such as getblockchaininfo and causing issues in other systems.

The mod files are also updated, and a similar check is added here (copied from lnd).
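
For illustration, here is a minimal sketch of the batching pattern described above, using btcd's rpcclient batch mode (the helper name fetchRawTxs and the error handling are assumptions, not the PR's actual code): the *Async calls only queue requests, and Send flushes them as a single JSON-RPC batch.

```go
package example

import (
	"log"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/rpcclient"
	"github.com/btcsuite/btcd/wire"
)

// fetchRawTxs is a hypothetical helper: it queues one getrawtransaction
// request per txid on a client created via rpcclient.NewBatch and then
// flushes them all in a single HTTP round trip.
func fetchRawTxs(batchClient *rpcclient.Client,
	txids []*chainhash.Hash) []*wire.MsgTx {

	// In batch mode the *Async calls only enqueue requests; nothing
	// is sent over the wire yet.
	futures := make([]rpcclient.FutureGetRawTransactionResult, 0,
		len(txids))
	for _, txid := range txids {
		futures = append(futures, batchClient.GetRawTransactionAsync(txid))
	}

	// Send flushes the queued requests as one JSON-RPC batch.
	if err := batchClient.Send(); err != nil {
		log.Printf("unable to send batch: %v", err)
		return nil
	}

	txs := make([]*wire.MsgTx, 0, len(futures))
	for _, f := range futures {
		tx, err := f.Receive()
		if err != nil {
			// The tx may have left the mempool between polls;
			// skip it rather than failing the whole batch.
			continue
		}
		txs = append(txs, tx.MsgTx())
	}

	return txs
}
```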

guggero (Collaborator) left a comment:

Did a first pass, looks really good! Hopefully this will significantly reduce the load on bitcoind.

chain/mempool.go Outdated
@@ -347,12 +376,21 @@ func (m *mempool) UpdateMempoolTxes(txids []*chainhash.Hash) []*wire.MsgTx {
// Set all mempool txs to false.
m.UnmarkAll()

// TODO(yy): let the OS manage the number of goroutines?
Collaborator:

I think it definitely makes sense to limit the number of goroutines here, especially if they end up creating RPC requests. Even if the requests are async and a call here would just create an entry in the queue, it probably still makes sense to limit the number of goroutines to prevent clogging on slower systems.
Or what's the reason for the TODO? Do we feel like this number isn't high enough currently?

Collaborator (Author):

The RPC requests are already processed sequentially in btcd though, so a limit won't have any effect on that front. Unless we're concerned about the overhead of creating and destroying goroutines, I think it's better to let the runtime scheduler handle them. Also, GOMAXPROCS already defaults to runtime.NumCPU.

> it probably still makes sense to limit the number of goroutines to prevent clogging on slower systems.

Have we experienced this before?

Collaborator:

> Have we experienced this before?

Yes, at least I think that in certain situations (e.g. during itests), when we ran too many things in parallel, some goroutines would just be starved (they didn't get any CPU time). Not sure if that gets worse with the number of goroutines or just with the number of processes running on the system.

Collaborator (Author):

You mean when using make itest-parallel? I guess that bypasses Go's scheduler, since we're parallelizing in the shell?

Collaborator:

Yes. So when the CPU is already struggling, goroutines can get starved (at least it looked like it). But I'm not sure whether that gets worse with the total number of goroutines or has no direct impact either way. Just some thoughts/observations to possibly help the decision here.

Contributor:

Oli is right. Since goroutines are scheduled like anything else, it's possible for them to starve if there are too many context switches going on. I'm not sure that applies here if the number of goroutines is minimal.
For more info about goroutine scheduling, see the goroutine section here: https://dave.cheney.net/2015/08/08/performance-without-the-event-loop
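
For context, this is the bounded-goroutine pattern the PR settles on (see the var eg errgroup.Group snippet further down); a minimal sketch with a stand-in handle function:

```go
package example

import (
	"runtime"

	"golang.org/x/sync/errgroup"
)

// processConcurrently runs handle for every txid, but caps the number
// of concurrently running goroutines at the CPU count so a huge
// mempool can't spawn an unbounded number of workers.
func processConcurrently(txids []string, handle func(string) error) error {
	var eg errgroup.Group
	eg.SetLimit(runtime.NumCPU())

	for _, txid := range txids {
		txid := txid // per-iteration copy (needed before Go 1.22)
		eg.Go(func() error {
			return handle(txid)
		})
	}

	// Wait blocks until every worker has returned and reports the
	// first non-nil error, if any.
	return eg.Wait()
}
```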

@@ -7,4 +7,13 @@ require (
github.com/btcsuite/btcd/btcutil v1.1.1
github.com/btcsuite/btcwallet/wallet/txrules v1.2.0
github.com/btcsuite/btcwallet/wallet/txsizes v1.2.3
github.com/fsnotify/fsnotify v1.5.4 // indirect
Collaborator:

Reminder to push new tags for all those submodules after merging the PR.

yyforyongyu force-pushed the batch-client-mempool branch 2 times, most recently from a521070 to 48cdb29 on September 21, 2023 13:41
guggero self-requested a review on September 21, 2023 15:21
guggero (Collaborator) left a comment:

Very nice, looks very good!
Just a bunch of questions/comments.

var eg errgroup.Group
eg.SetLimit(runtime.NumCPU())
// newTxids stores a list of unseen txids found in the mempool.
newTxids := make([]*chainhash.Hash, 0)
Collaborator:

nit: use var newTxids []*chainhash.Hash if we're not pre-allocating with a non-zero value.

Collaborator (Author):

Hmm, I feel like I've had this conversation with laolu before, but now I can't recall the details 😂 Why is this the preferred way again?

Collaborator:

I guess it's just visually more apparent that there is no pre-allocation/initialization. The 0 is easy to miss IMO, and make() subconsciously registers as "oh, we're allocating something here" when scanning the code quickly.
It's really just nit-picking at this point, but it's nice if all code uses the same pattern.

Collaborator (Author):

Cool, yeah, let's stick to this pattern then.

Collaborator:

Ah, also: when it comes to unit tests and deep-equality checks, there is a difference between var newTxids []*chainhash.Hash and newTxids := make([]*chainhash.Hash, 0). The first gives you a nil slice, while the second gives you an empty []*chainhash.Hash{} slice. So if this were a "decode" function, we'd get a diff on such a field between before encoding and after decoding if no elements were present.
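
A quick standalone illustration of that nil-vs-empty distinction (not from the PR):

```go
package main

import (
	"fmt"
	"reflect"
)

func main() {
	var a []int         // declared, nil slice
	b := make([]int, 0) // allocated, empty but non-nil slice

	fmt.Println(a == nil, b == nil) // true false
	fmt.Println(len(a), len(b))     // 0 0

	// Deep equality treats nil and empty slices as different, which
	// is exactly the encode/decode diff described above.
	fmt.Println(reflect.DeepEqual(a, b)) // false
}
```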

Collaborator (Author):

Hmm, so I had to revert the change to fix the linter:

Error: chain/mempool.go:411:2: Consider preallocating `newTxids` (prealloc)
	var newTxids []*chainhash.Hash

Collaborator:

Hmm, yeah, I see that the linter isn't very helpful here... Can we make any guess at how many new TXIDs we might get, so we don't have to pre-allocate with 0 (which silences the linter but doesn't really address its underlying concern)? Maybe something like len(txids) - len(m.txs) (with proper mutex locking and underflow checks, of course). Not sure it makes a big difference, even though this is a quite performance-sensitive part of the code, so feel free to ignore.
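
A rough sketch of that suggestion, assuming the incoming txids slice bounds the count; the pared-down mempool type and the method name are hypothetical, not the merged code:

```go
package example

import (
	"sync"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
)

// mempool is a pared-down stand-in for the real type; only the fields
// needed for the capacity estimate are shown.
type mempool struct {
	sync.RWMutex
	txs map[chainhash.Hash]bool
}

// estimateNewTxids pre-allocates with a real capacity guess: everything
// we were handed minus everything already tracked, clamped at zero.
func (m *mempool) estimateNewTxids(txids []*chainhash.Hash) []*chainhash.Hash {
	m.RLock()
	known := len(m.txs)
	m.RUnlock()

	capHint := len(txids) - known
	if capHint < 0 {
		capHint = 0
	}

	// A non-zero capacity hint addresses the prealloc linter's
	// underlying concern instead of merely silencing it.
	return make([]*chainhash.Hash, 0, capHint)
}
```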

@@ -22,8 +22,6 @@ github.com/btcsuite/btclog v0.0.0-20170628155309-84c8d2346e9f/go.mod h1:TdznJufo
github.com/btcsuite/btcutil v0.0.0-20190425235716-9e5f4b9a998d/go.mod h1:+5NJ2+qvTyV9exUAL/rxXi3DcLg2Ts+ymUAY5y4NvMg=
github.com/btcsuite/btcwallet/wallet/txrules v1.2.0 h1:BtEN5Empw62/RVnZ0VcJaVtVlBijnLlJY+dwjAye2Bg=
github.com/btcsuite/btcwallet/wallet/txrules v1.2.0/go.mod h1:AtkqiL7ccKWxuLYtZm8Bu8G6q82w4yIZdgq6riy60z0=
github.com/btcsuite/btcwallet/wallet/txsizes v1.2.2 h1:vtfutRuUoNsdgmKgYShuUgmBawNeC+lWu9jeucWkAII=
Collaborator:

What version of Go did you run this on? I'm getting a diff in wallet/{txauthor,txrules,txsizes} with Go 1.18 or later.

Collaborator (Author):

Mine is go version go1.20.5 darwin/arm64. What are the diffs here? Maybe it's due to my local go work setup, which also keeps this repo in sync with lnd.

Collaborator:

Here's the diff I get when running go mod tidy:
diff.txt

Collaborator (Author):

Yeah, very weird. For instance, this line

github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=

looks like it's in sync with lnd's mod file, which also has this version.

yyforyongyu force-pushed the batch-client-mempool branch 2 times, most recently from 177b056 to 93ab1f8 on September 26, 2023 16:01
guggero (Collaborator) left a comment:

LGTM, pending resolution of the go mod tidy diff.


Roasbeef (Member): cc @ellemouton

saubyk commented Oct 5, 2023: cc @Crypt-iQ for review. Thanks.

Crypt-iQ (Contributor) commented Oct 5, 2023: Needs a rebase since the gettxspendingprevout change landed.

This commit adds a quit signal such that the mempool can be torn down
quickly once the upper layer decides to shut down.
yyforyongyu force-pushed the batch-client-mempool branch 3 times, most recently from d0a835b to fc64ebc on October 9, 2023 11:40
chain/mempool.go Outdated

// getRawTxBatchSize specifies the number of requests to be batched
// before sending them to the bitcoind client.
getRawTxBatchSize = 10_000
Contributor:

I think this should be lowered. The maximum HTTP response body that bitcoind will send back is 33,554,432 bytes. It's unlikely but possible that 10,000 transactions of ~3.3K bytes each are requested, causing the batch to fail. Additionally, I think this might force other goroutines calling bitcoind to wait a long time for the batch to finish.
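
A quick back-of-the-envelope check of that limit, using the figures from the comment above:

```go
package example

// bitcoind's maximum HTTP response body is 33,554,432 bytes (32 MiB).
// Split across a batch of 10,000 getrawtransaction replies, that is
// 33,554,432 / 10,000 ≈ 3,355 bytes per transaction, so a batch whose
// transactions average more than ~3.3 KB each could push the response
// past the limit and fail the whole batch.
const (
	maxResponseBody = 33_554_432
	batchSize       = 10_000
	perTxBudget     = maxResponseBody / batchSize // 3,355 bytes
)
```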

Collaborator (Author):

Decreased by 10x. Not sure if the 32MB body size limit also applies to batch RPC calls?

> Additionally, I think this might cause other goroutines calling bitcoind to have to wait a long time for the batch to finish

Probably. I assume the lookup will be fast in bitcoind, something like log(N) if indexed. And because we are sending 10,000x fewer requests here, it should actually make bitcoind respond faster because of the overhead saved.

Contributor:

I think any gain would come from fewer HTTP headers. bitcoind will still call the same function N times and still acquire the main cs_main lock N times. I'd be curious to benchmark this in a reproducible way and see where any gain comes from, but I'm not sure how feasible that is.

The 32MB body size limit applies to all incoming HTTP requests to bitcoind; it's set here: https://github.com/bitcoin/bitcoin/blob/744157ef1a0b61ceb714cc27c9ae158907aecdc9/src/httpserver.cpp#L457


@@ -135,13 +135,12 @@ func newBitcoindZMQEvents(cfg *ZMQConfig,

zmqEvents := &bitcoindZMQEvents{
cfg: cfg,
client: client,
Contributor:

client needs to be set here so that gettxspendingprevout can be called.

Collaborator (Author):

added back


// NewBitcoindEventSubscriber initialises a new BitcoinEvents object impl
// depending on the config passed.
func NewBitcoindEventSubscriber(cfg *BitcoindConfig,
client *rpcclient.Client) (BitcoindEvents, error) {
client, batchClient *rpcclient.Client) (BitcoindEvents, error) {
Contributor:

nit: rename batchClient to not match the interface name?

Collaborator (Author):

Done

if cfg.RPCBatchInterval == 0 {
mCfg.batchWaitInterval = DefaultBatchWaitInterval
}

Contributor:

The getRawTxBatchSize and batchWaitInterval members of mCfg don't get set if a non-default value is used.

Collaborator (Author):

whoops fixed
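
For reference, a sketch of what the corrected handling might look like; the RPCBatchSize field, the default constants, and the 1,000 default (following the earlier "decreased by 10x" comment) are assumed names, not the merged code:

```go
package example

import "time"

// Hypothetical names for illustration: the user-facing config and the
// internal mempool config from the diff.
type config struct {
	RPCBatchInterval time.Duration
	RPCBatchSize     int
}

type mempoolConfig struct {
	batchWaitInterval time.Duration
	getRawTxBatchSize int
}

const (
	DefaultBatchWaitInterval = time.Second
	DefaultGetRawTxBatchSize = 1_000
)

// applyDefaults carries user-supplied values through and falls back to
// the defaults only when a field is unset, which is what the review
// comment is asking for.
func applyDefaults(cfg *config, mCfg *mempoolConfig) {
	mCfg.batchWaitInterval = cfg.RPCBatchInterval
	if cfg.RPCBatchInterval == 0 {
		mCfg.batchWaitInterval = DefaultBatchWaitInterval
	}

	mCfg.getRawTxBatchSize = cfg.RPCBatchSize
	if cfg.RPCBatchSize == 0 {
		mCfg.getRawTxBatchSize = DefaultGetRawTxBatchSize
	}
}
```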


if cfg.RPCBatchInterval == 0 {
mCfg.batchWaitInterval = DefaultBatchWaitInterval
}
Contributor:

Same as in the RPC case

Collaborator (Author):

fixed

yyforyongyu force-pushed the batch-client-mempool branch 2 times, most recently from cf3f51d to 6d01043 on October 11, 2023 07:53
Crypt-iQ (Contributor) left a comment:

LGTM besides the nit.


// NewBitcoindEventSubscriber initialises a new BitcoinEvents object impl
// depending on the config passed.
func NewBitcoindEventSubscriber(cfg *BitcoindConfig,
client *rpcclient.Client) (BitcoindEvents, error) {
client, bClient *rpcclient.Client) (BitcoindEvents, error) {
Contributor:

nit: I think client and bClient should be typed with their interfaces rather than both being *rpcclient.Client, which is a little confusing. Alternatively, renaming bClient to batchClient works.

Collaborator (Author):

Updated the method signature. Yeah, I also think it's a bit confusing and error-prone; I think this needs to be changed in btcd so we can have an rpcclient.BatchClient.
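
Something like the following narrow interface would capture the idea (a hypothetical sketch; today btcd exposes both modes through *rpcclient.Client):

```go
package example

import (
	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/rpcclient"
)

// batchClient is a hypothetical narrow interface covering just the
// batch operations the mempool poller needs; *rpcclient.Client
// satisfies it, but the distinct type keeps the two parameters from
// being mixed up at a call site.
type batchClient interface {
	GetRawTransactionAsync(
		txHash *chainhash.Hash) rpcclient.FutureGetRawTransactionResult
	Send() error
}
```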

@@ -69,7 +77,7 @@ var _ BitcoindEvents = (*bitcoindRPCPollingEvents)(nil)
// newBitcoindRPCPollingEvents instantiates a new bitcoindRPCPollingEvents
// object.
func newBitcoindRPCPollingEvents(cfg *PollingConfig,
client *rpcclient.Client) *bitcoindRPCPollingEvents {
client, bClient *rpcclient.Client) *bitcoindRPCPollingEvents {
Contributor:

nit: same thing with the type of bClient.

This commit replaces the normal client with a batch client created from
`rpcclient.NewBatch`. All the `getrawtransaction` requests are now
batched, with a default size of 10,000, and the batch will wait for 1
second before attempting the next one.

This change is made so less pressure is placed on bitcoind: when the
mempool is full, loads of `getrawtransaction` calls are made, resulting
in slow responses to other RPC calls such as `getblockchaininfo` and
causing issues in other systems.
Running the following commands:
```
go work sync
make tidy-module
```
This commit modifies the method `getRawTxIgnoreErr` to log more useful
info. It also removes `*mempool` as the method receiver, since it's
unnecessary to put it on mempool.
guggero merged commit afbabbe into btcsuite:master on Oct 12, 2023
3 checks passed
yyforyongyu deleted the batch-client-mempool branch on October 12, 2023 15:31
buck54321 pushed a commit to buck54321/btcwallet that referenced this pull request on Apr 21, 2024: chain: use batch rpc client in mempool poller