GODRIVER-2388 Improved Bulk Write API. #1884
base: master
API Change Report

./v2/mongo
  compatible changes
    (*Client).BulkWrite: added

./v2/mongo/options
  incompatible changes
    (*DistinctOptionsBuilder).SetHint: removed
  compatible changes
    ClientBulkWrite: added

./v2/x/mongo/driver
  incompatible changes
    (*Batches).AdvanceBatch: removed
  compatible changes
    (*Batches).AdvanceBatches: added

./v2/x/mongo/driver/operation
  incompatible changes
    (*Distinct).Hint: removed

./v2/x/mongo/driver/session
  incompatible changes
    Client.RetryRead: removed

./v2/x/mongo/driver/wiremessage
  compatible changes
    DocumentSequenceToArray: added
@@ -398,7 +398,7 @@ func TestClientSideEncryptionCustomCrypt(t *testing.T) {
 			"expected 0 calls to DecryptExplicit, got %v", cc.numDecryptExplicitCalls)
 		assert.Equal(mt, cc.numCloseCalls, 0,
 			"expected 0 calls to Close, got %v", cc.numCloseCalls)
-		assert.Equal(mt, cc.numBypassAutoEncryptionCalls, 2,
+		assert.Equal(mt, cc.numBypassAutoEncryptionCalls, 1,
We only call it once after the operation.go refactoring.
	// A top-level error that occurred when attempting to communicate with the server
	// or execute the bulk write. This value may not be populated if the exception was
	// thrown due to errors occurring on individual writes.
	TopLevelError *WriteError
We cannot use Error as the field name because it conflicts with the conventional Error() method name.
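For readers less familiar with this Go rule, here is a minimal sketch of the constraint. The types `writeError` and `bulkWriteException` are simplified, hypothetical stand-ins, not the driver's actual definitions.

```go
package main

import "fmt"

// writeError is a simplified, hypothetical stand-in for the driver's WriteError.
type writeError struct {
	Code    int
	Message string
}

// bulkWriteException mirrors the shape under review. Naming the field "Error"
// instead of "TopLevelError" would not compile, because Go does not allow a
// type to have a field and a method with the same name.
type bulkWriteException struct {
	TopLevelError *writeError
}

// Error implements the error interface.
func (e bulkWriteException) Error() string {
	if e.TopLevelError == nil {
		return "bulk write exception"
	}
	return fmt.Sprintf("bulk write exception: (%d) %s", e.TopLevelError.Code, e.TopLevelError.Message)
}

func main() {
	var err error = bulkWriteException{TopLevelError: &writeError{Code: 11000, Message: "duplicate key"}}
	fmt.Println(err)
}
```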
}

// AppendInsertOne appends ClientInsertOneModels.
func (m *ClientWriteModels) AppendInsertOne(database, collection string, models ...*ClientInsertOneModel) *ClientWriteModels {
Suggest abstracting the Append* methods:
type clientBulkWriteModel interface {
	ClientInsertOneModel
}

// appendModels is a helper function to append models to ClientWriteModels.
func appendModels[T clientBulkWriteModel](m *ClientWriteModels, database, collection string, models []*T) *ClientWriteModels {
	if m == nil {
		m = &ClientWriteModels{}
	}
	for _, model := range models {
		m.models = append(m.models, clientWriteModel{
			namespace: fmt.Sprintf("%s.%s", database, collection),
			model:     model,
		})
	}
	return m
}
}

type clientWriteModel struct {
	namespace string
	model     interface{}
Should we add stronger type constraints to this?
type clientBulkWriteModel interface {
	ClientInsertOneModel // etc.
}

type clientWriteModel struct {
	namespace string
	model     clientBulkWriteModel
}
I don't think we need an additional abstraction for an un-exported struct.
}

// Error implements the error interface.
func (bwe ClientBulkWriteException) Error() string {
This function doesn't return an error if the write is unacknowledged. The specifications require that users be able to discern whether a BulkWriteResult contains acknowledged results. Either return an error indicating an unacknowledged result, or update ClientBulkWriteResult in the spirit of GODRIVER-2821.
You can test this with the following:
package main

import (
	"context"

	"go.mongodb.org/mongo-driver/v2/bson"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
	"go.mongodb.org/mongo-driver/v2/mongo/writeconcern"
)

func main() {
	client, err := mongo.Connect()
	if err != nil {
		panic(err)
	}
	defer func() { _ = client.Disconnect(context.Background()) }()

	pairs := &mongo.ClientWriteModels{}
	insertOneModel := mongo.NewClientInsertOneModel().SetDocument(bson.D{{"x", 1}})
	opts := options.ClientBulkWrite().SetWriteConcern(writeconcern.Unacknowledged()).SetOrdered(false)
	pairs = pairs.AppendInsertOne("db", "k", insertOneModel)

	_, err = client.BulkWrite(context.Background(), pairs, opts) // Should not panic
	if err != nil {
		panic(err)
	}
}
Is there a unified spec test that covers this case? If not we should add one / add an integration test.
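To make the second option from the comment above concrete, here is a rough sketch of how a result type could let callers discern acknowledgement. It assumes a boolean `Acknowledged` field. The type `clientBulkWriteResult` and the helper `summarize` are hypothetical and are not part of the driver's API.

```go
package main

import "fmt"

// clientBulkWriteResult is a hypothetical, trimmed-down result type.
type clientBulkWriteResult struct {
	InsertedCount int64
	// Acknowledged is an assumed field: the driver would set it to false
	// when the write concern is unacknowledged (w: 0).
	Acknowledged bool
}

// summarize shows the check a caller would make before trusting the counts.
func summarize(res clientBulkWriteResult) string {
	if !res.Acknowledged {
		return "unacknowledged write: counts are not meaningful"
	}
	return fmt.Sprintf("inserted %d document(s)", res.InsertedCount)
}

func main() {
	fmt.Println(summarize(clientBulkWriteResult{}))
	fmt.Println(summarize(clientBulkWriteResult{InsertedCount: 2, Acknowledged: true}))
}
```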
	if filter == nil {
		return nil, fmt.Errorf("%w: filter is required", err)
	}
Update the error message when the filter is not given.
	if doc.filter == nil {
		return nil, fmt.Errorf("%w: filter is required", err)
	}
Update the error message when the filter is not given.
4bc724e to dbd44c9
dbd44c9 to d10eff2
@qingyang-hu There are still outstanding issues from the previous review.
@@ -13,6 +13,55 @@ import (
	"go.mongodb.org/mongo-driver/v2/x/mongo/driver/operation"
)

// ClientBulkWriteResult is the result type returned by a client-level BulkWrite operation.
type ClientBulkWriteResult struct {
The specifications say that "Users MUST be able to discern whether a [result] contains verbose results without inspecting the value provided for verboseResults in [options]". Does this mean we should add a boolean field to ClientBulkWriteResult: HasVerboseResults?
The initial thought was to leave the results maps as nil when verboseResults is false. However, I think you are right that an additional HasVerboseResults field is more obvious.
I see, either solution sounds good to me.
		},
	}
	var n int
	n, _, err = batches.AppendBatchSequence(nil, 4, 16_000, 16_000)
What is the significance of 16_000 here?
It is just a number big enough not to cut off any document, so only the maxCount limit regulates the output. I will add a comment there.
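The interplay between a count limit and a size limit can be illustrated with a small, self-contained sketch. `takeBatch` is a hypothetical helper written for this example only. It is not the driver's AppendBatchSequence and makes no claim about that method's parameters.

```go
package main

import "fmt"

// takeBatch returns how many leading documents fit into one batch, stopping at
// whichever limit is hit first: maxCount documents or maxSize total bytes.
// Hypothetical helper for illustration; not the driver's implementation.
func takeBatch(docs [][]byte, maxCount, maxSize int) int {
	n, size := 0, 0
	for _, doc := range docs {
		if n == maxCount || size+len(doc) > maxSize {
			break
		}
		size += len(doc)
		n++
	}
	return n
}

func main() {
	docs := make([][]byte, 6)
	for i := range docs {
		docs[i] = make([]byte, 100) // six 100-byte documents
	}
	// With a generous size limit, only the count limit takes effect.
	fmt.Println(takeBatch(docs, 4, 16_000))
	// With a tight size limit, the size limit takes effect first.
	fmt.Println(takeBatch(docs, 4, 250))
}
```

With a size limit far larger than the documents, as in the test under review, the count limit alone decides where the batch ends.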
	var idx int32
	dst = wiremessage.AppendMsgSectionType(dst, wiremessage.DocumentSequence)
	idx, dst = bsoncore.ReserveLength(dst)
	dst = append(dst, identifier...)
From the specifications:
"The first entry in each document has the name of the operation as its key and the index in the nsInfo array of the namespace on which the operation should be performed as its value"
When I do command monitoring for client bulk write with multiple pairs, I get the following:
2024/11/18 14:39:47 started: &{Command:{"bulkWrite": {"$numberInt":"1"},"errorsOnly": false,"ordered": true,"lsid": {"id": {"$binary":{"base64":"XTDtLVGhTx6MEIcFDhf0qw==","subType":"04"}}},"txnNumber": {"$numberLong":"1"},"$clusterTime": {"clusterTime": {"$timestamp":{"t":1731965987,"i":1}},"signature": {"hash": {"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"00"}},"keyId": {"$numberLong":"0"}}},"$db": "admin","ops": [{"insert": {"$numberInt":"0"},"document": {"_id": {"$oid":"673bb423a86efe4126c4c585"},"x": {"$numberInt":"1"}}},{"insert": {"$numberInt":"0"},"document": {"_id": {"$oid":"673bb423a86efe4126c4c586"},"x": {"$numberInt":"2"}}}],"nsInfo": [{"ns": "db.coll"}]} DatabaseName:admin CommandName:bulkWrite RequestID:1 ConnectionID:localhost:27017[-4] ServerConnectionID:0x14000390250 ServiceID:<nil>}
Where the index value for each document in the sequence is {"$numberInt":"0"}. Shouldn't this be {"$numberInt":"0"}, then {"$numberInt":"1"}, etc?
The same thing occurs with the other client bulk write operations.
Did you mean the Int32 value of the operation name such as "insert", "update", or "delete"?
...
"ops":[
{
"insert":{
"$numberInt":"0"
},
...
It is "the index in the nsInfo array of the namespace on which the operation should be performed as its value".
The specs also require:
"When constructing the nsInfo array for a bulkWrite batch, drivers MUST only include the namespaces that are referenced in the ops array for that batch."
and:
"Drivers MUST NOT include duplicate namespaces in this list."
Therefore, if both operations target the same namespace, the nsInfo array should contain only one item, and both operation indices are 0, pointing to "db.coll".
I see, thanks for the explanation!
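For anyone reading this thread later, the rule explained above can be sketched as follows. The type `op` and the function `buildNsInfo` are hypothetical names used only for illustration and do not reflect how the driver builds the command.

```go
package main

import "fmt"

// op is a hypothetical stand-in for one client-level write model.
type op struct {
	name      string // "insert", "update", or "delete"
	namespace string // "<database>.<collection>"
}

// buildNsInfo returns the de-duplicated namespace list for a batch and, for
// each operation, the index of its namespace within that list.
func buildNsInfo(ops []op) (nsInfo []string, indexes []int32) {
	seen := make(map[string]int32)
	for _, o := range ops {
		idx, ok := seen[o.namespace]
		if !ok {
			// First time this namespace appears in the batch: append it.
			idx = int32(len(nsInfo))
			seen[o.namespace] = idx
			nsInfo = append(nsInfo, o.namespace)
		}
		indexes = append(indexes, idx)
	}
	return nsInfo, indexes
}

func main() {
	ops := []op{
		{name: "insert", namespace: "db.coll"},
		{name: "insert", namespace: "db.coll"},
		{name: "delete", namespace: "db.other"},
	}
	nsInfo, indexes := buildNsInfo(ops)
	fmt.Println(nsInfo, indexes)
}
```

Two inserts into the same namespace therefore share index 0, which matches the command monitoring output quoted earlier in the thread.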
GODRIVER-2388
GODRIVER-3348
GODRIVER-3349
GODRIVER-3364
Summary
Improved Bulk Write API.
Background & Motivation
Refactor (Operation).createWireMessage() to support bulk write batching.