-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for legacy in-memory storage behavior. In memory storage - persists encoding for upload / download #69
Conversation
piiraa
commented
Sep 25, 2023
…ersists encoding for upload / download
62771e1
to
62300b5
Compare
Can we add a test that demonstrates the issue? Ideally one that fails before this change, and passes afterwards |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general looks good, but can we add a spec for the behaviour?
Test RSpec.describe BucketStore, :integration do
# ...
context "inmemory storage" do
let(:bucket_store) { described_class.for("#{base_bucket_uri}/#{temp_filename}") }
let(:base_bucket_uri) { "inmemory://bucket" }
let(:temp_filename) { "temp-file" }
shared_examples "encoding persistance" do |encoding|
it "persists encoding between upload/download for #{encoding}", :aggregate_failures do
upload_data = String.new("some string", encoding: encoding)
bucket_store.upload!(upload_data)
download = bucket_store.download
download_data = download.fetch(:content)
expect(upload_data.encoding).to eq(encoding)
expect(download_data.encoding).to eq(encoding)
expect(upload_data).to eq(download_data)
end
end
include_examples "encoding persistance", Encoding::UTF_8
include_examples "encoding persistance", Encoding::ISO_8859_1
end
end v0.5.0 (before streaming) ➜ bucket-store git:(v0.5.0) ✗ bundle exec rspec --tag integration spec/bucket_store_integration_spec.rb
Run options: include {:integration=>true}
Randomized with seed 42711
..
Finished in 0.00298 seconds (files took 0.7375 seconds to load)
2 examples, 0 failures
Randomized with seed 42711 v0.6.0 (streaming) ➜ bucket-store git:(v0.6.0) bundle exec rspec --tag integration spec/bucket_store_integration_spec.rb
Run options: include {:integration=>true}
Randomized with seed 39485
.F
Failures:
1) BucketStore inmemory storage persists encoding between upload/download for ISO-8859-1
Failure/Error: expect(download_data.encoding).to eq(encoding)
expected: #<Encoding:ISO-8859-1>
got: #<Encoding:UTF-8>
(compared using ==)
Diff:
@@ -1 +1 @@
-#<Encoding:ISO-8859-1>
+#<Encoding:UTF-8>
Shared Example Group: "encoding persistance" called from ./spec/bucket_store_integration_spec.rb:48
# ./spec/bucket_store_integration_spec.rb:41:in `block (4 levels) in <top (required)>'
Finished in 0.01427 seconds (files took 0.8374 seconds to load)
2 examples, 1 failure
Failed examples:
rspec './spec/bucket_store_integration_spec.rb[1:1:2]' # BucketStore inmemory storage persists encoding between upload/download for ISO-8859-1
Randomized with seed 39485 Current PR ➜ bucket-store git:(file-streaming--encoding-fix) bundle exec rspec --tag integration spec/bucket_store_integration_spec.rb
Run options: include {:integration=>true}
Randomized with seed 53800
..
Finished in 0.00406 seconds (files took 0.81981 seconds to load)
2 examples, 0 failures
Randomized with seed 53800 cc @ivgiuliani |
upload_data = String.new("some string", encoding: encoding) | ||
bucket_store.upload!(upload_data) | ||
download = bucket_store.download | ||
download_data = download.fetch(:content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With let(:upload_data)
string encoding bleeds out into the next test and fails 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to the in memory adapter are fine, but we want to ensure that there's consistent behaviour between the different adapters. After this change the in-memory adapter will store the encoding but other adapters won't, which means that depending which adapter we're using, for the same input we'll get a different output. You can test this by moving the spec in the shared example group and test it against the other adapters (you can run integration tests with rspec --tag integrations
, there's instructions in the README).
Practically it means we need to do something equivalent to what we have here for all the other adapters (gcs, s3 and disk). I had a brief look at GCS and they allow specifying an encoding on upload, but we need to look into the rest of the adapters...
context "inmemory storage" do | ||
let(:bucket_store) { described_class.for("#{base_bucket_uri}/#{temp_filename}") } | ||
let(:base_bucket_uri) { "inmemory://bucket" } | ||
let(:temp_filename) { "temp-file" } | ||
|
||
shared_examples "encoding persistance" do |encoding| | ||
it "persists encoding between upload/download for #{encoding}", :aggregate_failures do | ||
upload_data = String.new("some string", encoding: encoding) | ||
bucket_store.upload!(upload_data) | ||
download = bucket_store.download | ||
download_data = download.fetch(:content) | ||
|
||
expect(upload_data.encoding).to eq(encoding) | ||
expect(download_data.encoding).to eq(encoding) | ||
expect(upload_data).to eq(download_data) | ||
expect(upload_data.object_id).to_not eq(download_data.object_id) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be part of the shared examples and not specific of the in-memory adapter as otherwise we can't guarantee that the adapters behave in the same way.
#create_file, content_encoding ? We could add something like BucketStore.for("inmemory://bucket/path/x", encoding: Encoding::ISO_8859_1).download
# or
BucketStore.for("inmemory://bucket/path/y").download(encoding: Encoding::ISO_8859_1) Because right now we can influence download encoding with buffer = StringIO.new do |io|
io.set_encoding(Encoding::ISO_8859_1)
BucketStore.for("inmemory://bucket/path/z").stream.download(file: io)
end But then again, we can use this example if we need other encodings Fixed issues from the implementation side 👍 |