We are running cljdoc on the cheap. As such, it is memory-constrained.
I've just added s3 backup/restore functionality for our SQLite db using aws-api and noticed an OutOfMemory error when using :GetObject and :PutObject.
Reproduction
To make this easy for anyone to run, I'll reproduce with MinIO Object Store via docker.
(we are using Exoscale, but that is not relevant).
To launch a local MinIO server:
docker run -p 9000:9000 -p 9001:9001 --name minio \
-e "MINIO_ROOT_USER=foouser" \
-e "MINIO_ROOT_PASSWORD=foosecret" \
minio/minio server /data --console-address ":9001"
I whipped up a little script to demonstrate the issue (to be used with deps.edn above).
objheap.clj
(ns objheap
  (:require [clojure.java.io :as io]
            [cognitect.aws.client.api :as aws]
            [cognitect.aws.credentials :as awscreds])
  (:import (java.io RandomAccessFile)))

(defn create-test-file [file-path size-in-mb]
  (let [size-in-bytes (* size-in-mb 1024 1024)]
    (with-open [f (RandomAccessFile. file-path "rw")]
      (.setLength f size-in-bytes))))

(defn -main [& args]
  (println (format "max heap %dmb" (/ (.maxMemory (Runtime/getRuntime)) 1024 1024)))
  (let [opts (apply hash-map args)
        file-mb (parse-long (get opts "file-mb" "512"))
        bucket "foobucket"
        op (get opts "op" "put") ;; put or get
        s3 (aws/client {:api :s3
                        ;; need a valid aws region (even though we are not using aws) to overcome bug
                        ;; https://github.com/cognitect-labs/aws-api/issues/150
                        :region "us-east-2"
                        :credentials-provider (awscreds/basic-credentials-provider
                                               {:access-key-id "foouser"
                                                :secret-access-key "foosecret"})
                        :endpoint-override {:protocol :http
                                            :hostname "127.0.0.1"
                                            :port 9000}})]
    (aws/invoke s3
                {:op :CreateBucket
                 :request {:Bucket bucket}})
    (case op
      "put" (do (println (format "put %dmb file" file-mb))
                (create-test-file "bigfile" file-mb)
                (with-open [input-stream (io/input-stream "bigfile")]
                  (aws/invoke s3
                              {:op :PutObject
                               :request {:Bucket bucket
                                         :Key "bigfile"
                                         :Body input-stream}})))
      "get" (do
              (println "get file")
              (let [dest-file (io/file "bigfile.down")]
                (.delete dest-file)
                (-> (aws/invoke s3 {:op :GetObject
                                    :request {:Bucket bucket
                                              :Key "bigfile"}})
                    :Body
                    (io/copy dest-file))
                (println (format "Downloaded file: %.2fmb" (/ (.length dest-file) 1024 1024.0))))))))

(apply -main *command-line-args*)
Sanity runs
(Your max heap will differ)
Let's put a 1mb file:
$ clojure -M objheap.clj op put file-mb 1
max heap 8012mb
put 1mb file
And fetch it:
$ clojure -M objheap.clj op get
max heap 8012mb
get file
Downloaded file: 1.00mb
Ok now let's try to put a 1gb file:
$ clojure -M objheap.clj op put file-mb 1024
max heap 8012mb
put 1024mb file
And fetch it:
$ clojure -M objheap.clj op get
max heap 8012mb
get file
Downloaded file: 1024.00mb
All sane, all good.
Failing runs
Let's start by putting that 1gb file unconstrained (just in case you didn't execute the sanity runs):
$ clojure -M objheap.clj op put file-mb 1024
max heap 8012mb
put 1024mb file
And now let's try fetching the 1gb object constrained to 800mb:
$ clojure -J-Xmx800m -M --report stderr objheap.clj op get
max heap 800mb
get file
2024-09-19 11:26:30.696:INFO:oejc.ResponseNotifier:qtp1060161999-30: Exception while notifying listener org.eclipse.jetty.client.HttpRequest$10@46c28d6e
java.lang.OutOfMemoryError: Java heap space
at clojure.lang.Numbers.byte_array(Numbers.java:1425)
at cognitect.http_client$empty_bbuf.invokeStatic(http_client.clj:49)
at cognitect.http_client$empty_bbuf.invoke(http_client.clj:46)
at cognitect.http_client$on_headers.invokeStatic(http_client.clj:145)
at cognitect.http_client$on_headers.invoke(http_client.clj:131)
at clojure.lang.Atom.swap(Atom.java:51)
at clojure.core$swap_BANG_.invokeStatic(core.clj:2370)
at clojure.core$swap_BANG_.invoke(core.clj:2362)
at cognitect.http_client.Client$fn$reify__12664.onHeaders(http_client.clj:254)
at org.eclipse.jetty.client.HttpRequest$10.onHeaders(HttpRequest.java:530)
at org.eclipse.jetty.client.ResponseNotifier.notifyHeaders(ResponseNotifier.java:100)
at org.eclipse.jetty.client.ResponseNotifier.notifyHeaders(ResponseNotifier.java:92)
at org.eclipse.jetty.client.HttpReceiver.responseHeaders(HttpReceiver.java:296)
at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.headerComplete(HttpReceiverOverHTTP.java:319)
at org.eclipse.jetty.http.HttpParser.parseFields(HttpParser.java:1247)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1529)
at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.parse(HttpReceiverOverHTTP.java:208)
at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.process(HttpReceiverOverHTTP.java:148)
at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.receive(HttpReceiverOverHTTP.java:80)
at org.eclipse.jetty.client.http.HttpChannelOverHTTP.receive(HttpChannelOverHTTP.java:131)
at org.eclipse.jetty.client.http.HttpConnectionOverHTTP.onFillable(HttpConnectionOverHTTP.java:172)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:137)
at org.eclipse.jetty.io.ManagedSelector$$Lambda/0x000077f4f7a4f448.run(Unknown Source)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
at java.base/java.lang.Thread.runWith(Thread.java:1588)
{:clojure.main/message
"Execution error (IllegalArgumentException) at objheap/-main (objheap.clj:49).\nNo method in multimethod 'do-copy' for dispatch value: [nil java.io.File]\n",
:clojure.main/triage
{:clojure.error/class java.lang.IllegalArgumentException,
:clojure.error/line 49,
:clojure.error/cause
"No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]",
:clojure.error/symbol objheap/-main,
:clojure.error/source "objheap.clj",
:clojure.error/phase :execution},
:clojure.main/trace
{:via
[{:type clojure.lang.Compiler$CompilerException,
:message
"Syntax error macroexpanding at (/home/lee/proj/oss/-verify/aws-api-objects-on-heap/objheap.clj:52:1).",
:data
{:clojure.error/phase :execution,
:clojure.error/line 52,
:clojure.error/column 1,
:clojure.error/source
"/home/lee/proj/oss/-verify/aws-api-objects-on-heap/objheap.clj"},
:at [clojure.lang.Compiler load "Compiler.java" 8177]}
{:type java.lang.IllegalArgumentException,
:message
"No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]",
:at [clojure.lang.MultiFn getFn "MultiFn.java" 156]}],
:trace
[[clojure.lang.MultiFn getFn "MultiFn.java" 156]
[clojure.lang.MultiFn invoke "MultiFn.java" 238]
[clojure.java.io$copy invokeStatic "io.clj" 409]
[clojure.java.io$copy doInvoke "io.clj" 394]
[clojure.lang.RestFn invoke "RestFn.java" 428]
[objheap$_main invokeStatic "objheap.clj" 49]
[objheap$_main doInvoke "objheap.clj" 12]
[clojure.lang.RestFn applyTo "RestFn.java" 140]
[clojure.core$apply invokeStatic "core.clj" 667]
[clojure.core$apply invoke "core.clj" 662]
[objheap$eval12575 invokeStatic "objheap.clj" 52]
[objheap$eval12575 invoke "objheap.clj" 52]
[clojure.lang.Compiler eval "Compiler.java" 7700]
[clojure.lang.Compiler load "Compiler.java" 8165]
[clojure.lang.Compiler loadFile "Compiler.java" 8103]
[clojure.main$load_script invokeStatic "main.clj" 476]
[clojure.main$script_opt invokeStatic "main.clj" 536]
[clojure.main$script_opt invoke "main.clj" 531]
[clojure.main$main invokeStatic "main.clj" 665]
[clojure.main$main doInvoke "main.clj" 617]
[clojure.lang.RestFn applyTo "RestFn.java" 140]
[clojure.lang.Var applyTo "Var.java" 707]
[clojure.main main "main.java" 40]],
:cause
"No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]",
:phase :execution}}
Execution error (IllegalArgumentException) at objheap/-main (objheap.clj:49).
No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]
The 2nd error is caused by the first (OutOfMemory) error.
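To make that link visible in the output, the script could check the response before copying. aws-api reports failures as a map containing a `:cognitect.anomalies/category` key instead of throwing, so `:Body` comes back `nil` and `io/copy` fails with the confusing `do-copy` message. A minimal sketch, assuming the out-of-memory failure surfaces as such an anomaly map; `body-or-throw` is a hypothetical helper, not part of aws-api:

```clojure
;; Hypothetical helper (not part of aws-api): fail early with a clear
;; message instead of passing a nil :Body on to io/copy.
(defn body-or-throw
  "Returns the :Body of an aws-api response map. Throws ex-info when the
  response is an anomaly map or carries no :Body."
  [response]
  (cond
    ;; aws-api signals errors by returning a map with this key
    (:cognitect.anomalies/category response)
    (throw (ex-info "S3 request failed" {:response response}))

    (nil? (:Body response))
    (throw (ex-info "S3 response has no :Body" {:response response}))

    :else
    (:Body response)))
```

In the script above, this would replace the bare `:Body` step in the `->` form, so the real cause is reported rather than the `do-copy` dispatch error.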
And now let's try putting a 1gb object constrained to 800mb:
Observation
The http client seems to be loading the entire object into memory for GetObject and PutObject.
When the object is big, and memory is limited, this can cause OutOfMemory errors.
Is there some way I can work around this?
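One possible workaround for the GetObject side, not confirmed anywhere in this thread, is to fetch the object in pieces. S3's GetObject accepts a `Range` parameter, so each response only needs to hold one chunk on the heap. The sketch below is untested against aws-api under a constrained heap; `byte-ranges` and `download-in-chunks` are hypothetical names, and the chunk fetch is passed in as a function so the chunking logic stands on its own:

```clojure
(require '[clojure.java.io :as io])

(defn byte-ranges
  "Returns a seq of HTTP Range header values that together cover
  total-bytes, in pieces of at most chunk-bytes."
  [total-bytes chunk-bytes]
  (for [start (range 0 total-bytes chunk-bytes)
        :let [end (dec (min total-bytes (+ start chunk-bytes)))]]
    (format "bytes=%d-%d" start end)))

(defn download-in-chunks
  "Writes an object to dest-file one range at a time. fetch-chunk is a
  function from a Range header value to an InputStream for that range."
  [fetch-chunk total-bytes chunk-bytes dest-file]
  (with-open [out (io/output-stream dest-file)]
    (doseq [r (byte-ranges total-bytes chunk-bytes)]
      (with-open [in (fetch-chunk r)]
        (io/copy in out)))))

;; Assumed wiring with the s3 client and bucket from objheap.clj
;; (total-bytes would come from :ContentLength of a :HeadObject call):
;;
;; (download-in-chunks
;;   (fn [r] (:Body (aws/invoke s3 {:op :GetObject
;;                                  :request {:Bucket bucket
;;                                            :Key "bigfile"
;;                                            :Range r}})))
;;   total-bytes (* 8 1024 1024) (io/file "bigfile.down"))
```

This trades one request for many, and it does nothing for PutObject, which would need S3 multipart upload or a different client.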
lread added a commit to cljdoc/cljdoc that referenced this issue on Sep 22, 2024
Bit of a bummer, but aws-api was copying the entire object to heap, and we
don't have enough heap for that. Our database backup is close to 1gb.
Abstracted s3 to its own namespace and protocol to make swapping aws-api back
in easier in the future. Maybe they'll fix the issue, or maybe I was just using it wrong.
See: cognitect-labs/aws-api#257
Thank you for Cognitect's aws-api! I'm just dipping my toe in, so my apologies if I'm making a newbie mistake or if this issue is already well-known.