upload size limit #43
Comments
The part about "if you don't set the limit explicitly, it is unlimited" - do you know offhand if anyone has actually tested it? As in, try to run it without the limit set, and upload, say, a 5GB file?
@landreev when SWORD was new in DVN 3.x I remember trying a large file locally and it worked, but I can't remember exactly how many gigabytes. I want to say 4 or 5 GB. Obviously, this was a long time ago. I haven't done any recent testing.
Aside from the limit not being associated with a Dataverse collection, there seems to be a misunderstanding. The reported maximum upload size is in kilobytes, not bytes:
(See 6.1. Retrieving a Service Document) It's wrong in the code!
This means the maximum int value, read as kilobytes, allows for about 2 TB.
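To make the arithmetic concrete, here is a small sketch comparing the two readings of an `int` limit. The class and method names are made up for illustration, and it assumes 1 kilobyte = 1024 bytes (with 1000 bytes per kilobyte the result is slightly smaller, about 2.1 TB).

```java
// Illustrative only: UploadLimitUnits is a hypothetical name, not part of Dataverse or sword2-server.
final class UploadLimitUnits {

    // Assumption: the "kB" in the service document means 1024 bytes.
    static final long BYTES_PER_KILOBYTE = 1024L;

    /** Largest size in bytes if the int limit is interpreted as a byte count. */
    static long maxBytesWhenLimitIsBytes() {
        return Integer.MAX_VALUE;
    }

    /** Largest size in bytes if the int limit is interpreted as a kilobyte count. */
    static long maxBytesWhenLimitIsKilobytes() {
        // The long operand forces 64-bit multiplication, so nothing overflows here.
        return Integer.MAX_VALUE * BYTES_PER_KILOBYTE;
    }

    public static void main(String[] args) {
        System.out.println("as bytes:     " + maxBytesWhenLimitIsBytes());
        System.out.println("as kilobytes: " + maxBytesWhenLimitIsKilobytes());
    }
}
```

Read as bytes, the cap is just under 2 GiB; read as kilobytes, it is just under 2 TiB.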
Reporting the gist of our Slack tech discussion yesterday:
sequenceDiagram
actor up as ZIP Upload
participant servlet as dv.SWORDv2MediaResourceServlet
participant mapi as sw2s.MediaResourceApi
participant sae as sw2s.SwordApiEndpoint
participant mrmi as dv.MediaResourceManagerImpl
participant fu as dv.FileUtil
up ->> servlet: doPost()
servlet ->> mapi: post()
mapi ->> sae: addDepositPropertiesFromBinary()
sae ->> sae: storeAndCheckBinary()
note right of sae: Store ZIP upload as SWORD temp file
mapi ->> mrmi: addResource()
mrmi ->> mrmi: replaceOrAddFiles()
mrmi ->> mrmi: Identify target dataset
mrmi ->> fu: createDataFiles()
fu ->> fu: Retrieve target store size limit
fu ->> fu: Copy SWORD temp file to another temp file
fu ->> fu: Unzip
fu ->> fu: For each file check limit and store
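The last steps of the diagram (unzip, then check each file against the limit) could look roughly like the sketch below. This is not the `FileUtil.createDataFiles()` code; the class, the method names, and the convention that a negative limit means "unlimited" are assumptions for illustration. It only uses `java.util.zip` from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch of a per-entry size check after unzipping.
final class ZipEntryLimitSketch {

    /** Returns the names of entries whose uncompressed size exceeds limitBytes. */
    static List<String> entriesOverLimit(InputStream zipped, long limitBytes) {
        List<String> rejected = new ArrayList<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Count bytes while reading; entry.getSize() may be -1 when streaming.
                long size = 0;
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    size += read;
                }
                // Assumption: a negative limit means no limit is configured.
                if (limitBytes >= 0 && size > limitBytes) {
                    rejected.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rejected;
    }

    /** Builds an in-memory ZIP with zero-filled entries, for demonstration only. */
    static byte[] zipOf(String[] names, int[] sizes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (int i = 0; i < names.length; i++) {
                zip.putNextEntry(new ZipEntry(names[i]));
                zip.write(new byte[sizes[i]]);
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = zipOf(new String[] {"small.txt", "big.bin"}, new int[] {10, 100});
        System.out.println(entriesOverLimit(new ByteArrayInputStream(zip), 50));
    }
}
```

Note that in this flow the limit applies to each extracted file, not to the ZIP as a whole, which by then has already been stored as a temp file.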
@landreev @pdurbin is this still a thing for you? May I propose a rather simple change?

public interface SwordConfiguration {
...
- int getMaxUploadSize();
+ long getMaxUploadSize();
...
}

The comparison happens in bytes. Switching the return type from int (max 2,147,483,647) to long (max 9,223,372,036,854,775,807) would lift the ceiling far beyond 2 GB with only minimal code changes, as a simple workaround for now.
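A minimal sketch of why the return type matters, assuming the limit is compared against the upload size in bytes. `SizeLimitConfig` is a hypothetical stand-in for the proposed `SwordConfiguration` method, and treating a negative value as "unlimited" is an assumption here, not something confirmed in this thread.

```java
// Illustrative only: these names are not part of sword2-server or Dataverse.
final class UploadSizeCheck {

    /** Hypothetical stand-in for the proposed long-returning getMaxUploadSize(). */
    interface SizeLimitConfig {
        long getMaxUploadSize();
    }

    /** True if the upload fits under the limit. Assumption: negative means unlimited. */
    static boolean fits(long uploadBytes, SizeLimitConfig config) {
        long limit = config.getMaxUploadSize();
        return limit < 0 || uploadBytes <= limit;
    }

    /** Shows what happens when a byte count is forced into an int. */
    static int narrowToInt(long bytes) {
        return (int) bytes; // silently drops the high 32 bits
    }

    public static void main(String[] args) {
        long fiveGiB = 5L * 1024 * 1024 * 1024;
        long tenGiB = 10L * 1024 * 1024 * 1024;
        System.out.println("5 GiB squeezed into an int: " + narrowToInt(fiveGiB));
        System.out.println("5 GiB fits a 10 GiB long limit: " + fits(fiveGiB, () -> tenGiB));
    }
}
```

With an `int` return type, no limit above 2,147,483,647 bytes can be expressed at all, so any configured value larger than that has to be truncated or rejected somewhere before the comparison.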
@poikilotherm yes! I think this is exactly what we want. Obviously, we'll need to make the change on the Dataverse side as well. When we're ready, we should re-open this issue (or create a new one, no strong preference), and get it put into a sprint:
For now this is a placeholder issue. Today @landreev @scolapasta and I were talking about how, in practical terms, files are limited to 2 GB when uploading to Dataverse via SWORD because getMaxUploadSize returns an int.
Below is the code from https://github.com/IQSS/dataverse/blob/v5.9/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/SwordConfigurationImpl.java#L123-L146
To summarize: