Corrupt filename in Upload component when server's file encoding is not UTF-8 #20417

Open
mas4ivv opened this issue Nov 6, 2024 · 4 comments
Labels: BFP (Bugfix priority, also known as Warranty), bug, Impact: Low, Severity: Major

mas4ivv commented Nov 6, 2024

Description of the bug

When using the Upload component with a filename that contains non-ASCII characters, the server receives a corrupt filename if its file encoding is not UTF-8.
The filename is part of the upload request and seems to be UTF-8 encoded:
[Screenshot of the upload request]

The filename is then read via Apache Commons' ServletFileUpload, created by StreamReceiverHandler.getItemIterator(VaadinRequest). As there is no encoding set on creation, it falls back to the server's default.
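The effect described above can be reproduced without Vaadin or a servlet container. The following is a minimal sketch, assuming the browser sends the filename as UTF-8 bytes and the server decodes them with a single-byte charset; the class name, method name, and sample filename are made up for illustration, and ISO-8859-1 stands in for whatever the server's default happens to be.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FilenameMojibakeDemo {

    // Encodes the name as UTF-8 (assumed to be what goes over the wire) and
    // decodes the bytes with the given charset, mimicking a server that
    // falls back to its own default encoding.
    public static String decodeAs(String original, Charset serverCharset) {
        byte[] wireBytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(wireBytes, serverCharset);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String sent = "M\u00fcller.txt";

        // Matching charset: the name survives unchanged.
        System.out.println(sent.equals(decodeAs(sent, StandardCharsets.UTF_8)));

        // Mismatched charset: the two UTF-8 bytes of U+00FC become two
        // separate characters, U+00C3 and U+00BC.
        System.out.println(sent.equals(decodeAs(sent, StandardCharsets.ISO_8859_1)));
    }
}
```

The program prints booleans rather than the names themselves because a console with a non-UTF-8 code page could garble the output a second time and hide the actual result.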

Result of running the example below from the Windows command line:
[Screenshot of the result]

Expected behavior

It seems to me that the client is always sending the filename UTF-8 encoded. If this is correct, the ServletFileUpload created in StreamReceiverHandler should be configured to use UTF-8 by calling setHeaderEncoding("UTF-8").
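To illustrate why an explicit header encoding would matter, here is a stand-alone sketch of the fallback behaviour. It does not use Commons FileUpload or Flow; `decodeHeader` and `extractFilename` are hypothetical helpers, and the assumption (to be verified against the library source) is that part headers are decoded with the configured header encoding when one is set and with the platform default otherwise.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeaderDecodingSketch {

    private static final Pattern FILENAME =
            Pattern.compile("filename=\"([^\"]*)\"");

    // headerEncoding == null models "no encoding configured": the platform
    // default charset is used, which is the situation described in the issue.
    public static String decodeHeader(byte[] rawHeader, Charset headerEncoding) {
        Charset effective = headerEncoding != null
                ? headerEncoding
                : Charset.defaultCharset();
        return new String(rawHeader, effective);
    }

    // Pulls the filename parameter out of a Content-Disposition header line.
    public static String extractFilename(String header) {
        Matcher m = FILENAME.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String header = "Content-Disposition: form-data; name=\"file\"; "
                + "filename=\"M\u00fcller.txt\"";
        byte[] raw = header.getBytes(StandardCharsets.UTF_8);

        // Explicit UTF-8: the result is the same on every platform.
        String explicit = extractFilename(decodeHeader(raw, StandardCharsets.UTF_8));
        // No encoding set: the result depends on the JVM's default charset.
        String fallback = extractFilename(decodeHeader(raw, null));

        System.out.println("explicit matches: " + "M\u00fcller.txt".equals(explicit));
        System.out.println("fallback matches: " + "M\u00fcller.txt".equals(fallback));
    }
}
```

On a JVM whose default charset is UTF-8 both lines report `true`, which would be consistent with the problem not showing up on such systems.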

Minimal reproducible example

import java.io.ByteArrayOutputStream;

import com.vaadin.flow.component.orderedlayout.VerticalLayout;
import com.vaadin.flow.component.textfield.TextField;
import com.vaadin.flow.component.upload.Upload;
import com.vaadin.flow.router.Route;

@Route("upload")
public class UploadPage extends VerticalLayout {

    public UploadPage() {
        Upload upload = new Upload();
        TextField filename = new TextField();
        upload.setReceiver((name, type) -> {
            filename.setValue(name);
            return new ByteArrayOutputStream();
        });
        this.add(upload, filename);
    }
}

Versions

  • Vaadin / Flow version: 23.4.1
  • Java version: 11
  • OS version:
  • Browser version (if applicable):
  • Application Server (if applicable):
  • IDE (if applicable):
@mcollovati mcollovati transferred this issue from vaadin/flow Nov 6, 2024
@mcollovati mcollovati transferred this issue from vaadin/flow-components Nov 6, 2024
@tepi tepi added bug BFP Bugfix priority, also known as Warranty labels Nov 7, 2024
@tepi tepi moved this to 🔖 High Priority (P1) in Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 7, 2024
@tepi tepi moved this to 🟢Ready to Go in Vaadin Flow ongoing work (Vaadin 10+) Nov 7, 2024
@tepi tepi self-assigned this Nov 8, 2024
@tepi tepi moved this from 🔖 High Priority (P1) to 🏗 WIP in Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 8, 2024
@tepi tepi moved this from 🟢Ready to Go to ⚒️ In progress in Vaadin Flow ongoing work (Vaadin 10+) Nov 8, 2024
@tepi tepi removed their assignment Nov 11, 2024
Contributor

tepi commented Nov 11, 2024

@mas4ivv I was unfortunately unable to reproduce this on macOS (it uses UTF-8 by default, but even changing the server JVM encoding to something else did not break the special characters). Are you by any chance on Windows?
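When trying to reproduce with a different server encoding, it may help to confirm which charset the JVM actually ended up with. A small sketch using only the standard library follows; the class name is made up. Note that `file.encoding` has to be passed at JVM startup (for example `-Dfile.encoding=ISO-8859-1`), since changing the system property at runtime does not change the default charset, and that from Java 18 onwards the default is UTF-8 regardless of the operating system locale unless overridden.

```java
import java.nio.charset.Charset;

public class EncodingReport {

    // Returns the charset the JVM uses when no encoding is specified,
    // together with the raw system property, for logging at server startup.
    public static String describe() {
        return "defaultCharset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```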

@tepi tepi moved this from 🏗 WIP to 🔖 High Priority (P1) in Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 11, 2024
@tepi tepi moved this from ⚒️ In progress to 🟢Ready to Go in Vaadin Flow ongoing work (Vaadin 10+) Nov 11, 2024
Author

mas4ivv commented Nov 11, 2024

Yes, the example was run on Jetty on Windows. Our production server is running on Linux and has the same problem.

Contributor

tepi commented Nov 11, 2024

Alright, in that case it should be reproducible on macOS as well, since I think most Linux distributions default to UTF-8 anyway. I'll retry.

@tltv tltv self-assigned this Nov 12, 2024
Member

tltv commented Nov 14, 2024

@mas4ivv Could you please answer the following questions to help us reproduce the issue and find a good fix for it? If possible, attaching a full example application would also answer some of them:

  • Which browser do you use? Or is it the same with every browser you try?
  • Which Jetty version do you use?
  • Is there any customization in the app, in the Jetty configuration or in request handlers/filters, that could change the request's character encoding (with ServletRequest.setCharacterEncoding(String))?

The example code generates a request with content type multipart/form-data and one part in the request body. This looks correct in the screenshot. With parts in the body, the StreamReceiverHandler.getItemIterator(VaadinRequest) method should not be called at all. There's a hasParts method that decides how to process the body, and the two paths use different APIs. If possible, could you put a breakpoint in StreamReceiverHandler.getItemIterator(VaadinRequest) and confirm whether it's actually called? I can't see how it would be with such a request. But if it is, then that seems to be a bug.

Updating Flow to always call ServletFileUpload.setHeaderEncoding("UTF-8") would probably not help with parts in the body. But forcing UTF-8 in another place could potentially fix the issue.
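Independently of which code path handles the request, dumping the code points of the name that reaches the receiver can help narrow down where the decoding goes wrong. The helper below is a hypothetical diagnostic, not part of Flow; it could be called on the `name` argument inside the receiver lambda of the example above and its result written to the log.

```java
import java.util.stream.Collectors;

public class CodePointDump {

    // Renders each code point of the string as U+XXXX, separated by spaces,
    // so the result stays readable even if the console encoding is wrong.
    public static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Correctly decoded name: U+004D U+00FC U+006C U+006C U+0065 U+0072
        System.out.println(dump("M\u00fcller"));
        // UTF-8 bytes decoded as ISO-8859-1:
        // U+004D U+00C3 U+00BC U+006C U+006C U+0065 U+0072
        System.out.println(dump("M\u00c3\u00bcller"));
    }
}
```

A pair such as U+00C3 U+00BC where a single U+00FC was expected suggests UTF-8 bytes decoded with a single-byte charset, while U+FFFD or U+003F would point to a lossy conversion somewhere else in the chain.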

Projects
Status: 🔖 High Priority (P1)
Status: ⚒️ In progress
Development

No branches or pull requests

4 participants