Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VFS-676 : change dependency from java.util.zip to Commons Compress #91

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

PeterAlfredLee
Copy link
Member

This PR is to change the currently using zip dependency from java.util.zip to Commons Compress. Including these changes :

  1. The JarFIleObject no longer extends from ZipFileObject, and the JarFileSystem no longer extends from ZipFileSystem. The jar module in Commons Compress could not achieve fit what we need in vfs, cause it did not implement some methods like getCertificates that are need in vfs. So I have decoupled JarFIleObject/JarFileSystem from ZipFileObject/ZipFileSystem, and moved the methods needed from ZipFileObject/ZipFileSystem to JarFIleObject/JarFileSystem. This change makes jar and zip two separate modules.
  2. The ZipFile in Commons Compress open zip archivers using Files.newByteChannel, while the ZipFile in 'java.util.zip' is using a native method. This means the zip archivers can be safely deleted after reading by ZipFIle - and some tests are failing now and I have ignored them, say testCannotDeleteWhileStreaming and testCannotDeleteWhileStreaming2.

@garydgregory
Copy link
Member

@PeterAlfredLee,
What problem are you trying to solve here?

As it stands, this PR breaks binary compatibility, so would be a -1 for 2.x. We should only break BC in a new major version.

@PeterAlfredLee
Copy link
Member Author

What problem are you trying to solve here?

As the issue VFS-676 said, I'm trying to use Commons Compress instead of java.util.zip for ZipFileObject and ZipFileSystem.

As it stands, this PR breaks binary compatibility, so would be a -1 for 2.x.

This PR doesn't break the existing APIs - as you can see, no existing tests need to be modified (but only to ignore 2 test cases).
I decoupled JarFileObject from ZipFileObject cause Commons Compress does not support Jar that well as java.util.zip JDK do. Seems we have no other options if we deside to use Commons Compress.

@garydgregory
Copy link
Member

@PeterAlfredLee
I now understand why you want this PR.

WRT to BC please see https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html
and run mvn clean install site -P japicmp and look at the resulting report. You can add -DskipTests to skip running tests (but still compile them.) This will fail with details as to what is broken.

As a reference, the current report is here: https://commons.apache.org/proper/commons-compress/japicmp.html

@PeterAlfredLee
Copy link
Member Author

Thank you for your explaining about BC, and I understand it now. :)

The report is here : japicmp.zip

As you can see, only some protected methods and some super classes are modified. Not sure if this is fine, but it seems I have to do this if we want to use Commons Compress instead of java.util.zip.

@XenoAmess
Copy link
Contributor

IMO seems the biggest bc-breaking in this pr is changing JarFileObject's super class from ZipFileObject to AbstractFileObject, thus codes like this will break:

ZipFileObject getZipFileObject(...){
...
    return someJarFileObjectOrZipFileObject;
}
void anotherFunction (){
    ZipFileObject receive = getJarFileObject();
}

@PeterAlfredLee
Copy link
Member Author

IMO seems the biggest bc-breaking in this pr is changing JarFileObject's super class from ZipFileObject to AbstractFileObject, thus codes like this will break

Agreed. And as I said - seems we got no other options if we want to use Commons Compress.

@ecki
Copy link
Contributor

ecki commented May 27, 2020

We can add a different provider, then each developer can configure the provider for the jar schema as needed.

@garydgregory
Copy link
Member

We can add a different provider, then each developer can configure the provider for the jar schema as needed.

Before someone goes that route, it seems fair to try to update to Commons Compress without breaking BC.

@garydgregory
Copy link
Member

Darn, I did not mean to close the PR.

@garydgregory garydgregory reopened this May 27, 2020
@garydgregory
Copy link
Member

Maybe what we need is a Commons Compress provider..., then all that Commons Compress can do we can surface from a single provider. Thoughts?

@XenoAmess
Copy link
Contributor

XenoAmess commented May 27, 2020

Maybe what we need is a Commons Compress provider..., then all that Commons Compress can do we can surface from a single provider. Thoughts?

Good idea, that reminds me the good old days when I be at ms...forking a provider for odatav4 from what for odatav2.
That sounds both neat and convenient.
But in that way we got a samilar problem I asked several hours ago in dev-mail:
where should we put this provider?
in commons-vfs? or in commons-compress?
or create a new repo for such classes?

maybe we shall put it as a submodule like vfs-jackrabbit?

@ecki
Copy link
Contributor

ecki commented May 27, 2020

gary, that was my idea so yes. since we already have multiple providers for http thats fine for compression too.

It might not wortk for the util.VFSClassLoader util class as that is not related to an classloader, but this one could be changed to introspect the Jar file for signature checking (i think it had some dependency in there but i can be wrong)

@ecki
Copy link
Contributor

ecki commented May 27, 2020

Commons-compress is already a dependency, so it would not hurt to put the alternative provider in core (in this case)

@PeterAlfredLee
Copy link
Member Author

Maybe what we need is a Commons Compress provider..., then all that Commons Compress can do we can surface from a single provider. Thoughts?

Indeed it's a good idea. Thank you guys for your suggestions! Will update in this PR in the following days ... hoping I could finish this whthin this week.

where should we put this provider?

I think it should belong VFS.

@PeterAlfredLee
Copy link
Member Author

And about the naming ... do you get some ideas?

since we already have multiple providers for http

The existing multiple http providers are http, http4, http5 which are corresponding to three different http versions. Then what should we have the name of the new provider? CommonsCompressZipFileProvider, CCZipFileProvider, or having the same name as ZipFileProvider?
I prefer the CommonsCompressZipFileProvider but it may be a little confusing. WDYT?

@bvitaliyg
Copy link

CommonsCompressZipFileProvider or CCZipFileProvider look good for me.

Btw, that's the reason behind read-only zip support? It that just because of zip file provider implementation limitations or there's something else?

@PeterAlfredLee
Copy link
Member Author

Just pushed the new provider. The scheme's name is commonscompresszip and the new provider's name is CommonsCompressZipFileProvider. Not sure if this is a good name or not. Ideas?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants