Folder exclusion issue #35

Closed
Erol-2022 opened this issue Oct 19, 2022 · 9 comments

Comments

@Erol-2022

Erol-2022 commented Oct 19, 2022

Hello,

Is it possible to restrict the recursion depth of the -not option while creating archives with zpaqfranz?

Let's consider this batch file creating a simple folder hierarchy, sample.bat :

md C:\Documents\Department1\Fred\Archive

md C:\Documents\Department1\Fred\Docs

md C:\Documents\Department1\Fred\test1\Archive

md C:\Documents\Department2\George\Docs

md C:\Documents\Department2\George\Archive

md C:\Documents\Department2\George\folder\sample\Archive

echo test1 > C:\Documents\Department1\Fred\Archive\file1.txt

echo test2 > C:\Documents\Department2\George\Archive\file2.txt

echo test3 > C:\Documents\Department1\Fred\file3.txt

echo test4 > C:\Documents\Department2\George\file4.txt

echo test5 > C:\Documents\Department2\George\folder\sample\Archive\file5.txt

echo test6 > C:\Documents\Department2\George\Docs\test6.txt

echo test7 > C:\Documents\Department1\Fred\test7.txt

echo test8 > C:\Documents\Department2\George\test8.txt

echo test9 > C:\Documents\Department1\Fred\test1\Archive\test9.txt

Creating the backup :

zpaqfranz.exe a backup c:\Documents -not C:\Documents\*\*\Archive

zpaqfranz v55.16h-experimental-JIT-L (HW BLAKE3), SFX64 v55.1, (03 Oct 2022)
Creating backup.zpaq at offset 0 + 0
Adding 40 (40.00 B) in 5 files (10 dirs), 4 threads @ 2022-10-19 23:22:28
15 +added, 0 -removed.

0 + (40 -> 40 -> 1.561) = 1.561 @ 512.00 B/s

0.094 seconds (00:00:00)  (all OK)

In this example, my expectation was to exclude only the folders below :

C:\Documents\Department1\Fred\Archive
C:\Documents\Department2\George\Archive

zpaqfranz is also excluding those folders :

C:\Documents\Department1\Fred\test1\Archive
C:\Documents\Department2\George\folder\sample\Archive

zpaqfranz is excluding all the folders named Archive. I am also experiencing the same problem with Matt Mahoney's zpaq archiver.

I guess I didn't construct the parameters properly to exclude only the necessary folders:

zpaqfranz.exe a backup c:\Documents -not C:\Documents\*\*\Archive

Could you please point me in the right direction? Thanks.
sample.txt
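
The result reported above is what one would expect if `*` is allowed to match across path separators, so that the second `*` can absorb `Fred\test1` as well as `Fred`. The C++ sketch below illustrates that difference under stated assumptions: `prefix_match` and its `star_crosses_sep` flag are hypothetical names made up for this illustration and are not taken from the zpaqfranz or zpaq sources. Unlike Windows path handling, the sketch is case-sensitive.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, NOT zpaqfranz code.
// Returns true when `pattern` matches a prefix of `path` that ends on a
// component boundary (end of string or a backslash).
// '*' matches any run of characters; when star_crosses_sep is false the
// run may not contain a backslash.
bool prefix_match(std::string_view pattern, std::string_view path,
                  bool star_crosses_sep) {
    if (pattern.empty())
        return path.empty() || path.front() == '\\';
    if (pattern.front() == '*') {
        // Try every possible length for the run matched by '*'.
        for (std::size_t n = 0;; ++n) {
            if (prefix_match(pattern.substr(1), path.substr(n), star_crosses_sep))
                return true;
            if (n == path.size()) return false;
            if (!star_crosses_sep && path[n] == '\\') return false;
        }
    }
    if (path.empty() || pattern.front() != path.front()) return false;
    return prefix_match(pattern.substr(1), path.substr(1), star_crosses_sep);
}

// Checks that mirror the folders from the question.
void prefix_match_examples() {
    const std::string_view pattern = R"(C:\Documents\*\*\Archive)";
    const std::string_view shallow =
        R"(C:\Documents\Department1\Fred\Archive\file1.txt)";
    const std::string_view deep =
        R"(C:\Documents\Department1\Fred\test1\Archive\test9.txt)";
    assert(prefix_match(pattern, shallow, true));
    assert(prefix_match(pattern, deep, true));    // second '*' absorbs "Fred\test1"
    assert(prefix_match(pattern, shallow, false));
    assert(!prefix_match(pattern, deep, false));  // '*' stays within one component
}
```

With `star_crosses_sep` set to `false`, only the `Archive` folders exactly three levels below `C:\Documents` match, which is the behaviour the question asks for.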

@fcorbelli
Owner

Thanks for the question.
I will answer tomorrow (I am currently working on zpaqfranz over socket).

@fcorbelli
Owner

You can use multiple -not

zpaqfranz a z:\1.zpaq c:\Documents -not C:\Documents\Department1\Fred\Archive -not C:\Documents\Department2\George\Archive

notfiles is a vector, not a string

vector<string> notfiles; // list of prefixes to exclude
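
As a rough illustration of how a list of exclusion prefixes could be checked against a candidate path, here is a minimal sketch. Only the `notfiles` name comes from the declaration above; `is_under` and `is_excluded` are hypothetical helpers, and the real zpaqfranz matching logic may well differ (for example, it also handles wildcards).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `path` equals `prefix` or lies below it.
// The comparison ignores ASCII case, as Windows paths usually do.
// Prefixes are expected without a trailing backslash.
bool is_under(const std::string& prefix, const std::string& path) {
    if (path.size() < prefix.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        const auto a = static_cast<unsigned char>(prefix[i]);
        const auto b = static_cast<unsigned char>(path[i]);
        if (std::tolower(a) != std::tolower(b)) return false;
    }
    // Require a component boundary so "Archive" does not match "Archive2".
    return path.size() == prefix.size() || path[prefix.size()] == '\\';
}

// Hypothetical helper: true if any entry of `notfiles` covers `path`.
bool is_excluded(const std::string& path,
                 const std::vector<std::string>& notfiles) {
    for (const auto& prefix : notfiles)
        if (is_under(prefix, path)) return true;
    return false;
}

// Checks that mirror the two -not arguments from the command above.
void is_excluded_examples() {
    const std::vector<std::string> notfiles = {
        R"(C:\Documents\Department1\Fred\Archive)",
        R"(C:\Documents\Department2\George\Archive)"};
    assert(is_excluded(R"(c:\documents\department1\fred\archive\file1.txt)", notfiles));
    assert(!is_excluded(R"(C:\Documents\Department1\Fred\test1\Archive\test9.txt)", notfiles));
}
```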

@Erol-2022
Author

Erol-2022 commented Oct 20, 2022

Hello,

Thanks, your configuration works fine, but what if I have a lot of user profiles in my folder C:\Documents? I would need to specify a lot of exclusions on the command line using multiple -not statements.

@fcorbelli
Owner

Write exactly what kind of exclusions you want and I will implement them.

@Erol-2022
Author

Hello,

Thanks again for your support.

I guess it would be interesting to implement the -maxdepth option provided by the UNIX/Linux find command:

find . -maxdepth 1 -type d ......

zpaqfranz.exe a backup c:\Documents -not C:\Documents\*\*\Archive -maxdepth 1
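
To make the proposal concrete, here is one possible reading of a depth limit, sketched in C++: the exclusion applies only to a directory name found at most `max_depth` levels below the backup root. Everything here is an assumption for illustration. No `-maxdepth` option is confirmed for zpaqfranz in this thread, the helper names are hypothetical, and the depth value 3 is simply the level of `Fred\Archive` counted from `C:\Documents`. The comparison is case-sensitive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a Windows path on backslashes,
// skipping empty components.
std::vector<std::string> split_path(const std::string& p) {
    std::vector<std::string> parts;
    std::string cur;
    for (char c : p) {
        if (c == '\\') {
            if (!cur.empty()) parts.push_back(cur);
            cur.clear();
        } else {
            cur += c;
        }
    }
    if (!cur.empty()) parts.push_back(cur);
    return parts;
}

// Hypothetical helper: true if `path` lies under `root` and contains a
// component called `dir_name` at most `max_depth` levels below `root`
// (direct children of `root` have depth 1). The sketch does not check
// whether the matching component is a directory or a file.
bool excluded_within_depth(const std::string& root, const std::string& path,
                           const std::string& dir_name, std::size_t max_depth) {
    const auto r = split_path(root);
    const auto p = split_path(path);
    if (p.size() < r.size() || !std::equal(r.begin(), r.end(), p.begin()))
        return false;  // path is not under root
    for (std::size_t i = r.size(); i < p.size(); ++i) {
        const std::size_t depth = i - r.size() + 1;
        if (depth > max_depth) break;
        if (p[i] == dir_name) return true;
    }
    return false;
}

// Checks that mirror the folders from the question.
void depth_limit_examples() {
    const std::string root = R"(C:\Documents)";
    assert(excluded_within_depth(
        root, R"(C:\Documents\Department1\Fred\Archive\file1.txt)", "Archive", 3));
    assert(!excluded_within_depth(
        root, R"(C:\Documents\Department1\Fred\test1\Archive\test9.txt)", "Archive", 3));
}
```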

@fcorbelli
Owner

After studying the situation, I'm not very keen on this.

The recursive function involved (ispath()) is used in many places in the source; in particular, it is used by add (i.e. to choose the files to compress).

Changing it would risk very serious side effects (= not all the data you want gets copied). A separate function would have to be written specifically for -not, but the source is already becoming a true example of lasagna code (aka spaghetti 2.0).

At the moment I don't think I can satisfy your request.

Maybe I'll think about it again after finishing the zpaqfranz-over-TCP part.

@fcorbelli
Owner

fcorbelli commented Oct 21, 2022

Can you write a regex here for what you need, using only the following?

// allowed metasymbols:
// ^  -> text must start with pattern (only allowed as first symbol in regular expression)
// $  -> text must end   with pattern (only allowed as last  symbol in regular expression)
// .  -> any literal
// x? -> literal x occurs zero or one time
// x* -> literal x is repeated arbitrarily
// x+ -> literal x is repeated at least once

In that case it may be doable (with a small-scale regex matcher).
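
For reference, a matcher for exactly this set of metasymbols can be quite small. The following is a sketch in the style of the well-known Kernighan and Pike matcher, not the code the author had in mind; the function names are hypothetical. It has no escape character, so a backslash in the pattern is an ordinary literal, which happens to suit Windows paths.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

bool re_match_here(std::string_view re, std::string_view text);

// Match `c` repeated at least `min_count` times, followed by `re`.
bool re_match_repeat(char c, std::size_t min_count, std::string_view re,
                     std::string_view text) {
    for (std::size_t n = 0;; ++n) {
        if (n >= min_count && re_match_here(re, text.substr(n))) return true;
        if (n >= text.size() || (c != '.' && text[n] != c)) return false;
    }
}

// Match `re` at the very start of `text`.
bool re_match_here(std::string_view re, std::string_view text) {
    if (re.empty()) return true;
    if (re == "$") return text.empty();
    if (re.size() >= 2) {
        const char c = re[0];
        const char op = re[1];
        const std::string_view rest = re.substr(2);
        if (op == '*') return re_match_repeat(c, 0, rest, text);
        if (op == '+') return re_match_repeat(c, 1, rest, text);
        if (op == '?') {
            if (!text.empty() && (c == '.' || c == text[0]) &&
                re_match_here(rest, text.substr(1)))
                return true;
            return re_match_here(rest, text);
        }
    }
    if (!text.empty() && (re[0] == '.' || re[0] == text[0]))
        return re_match_here(re.substr(1), text.substr(1));
    return false;
}

// Search for `re` anywhere in `text`; a leading '^' anchors it at the start.
bool re_search(std::string_view re, std::string_view text) {
    if (!re.empty() && re[0] == '^') return re_match_here(re.substr(1), text);
    for (std::size_t i = 0; i <= text.size(); ++i)
        if (re_match_here(re, text.substr(i))) return true;
    return false;
}

void re_search_examples() {
    assert(re_search("^ab*c$", "abbbc"));
    assert(!re_search("^ab+c$", "ac"));
    assert(re_search("colou?r", "my color"));
    // '.' also matches a backslash, so ".+" cannot pin the match to one level:
    const std::string_view re = R"(^C:\Documents\.+\.+\Archive$)";
    assert(re_search(re, R"(C:\Documents\Department1\Fred\Archive)"));
    assert(re_search(re, R"(C:\Documents\Department1\Fred\test1\Archive)"));
}
```

In this sketch `.` matches a backslash like any other character, so a pattern built from these metasymbols alone does not limit a match to a given folder depth; that would need something like a "not a backslash" character class, which is outside the list above.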

@Erol-2022
Author

Hello,

No worries if limiting the recursive search does not work out.

The regular expression:

zpaqfranz.exe a backup c:\Documents -not C:\Documents\*\*\^Archive$

Is the syntax above correct for you? Only the Archive folders in the 3rd recursion level should be excluded.

@fcorbelli
Owner

It is much harder, because a "full" regex (with groups) requires a LOT of code, just for expression parsing.
Limiting the recursion level is hard too, because you can have multiple *, each requiring its own recursion level:
-not C:\Documents\*2\*3\*1\^Archive$
=> recurse at most 2 times for the first, 3 for the second, 1 for the third (...), i.e. a different -maxdepth for each wildcard.

Otherwise it would essentially become a hardwired feature specific to your situation.

=> I do not think I will implement this anytime soon; I am currently working on an even harder topic, zpaqfranz-over-ssh.
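
The per-wildcard limit described above can be sketched by matching whole path components instead of characters. This is only an illustration under assumptions: the `*N` syntax is read here as "between 1 and N consecutive components" (a bare `*` meaning exactly one), the helper names are hypothetical, the `^Archive$` part is simplified to a plain `Archive` component, and nothing like this is confirmed to exist in zpaqfranz.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split on backslashes, skipping empty components.
std::vector<std::string> split_components(const std::string& p) {
    std::vector<std::string> parts;
    std::string cur;
    for (char c : p) {
        if (c == '\\') {
            if (!cur.empty()) parts.push_back(cur);
            cur.clear();
        } else {
            cur += c;
        }
    }
    if (!cur.empty()) parts.push_back(cur);
    return parts;
}

// "*" -> 1, "*3" -> 3, anything else (including "*0") -> 0, meaning literal.
std::size_t wildcard_limit(const std::string& token) {
    if (token.empty() || token[0] != '*') return 0;
    if (token.size() == 1) return 1;
    std::size_t n = 0;
    for (std::size_t i = 1; i < token.size(); ++i) {
        if (token[i] < '0' || token[i] > '9') return 0;
        n = n * 10 + static_cast<std::size_t>(token[i] - '0');
    }
    return n;
}

// True if pat[pi..] matches a prefix of path[si..], component by component.
bool match_from(const std::vector<std::string>& pat, std::size_t pi,
                const std::vector<std::string>& path, std::size_t si) {
    if (pi == pat.size()) return true;  // whole pattern consumed
    if (si == path.size()) return false;
    const std::size_t limit = wildcard_limit(pat[pi]);
    if (limit == 0)
        return pat[pi] == path[si] && match_from(pat, pi + 1, path, si + 1);
    // Let the wildcard consume 1..limit components.
    for (std::size_t k = 1; k <= limit && si + k <= path.size(); ++k)
        if (match_from(pat, pi + 1, path, si + k)) return true;
    return false;
}

bool excluded_by_pattern(const std::string& pattern, const std::string& path) {
    return match_from(split_components(pattern), 0, split_components(path), 0);
}

void excluded_by_pattern_examples() {
    const std::string deep =
        R"(C:\Documents\Department2\George\folder\sample\Archive\file5.txt)";
    assert(!excluded_by_pattern(R"(C:\Documents\*\*\Archive)", deep));
    assert(!excluded_by_pattern(R"(C:\Documents\*\*2\Archive)", deep));
    assert(excluded_by_pattern(R"(C:\Documents\*\*3\Archive)", deep));
}
```

Because a bare `*` never spans more than one component in this sketch, `C:\Documents\*\*\Archive` matches only the `Archive` folders exactly three levels below `C:\Documents`.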
