From c42ea012aa947bd2801469342b88aa3295fb16aa Mon Sep 17 00:00:00 2001
From: Schneems
Date: Thu, 6 Jun 2024 17:04:16 -0500
Subject: [PATCH] Update tar-compress docs

- Call finish to finalize the archive [recommended by tar](https://docs.rs/tar/0.4.41/tar/struct.Builder.html#method.append_dir_all).
- Show an example of adding an archive **without** renaming it.
- Call out differences between `tar::Builder` defaults and `tar(1)` defaults

## Context

On seeing these docs I was unsure if there was another, better way to add all the contents of one directory into an archive. After researching, I see people use this function by calling it with an empty string. I feel this is a common operation and would like to see it spelled out in the documentation.

In addition, I hit an edge case where `tar(1)` produced significantly smaller files than the `tar` crate, reproduction: https://github.com/schneems/tar_comparison/blob/bfd420a012b46e80435cf4e7c67ca1661357fde3/README.md. It turns out that the problem was that the directory contained symlinks to other files in the same directory. The `follow_symlinks(true)` behavior is on by default and is the opposite of the `tar(1)` default. It caused the program to duplicate the same file multiple times, which significantly increased the archive size.

I believe someone looking for "How to Compress a directory into tarball" is likely looking to replicate `tar czf` behavior and would like to know the main differences.
---
 src/compression/tar/tar-compress.md | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/compression/tar/tar-compress.md b/src/compression/tar/tar-compress.md
index 25735a63..d2454343 100644
--- a/src/compression/tar/tar-compress.md
+++ b/src/compression/tar/tar-compress.md
@@ -6,7 +6,7 @@ Compress `/var/log` directory into `archive.tar.gz`.
 
 Creates a [`File`] wrapped in [`GzEncoder`] and [`tar::Builder`].
 Adds contents of `/var/log` directory recursively into the archive
-under `backup/logs`path with [`Builder::append_dir_all`].
+under `backup/logs` path with [`Builder::append_dir_all`].
 [`GzEncoder`] is responsible for transparently compressing the
 data prior to writing it into `archive.tar.gz`.
 
@@ -21,11 +21,35 @@ fn main() -> Result<(), std::io::Error> {
     let enc = GzEncoder::new(tar_gz, Compression::default());
     let mut tar = tar::Builder::new(enc);
     tar.append_dir_all("backup/logs", "/var/log")?;
+    tar.finish()?;
     Ok(())
 }
 ```
 
+To add the contents without renaming them, an empty string can be used as the first argument of [`Builder::append_dir_all`]:
+
+```rust,edition2018,no_run
+
+use std::fs::File;
+use flate2::Compression;
+use flate2::write::GzEncoder;
+
+fn main() -> Result<(), std::io::Error> {
+    let tar_gz = File::create("archive.tar.gz")?;
+    let enc = GzEncoder::new(tar_gz, Compression::default());
+    let mut tar = tar::Builder::new(enc);
+    tar.append_dir_all("", "/var/log")?;
+    tar.finish()?;
+    Ok(())
+}
+```
+
+The default behavior of [`tar::Builder`] differs from the GNU `tar` utility's defaults [tar(1)],
+notably [`tar::Builder::follow_symlinks(true)`] is the equivalent of `tar --dereference`.
+
+[tar(1)]: https://man7.org/linux/man-pages/man1/tar.1.html
 [`Builder::append_dir_all`]: https://docs.rs/tar/*/tar/struct.Builder.html#method.append_dir_all
 [`File`]: https://doc.rust-lang.org/std/fs/struct.File.html
 [`GzEncoder`]: https://docs.rs/flate2/*/flate2/write/struct.GzEncoder.html
 [`tar::Builder`]: https://docs.rs/tar/*/tar/struct.Builder.html
+[`tar::Builder::follow_symlinks(true)`]: https://docs.rs/tar/latest/tar/struct.Builder.html#method.follow_symlinks
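
The size difference described under Context comes down to whether a symlink is stored as a link or as a second copy of its target. The sketch below is not part of the patch and does not use the `tar` crate. It is a std-only illustration, built around a hypothetical helper `payload_bytes`, that counts the regular-file bytes directly inside one directory (non-recursively) with and without following symlinks. It assumes every link resolves to an existing file.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not from the `tar` crate): sums the sizes of the
/// regular files directly inside `dir`.
///
/// With `follow` set, each symlink is resolved and counted as the file it
/// points to, roughly what an archiver stores when it dereferences links.
/// Without it, symlinks are not counted as file data at all; a real archive
/// would store only a small link entry for them.
fn payload_bytes(dir: &Path, follow: bool) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // `fs::metadata` follows symlinks; `fs::symlink_metadata` does not.
        let meta = if follow {
            fs::metadata(&path)?
        } else {
            fs::symlink_metadata(&path)?
        };
        if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}
```

For a directory holding one 1000-byte file and one symlink to it, `payload_bytes(dir, false)` gives 1000 and `payload_bytes(dir, true)` gives 2000, which is the duplication the commit message describes. With the `tar` crate itself, calling `follow_symlinks(false)` on the builder before `append_dir_all` should give the link-preserving behavior that `tar czf` has by default.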