Last modified: June 11, 2024

This article is written in: 🇺🇸

Essential File Compression and Archiving Commands

In Unix-based systems such as Linux, tar and gzip are commonly used command-line tools for file packaging and compression. Understanding these tools is crucial for managing files efficiently.

Imagine having multiple files and directories you need to consolidate for ease of transportation or storage. You can think of tar as a tool that packs these files into a single 'tarball', somewhat similar to packing a box with different items.

Folder1       Folder2       File1
  |              |             |
  v              v             v
+---------------------------------+
|           Tar Archive           |
+---------------------------------+

Now, you want to make this packed box lighter and easier to carry around. That's what gzip does to the 'tarball'. It compresses the file to reduce its size, like vacuum sealing the box to make it more compact.

+---------------------------------+
|           Tar Archive           |
+---------------------------------+
                   |
                   v
+--------------------------+
|   Gzipped Tar Archive    |
+--------------------------+

The tar Command

The tar command is a powerful tool for archiving files and directories in a Linux system. It creates an archive, preserving important details such as filenames, ownership information, timestamps, and access permissions. By default, tar only packages files without compressing them. However, it can work in tandem with tools like gzip to compress the resulting archive.

Here are some commonly used tar command options:

Flag Description
-c Create an archive
-v Verbosely list the files processed
-f Use archive file
-z Filter the archive through gzip
-x Extract files from an archive

Note that the -f flag should precede the archive file name.

For instance, to create an archive named "myfiles.tar" containing the "mytree" directory and "file1" and "file2" files, you would use the following command:

tar -cvf myfiles.tar mytree file1 file2

To extract files from the "myfiles.tar" archive, this command can be used:

tar -xvf myfiles.tar

The gzip command

gzip is a software application used for file compression and decompression. When paired with tar, it can greatly reduce the size of an archive file. Here are some commonly used gzip command options:

Flag Description
-d Decompress files
-l Display compression information for each specified file

To compress the "mytar.tar" file, thereby creating a new file called "mytar.tar.gz", you would use this command:

gzip mytar.tar

To decompress the "file_name.gz" file, use the following command:

gzip -d file_name.gz

Challenges

  1. Make an archive of your home folder using tar. To check that everything was included, copy your archives to the /tmp folder and extract the files there. Delete the copies from /tmp when done.
  2. Use tar with and without the -z option to create an archive of any folder. Compare the sizes of your original folder, the archive, and the compressed archive.
  3. Create a script that compresses all .txt files in a given folder using gzip. The script should skip already compressed files (with .gz extension).
  4. Use a combination of tar and gzip commands to create a compressed archive of a folder. Then, extract the archive to a new location and compare the contents of the original folder and the extracted folder to ensure they are identical.
  5. Create a compressed archive using tar and gzip. Then, use the gzip -l command to view the compression ratio and other details about the compressed archive.
  6. Write a script that takes a folder as input and creates an archive for each subfolder within that folder. The script should name the archives based on the subfolder names.
  7. Compress a folder using tar and gzip, then decompress it using the same tools. Time how long each operation takes and compare the results.
  8. Create a compressed archive of a folder containing various file types (e.g., text files, images, audio files). Use the file command to determine the types of files in the compressed archive without extracting it.
  9. Write a script that finds the largest file in a directory and compresses it using gzip. The script should display the file name and its size before and after compression.

Table of Contents

  1. Essential File Compression and Archiving Commands
  2. The tar Command
  3. The gzip command
  4. Challenges