NDG Linux Essentials - Chapter 7 - Archiving and Compression
Archiving
Combining multiple files into one, which eliminates the per-file overhead of individual files and makes them easier to transmit
common tar options
Creating an archive requires two named options. The first, c, specifies the mode (create). The second, f, tells tar to expect a file name as the next argument. Additionally, the t option lists the files in an archive. Finally, you can extract an archive with the x option.
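The three tar modes above (create, list, extract) can be sketched with Python's standard tarfile module, which stands in here for the tar command itself. The directory and archive names (docs, docs.tar, restored) are hypothetical, and everything happens in a temporary directory.

```python
import tarfile
import tempfile
from pathlib import Path

# Work in a throwaway directory so nothing on the real system is touched.
work = Path(tempfile.mkdtemp())
src = work / "docs"            # hypothetical directory to archive
src.mkdir()
(src / "a.txt").write_text("alpha\n")
(src / "b.txt").write_text("beta\n")

archive = work / "docs.tar"    # hypothetical archive name

# Create mode, comparable to: tar -c -f docs.tar docs
with tarfile.open(archive, "w") as tar:
    tar.add(src, arcname="docs")

# List mode, comparable to: tar -t -f docs.tar
with tarfile.open(archive, "r") as tar:
    names = sorted(tar.getnames())
print(names)

# Extract mode, comparable to: tar -x -f docs.tar
out = work / "restored"
out.mkdir()
with tarfile.open(archive, "r") as tar:
    tar.extractall(out)

restored = (out / "docs" / "a.txt").read_text()
print(restored)
```

In each case the mode string passed to tarfile.open plays the role of the c, t, or x option, and the archive path plays the role of the argument that follows f.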
Benefits of archiving and compressing
If you want to make a large number of files available, such as the source code to an application or a collection of documents, it is easier for people to download a compressed archive than it is to download individual files. Log files have a habit of filling disks so it is helpful to split them by date and compress older versions. When you back up directories, it is easier to keep them all in one archive than it is to version each file. Some streaming devices such as tapes perform better if you're sending a stream of data rather than individual files. It can often be faster to compress a file before you send it to a tape drive or over a slower network and decompress it on the other end than it would be to send it uncompressed.
Lossy
Information might be removed from the file as it is compressed so that uncompressing a file will result in a file that is slightly different than the original. For instance, an image with two subtly different shades of green might be made smaller by treating those two shades as the same. Often, the eye can't pick out the difference anyway.
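The two-shades-of-green example can be illustrated with a toy colour quantizer. This is only a sketch of the idea, not how any real image format works; the function name quantize, the step size, and the pixel values are all made up for illustration.

```python
def quantize(pixel, step=8):
    """Round each colour channel down to a multiple of `step` (toy lossy step)."""
    return tuple((channel // step) * step for channel in pixel)

# Two subtly different shades of green (hypothetical RGB values).
green_a = (34, 139, 34)
green_b = (35, 140, 36)

image = [green_a, green_b, green_a, green_b]
reduced = [quantize(p) for p in image]

distinct_before = len(set(image))    # number of distinct colours in the original
distinct_after = len(set(reduced))   # number of distinct colours after quantizing
print(distinct_before, distinct_after)
```

With fewer distinct values the data becomes more repetitive and therefore easier to compress, but the original shades cannot be recovered from the quantized ones.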
gzip
Linux provides several tools to compress files; the most common is gzip. Files compressed with gzip use the .gz file extension.
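As a rough illustration, Python's standard gzip module reads and writes the same .gz format as the gzip command. The file name app.log is hypothetical. One difference worth noting: the gzip command replaces the original file by default, while this sketch keeps it.

```python
import gzip
import shutil
import tempfile
from pathlib import Path

work = Path(tempfile.mkdtemp())
log = work / "app.log"                      # hypothetical file name
log.write_text("GET /index.html 200\n" * 1000)

# Compress to app.log.gz, similar in effect to running gzip on the file.
compressed = work / "app.log.gz"
with open(log, "rb") as f_in, gzip.open(compressed, "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)

# Decompress again, similar in effect to gunzip, and keep the bytes for comparison.
with gzip.open(compressed, "rb") as f:
    round_trip = f.read()

original_size = log.stat().st_size
compressed_size = compressed.stat().st_size
print(original_size, compressed_size)
print(round_trip == log.read_bytes())
```

The final comparison shows that decompressing gives back exactly the bytes that went in.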
compressing
Making the files smaller by removing redundant information
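A small experiment shows why redundancy matters. The sketch below uses Python's zlib module (the DEFLATE algorithm, which gzip also uses) on two inputs of the same length: one highly repetitive, one random. The sample data is invented for the comparison.

```python
import os
import zlib

repetitive = b"ABCD" * 2500        # 10,000 bytes of an obvious repeating pattern
random_data = os.urandom(10_000)   # 10,000 bytes with essentially no redundancy

repetitive_size = len(zlib.compress(repetitive))
random_size = len(zlib.compress(random_data))

print(len(repetitive), repetitive_size)
print(len(random_data), random_size)
```

The repetitive input shrinks to a tiny fraction of its size, while the random input stays about the same size because there is no redundant information to remove.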
Lossless
No information is removed from the file. Compressing a file and decompressing it leaves something identical to the original.
Unarchiving
Taking an archived file, decompressing it, and extracting one or more files from it
bzip2
The bzip2 utilities use a different compression algorithm (called Burrows-Wheeler block sorting) that can compress files to a smaller size than gzip at the expense of more CPU time. You can recognize these files because they have a .bz or .bz2 extension.
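To compare the two formats on your own data, a sketch like the following uses Python's bz2 and gzip modules, which implement the same formats as the bzip2 and gzip commands. The sample text is made up, and which format comes out smaller depends on the input, so the sizes are only printed, not assumed.

```python
import bz2
import gzip

# Hypothetical sample data: 2,000 numbered lines of text.
data = "".join(
    f"line {i}: the quick brown fox jumps over the lazy dog\n" for i in range(2000)
).encode()

bz_bytes = bz2.compress(data)
gz_bytes = gzip.compress(data)

print("original:", len(data))
print("bzip2:   ", len(bz_bytes))
print("gzip:    ", len(gz_bytes))

# Both formats are lossless, so decompressing restores the exact input.
bz_round_trip = bz2.decompress(bz_bytes)
gz_round_trip = gzip.decompress(gz_bytes)
```

Timing the two compress calls on a large file is a simple way to see the CPU cost that bzip2 pays for its compression ratio.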
zip
The de facto archiving utility in the Microsoft world is the ZIP file. It is not as prevalent in Linux but is well supported by the zip and unzip commands. The default mode of zip is to add files to an archive and compress it.
unzip
The default operation of the unzip command is to extract files. To list the files in a ZIP archive without extracting them, use unzip with the -l option (list).
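The zip and unzip behaviors described in these two cards can be sketched with Python's standard zipfile module, which stands in here for the commands. The file names (notes.txt, todo.txt, study.zip) are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path

work = Path(tempfile.mkdtemp())
(work / "notes.txt").write_text("chapter 7\n")
(work / "todo.txt").write_text("review tar\n")

archive = work / "study.zip"   # hypothetical archive name

# Add files and compress them, comparable to: zip study.zip notes.txt todo.txt
with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.write(work / "notes.txt", arcname="notes.txt")
    zf.write(work / "todo.txt", arcname="todo.txt")

# List the contents without extracting, comparable to: unzip -l study.zip
with zipfile.ZipFile(archive) as zf:
    listing = zf.namelist()
print(listing)

# Extract everything, comparable to: unzip study.zip
out = work / "out"
with zipfile.ZipFile(archive) as zf:
    zf.extractall(out)

extracted = (out / "notes.txt").read_text()
print(extracted)
```

Unlike tar, a ZIP archive handles both archiving and compression in a single step.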
tar
The traditional UNIX utility to archive files is called tar, which is a short form of TApe aRchive. Tar takes in several files and creates a single output file that can be split up again into the original files on the other end of the transmission.
Can you archive multiple files into one file?
Yes
Types of compression
lossy and lossless
File archiving
Used when one or more files need to be transmitted or stored as efficiently as possible. There are two aspects to this: