Thursday, December 1, 2022
How to find out if compressed Linux files are the same


Compressed Linux files are helpful because they save disk space, but what should you do when you have a series of compressed files and want to determine whether any are duplicates? The zdiff and zcmp commands can help.

To begin, if a directory contains two files like those below, it’s easy to tell just from the listing that they are not byte-for-byte identical. After all, the file sizes differ slightly. The files look like this:

$ ls -l
total 200
-rw-r--r--. 1 shs shs 102178 Nov 22  2021 2021.gz
-rw-r--r--. 1 shs shs 102181 Nov 22 11:19 2022.gz

If you compare the files with the diff command, it will confirm that the files differ:

$ diff 2021.gz 2022.gz
Binary files 2021.gz and 2022.gz differ

What the diff command doesn’t tell you (because it compares the compressed files byte by byte) is that the material that was compressed to create these two files is actually identical. To determine that, you need to use the zdiff or zcmp command. If the content that was compressed in each file is identical, neither command produces any output.

$ zdiff 2021.gz 2022.gz
$
$ zcmp 2021.gz 2022.gz
$
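Because a match produces no output, a script would normally rely on the exit status instead. The following is a minimal sketch, assuming that zcmp passes the -s option through to cmp (as the gzip-supplied version does) and exits with 0 when the content matches. The function name same_content is hypothetical and only used here for illustration.

```shell
#!/bin/sh
# same_content is a hypothetical helper name, not part of gzip.
# It returns 0 when both files decompress to the same bytes.
same_content() {
    # -s is handed on to cmp: no output, exit status only.
    zcmp -s "$1" "$2"
}

# Demo with two throwaway files in a temporary directory.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a"
cp "$tmp/a" "$tmp/b"
gzip "$tmp/a" "$tmp/b"

if same_content "$tmp/a.gz" "$tmp/b.gz"; then
    echo "content matches"
else
    echo "content differs"
fi
rm -rf "$tmp"
```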

After using gunzip to decompress the files, the resulting files are the same size and can be compared with the diff command to confirm that their content is identical. Again, the absence of output from the diff command indicates that there are no differences.

$ gunzip 2021.gz
$ gunzip 2022.gz
$ ls -l
total 852
-rw-r--r--. 1 shs shs 383654 Nov 22  2021 2021
-rw-r--r--. 1 shs shs 383654 Nov 22 11:19 2022
$ diff 2021 2022
$

Clearly, the file content is the same. Why, then, do the compressed versions appear to be different? That’s because gzip stores the original file name and the file’s timestamp in the compressed file when it compresses a file. The zdiff and zcmp commands compare only the decompressed content, so this information is not included in their comparisons.
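The effect of the stored name and timestamp can be seen in isolation with gzip’s -n (--no-name) option, which leaves both out of the compressed file. The sketch below is an illustration only: the function name header_demo and the directory names are hypothetical, and the sample files are created on the fly rather than being the files from the listing above.

```shell
#!/bin/sh
# header_demo is a hypothetical name for this sketch. It compresses two
# identical files with and without -n and reports whether the resulting
# .gz files match byte for byte.
header_demo() {
    tmp=$(mktemp -d) || return 1
    mkdir "$tmp/default" "$tmp/noname"
    for d in default noname; do
        printf 'same content\n' > "$tmp/$d/2021"
        cp "$tmp/$d/2021" "$tmp/$d/2022"
    done

    gzip "$tmp/default/2021" "$tmp/default/2022"     # name and timestamp stored
    gzip -n "$tmp/noname/2021" "$tmp/noname/2022"    # name and timestamp omitted

    for d in default noname; do
        if cmp -s "$tmp/$d/2021.gz" "$tmp/$d/2022.gz"; then
            echo "$d: compressed files are identical"
        else
            echo "$d: compressed files differ"
        fi
    done
    rm -rf "$tmp"
}

header_demo
```

With the default settings the two compressed files differ because each one records its own file name. With -n, nothing file-specific is stored, so identical input yields identical compressed output.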

Comparing compressed and non-compressed files

While both the zdiff and zcmp commands can determine whether two compressed files are the same, they can also compare the content of a compressed file with a non-compressed file. In other words, if you compare a compressed file with the file that contains the original content but is not compressed, you’ll still get confirmation that the content matches.

$ zdiff 2021.gz 2022
$
$ zcmp 2021.gz 2022
$
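As a rough illustration of what such a mixed comparison amounts to (not necessarily how zcmp itself is implemented), the compressed file can be decompressed to a pipe and compared with the plain file. The function name matches_plain is hypothetical.

```shell
#!/bin/sh
# matches_plain is a hypothetical helper name used only for this sketch.
# $1 is a gzip-compressed file, $2 is an uncompressed file.
matches_plain() {
    # gzip -dc writes the decompressed data to standard output;
    # cmp -s reads it from standard input ("-") and reports the
    # result through its exit status only.
    gzip -dc -- "$1" | cmp -s - "$2"
}
```

Running matches_plain 2021.gz 2022 would then return 0 when the decompressed content of 2021.gz matches the file 2022, and a non-zero status otherwise.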

In fact, although there is no benefit to using zdiff and zcmp with non-compressed files, the commands will still comply with your request. The command below compares the two files when both are decompressed.

$ zdiff 2021 2022
$

zdiff and zcmp differences

The main difference between the zdiff and zcmp commands is what they tell you when files are different. If you use the zdiff command, it will display any differences detected in the decompressed content.

$ zdiff 2022.gz 2023.gz
6409c6409
<        There may be only one active coprocess at a time.
---
>        There may be only one active coprocess at a time!

If you use the zcmp command, it will tell you that the file content is different and where the first difference is located by byte and line number.

$ zcmp 2022.gz 2023.gz
/dev/fd/5 - differ: byte 383573, line 6409

Wrap-Up

The zdiff and zcmp commands allow you to compare the content of files compressed with gzip. While both commands produce no output if the file content matches, they provide different details when the files differ. You can also use these commands to compare files compressed with gzip to files that are not compressed in order to determine whether the original content is the same in both.
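For a larger collection, comparing every pair of files with zcmp gets tedious. One possible alternative, sketched below, is to hash the decompressed content of each file and group files that share a hash. This assumes sha256sum (from GNU coreutils) is available. The function name list_duplicates is hypothetical, and the sketch does not handle file names containing newlines.

```shell
#!/bin/sh
# list_duplicates is a hypothetical helper: it prints groups of .gz files
# in a directory (default: current directory) whose decompressed content
# is identical.
list_duplicates() {
    dir=${1:-.}
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue
        # Hash the decompressed stream, not the bytes of the .gz file,
        # so the stored name and timestamp do not affect the result.
        sum=$(gzip -dc -- "$f" | sha256sum | cut -d' ' -f1)
        printf '%s %s\n' "$sum" "$f"
    done | sort | awk '
        {
            name = substr($0, length($1) + 2)   # everything after the hash
            count[$1]++
            files[$1] = files[$1] " " name
        }
        END {
            for (h in count)
                if (count[h] > 1)
                    print "duplicates:" files[h]
        }
    '
}
```

Calling list_duplicates with a directory name prints one line per group of files with matching content and prints nothing when every file is unique.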

Copyright © 2022 IDG Communications, Inc.
