Unix system administrators often need to know how two files differ, and the diff command serves that purpose. It is a very common command but a poorly understood one. In this article I explain how to read its output; I hope that after reading it you will be able to use diff properly and benefit from it. Another useful command for comparing files is comm. Here you go…
The example files are named first and second. Their contents are listed below:
wiw_labs:$ nl first
1 computer
2 modem
3 monitor
4 phone
5 switch
wiw_labs:$ nl second
1 cable
2 mobile
3 screen
4 modem
5 phone
6 server
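To follow along, the two sample files can be recreated as shown below. This is only a sketch: the scratch directory created with mktemp and the workdir variable are my own additions and are not part of the original session.

```shell
# Work in a scratch directory so that no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# One word per line, in the same order as the listings above.
printf '%s\n' computer modem monitor phone switch > first
printf '%s\n' cable mobile screen modem phone server > second

# Number the lines, as in the listings above.
nl first
nl second
```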
The diff command shows the differences between the two files.
How diff Command Works
Let’s start with the usage of the diff command. Its general form is:
diff first_file second_file
So, you can read the command as:
How first_file is different from second_file.
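Because the command always describes how the first argument differs from the second, swapping the arguments gives a different report. Here is a small sketch, assuming the two sample files are recreated in a scratch directory (workdir, status, forward.diff and reverse.diff are names chosen only for this illustration). It also records the exit status of diff, which is 0 for identical files and 1 for files that differ.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' computer modem monitor phone switch > first
printf '%s\n' cable mobile screen modem phone server > second

# diff exits with 1 when the files differ, so capture the status
# in a way that does not stop a shell running under `set -e`.
status=0
diff first second > forward.diff || status=$?
diff second first > reverse.diff || true

echo "exit status: $status"
echo "first -> second begins with: $(head -n 1 forward.diff)"
echo "second -> first begins with: $(head -n 1 reverse.diff)"
```

With the sample files, the two reports begin with 1c1,3 and 1,3c1 respectively, and the exit status is 1.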
Philosophy of diff Command
The diff command works on the philosophy of changing the first file so that it becomes identical to the second file. It tells you which lines of the first file must be changed (c) or deleted (d) and, where needed, which lines of the second file must be appended (a) to the first file. If that is not clear yet, don’t worry; it will be once I walk through an example.
Here is a simplified picture of how the diff command arrives at the differences between the files:
- It starts with the first line of the first file and of the second file. If they match, it moves on; otherwise it keeps traveling down the first file until it finds a line that also appears in the second file.
- If the first line of the second file is not found anywhere in the first file, it repeats the search in the first file with the second line of the second file, and so on. Once it has a matching line, it suggests what to do with the lines before it (append, change or delete).
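The matching lines act as fixed points around which the differences are reported. A rough way to see them for the sample files is sketched below with grep. This is only an approximation of the idea and not how diff is implemented: grep ignores the order of the lines and any repetitions, whereas diff implementations typically search for the longest run of common lines that appear in the same order. The scratch directory and the workdir variable are assumptions of this sketch.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' computer modem monitor phone switch > first
printf '%s\n' cable mobile screen modem phone server > second

# -F: fixed strings, -x: match whole lines only, -f: read patterns from a file,
# -n: prefix each match with its line number.
echo "lines of second that also occur in first:"
grep -n -F -x -f first second

echo "lines of first that also occur in second:"
grep -n -F -x -f second first
```

For the sample files this reports modem and phone, at lines 4 and 5 of second and at lines 2 and 4 of first. Everything before, between and after those lines has to be changed, deleted or appended.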
Enough theory. Let’s turn to a practical example to make it clear.
I have pasted the files side by side to make them easier to compare. The lines are numbered as well.
wiw_labs:$ paste first second|nl
1 computer cable
2 modem mobile
3 monitor screen
4 phone modem
5 switch phone
6 server
wiw_labs:$ diff first second
1c1,3
< computer
---
> cable
> mobile
> screen
3d4
< monitor
5c6
< switch
---
> server
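In this output, the lines that begin with a number are the change commands; the lines that begin with <, > or - carry the content. A minimal sketch for listing only the commands, assuming the sample files are recreated in a scratch directory (workdir and changes.diff are names made up for this example):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' computer modem monitor phone switch > first
printf '%s\n' cable mobile screen modem phone server > second

# diff exits with status 1 when the files differ, so guard the call
# for shells that run under `set -e`.
diff first second > changes.diff || true

# Keep only the command lines, which always start with a line number.
grep '^[0-9]' changes.diff
```

For the sample files this prints 1c1,3, 3d4 and 5c6, one per line.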
Now take a look at the numbered output of the paste command above. The things to note are:
- The second line (modem) of the first file matches the fourth line (modem) of the second file. So, if we replace the first line of the first file with the first three lines of the second file, the first part of both files becomes the same. The output will then look like this:
wiw_labs:$ paste first second|nl
1 cable cable
2 mobile mobile
3 screen screen
4 modem modem
5 monitor phone
6 phone server
7 switch
- The fourth line (phone) of the original first file matches the fifth line (phone) of the second file. That means that if we delete the third line (monitor) of the original first file, which is the fifth line at present, the second part of the files will match as well:
wiw_labs:$ paste first second|nl
1 cable cable
2 mobile mobile
3 screen screen
4 modem modem
5 phone phone
6 switch server
- The fifth line (switch) of the original first file can be replaced with the sixth line (server) of the second file. After that, both files match fully:
wiw_labs:$ paste first second|nl
1 cable cable
2 mobile mobile
3 screen screen
4 modem modem
5 phone phone
6 server server
Now it’s easier to understand the output of the diff command.
1c1,3: Change the first line of the first file to lines 1 to 3 of the second file.
3d4: Delete line 3 (monitor) from the first file. The 4 says that, after the deletion, the files are back in step after line 4 of the second file.
5c6: Change the 5th line (switch) of the first file to the 6th line (server) of the second file.
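To check this reading, the three commands can be replayed by hand. The sketch below uses sed rather than anything diff provides, because sed addresses also refer to the line numbers of the original input, which is exactly how the numbers on the left of each diff command are meant. The scratch directory, the workdir variable and the file name first.patched are assumptions of this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' computer modem monitor phone switch > first
printf '%s\n' cable mobile screen modem phone server > second

# 1c1,3 -> replace line 1 with cable, mobile, screen
# 3d4   -> delete line 3 (monitor)
# 5c6   -> replace line 5 (switch) with server
sed '1c\
cable\
mobile\
screen
3d
5c\
server' first > first.patched

# diff prints nothing and exits with 0 when the files are identical.
diff first.patched second && echo "first.patched is identical to second"
```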
Now, take the reverse case:
wiw_labs:$ paste second first | nl
1 cable computer
2 mobile modem
3 screen monitor
4 modem phone
5 phone switch
6 server
wiw_labs:$ diff second first
1,3c1
< cable
< mobile
< screen
---
> computer
4a3
> monitor
6c5
< server
---
> switch
- Here the first file is second and the second file is first. The 4th line (modem) of the first file matches the 2nd line of the second file. So, if we replace lines 1 through 3 of the first file with the 1st line of the second file, we get the following output:
wiw_labs:$ paste second first | nl
1 computer computer
2 modem modem
3 phone monitor
4 server phone
5 switch
- The 3rd line (monitor) of the second file does not exist in the first file, so append it after the 4th line (modem) of the first file. Do remember that the line numbers in the output of the diff command are always the original line numbers. The output will be something like this:
wiw_labs:$ paste second first | nl
1 computer computer
2 modem modem
3 monitor monitor
4 phone phone
5 server switch
- Finally, the 6th and last line (server) of the first file needs to be changed to the 5th and last line (switch) of the second file. After doing so, the first file is the same as the second file:
wiw_labs:$ paste second first | nl
1 computer computer
2 modem modem
3 monitor monitor
4 phone phone
5 switch switch
Now it’s easier to understand the output of the diff command.
1,3c1: Change lines 1 through 3 of the first file to line 1 of the second file.
4a3: Append line 3 (monitor) of the second file after the 4th line (modem) of the first file.
6c5: Change the 6th line (server) of the first file to the 5th line (switch) of the second file.
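The reverse case can be replayed in the same hand-written way. Again this is a sketch with sed, not something diff produces; it works because sed addresses, like the numbers on the left of each diff command, refer to the original line numbers of the file being changed. The scratch directory, the workdir variable and the file name second.patched are assumptions of this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' computer modem monitor phone switch > first
printf '%s\n' cable mobile screen modem phone server > second

# 1,3c1 -> replace lines 1 to 3 with computer
# 4a3   -> append monitor after line 4 (modem)
# 6c5   -> replace line 6 (server) with switch
sed '1,3c\
computer
4a\
monitor
6c\
switch' second > second.patched

# No output from diff and exit status 0 mean the files are identical.
diff second.patched first && echo "second.patched is identical to first"
```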