How to compare two files











up vote
66
down vote

favorite
15












So basically what I want to do is compare two file by line by column 2. How could I accomplish this?



File_1.txt:



User1 US
User2 US
User3 US


File_2.txt:



User1 US
User2 US
User3 NG


Output_File:



User3 has changed









share|improve this question




















  • 11




    Use diff "File_1.txt" "File_2.txt"
    – Pandya
    Aug 25 '14 at 15:00












  • Also visit : askubuntu.com/q/12473
    – Pandya
    Aug 26 '14 at 1:56















up vote
66
down vote

favorite
15












So basically what I want to do is compare two file by line by column 2. How could I accomplish this?



File_1.txt:



User1 US
User2 US
User3 US


File_2.txt:



User1 US
User2 US
User3 NG


Output_File:



User3 has changed









share|improve this question




















  • 11




    Use diff "File_1.txt" "File_2.txt"
    – Pandya
    Aug 25 '14 at 15:00












  • Also visit : askubuntu.com/q/12473
    – Pandya
    Aug 26 '14 at 1:56













up vote
66
down vote

favorite
15









up vote
66
down vote

favorite
15






15





So basically what I want to do is compare two file by line by column 2. How could I accomplish this?



File_1.txt:



User1 US
User2 US
User3 US


File_2.txt:



User1 US
User2 US
User3 NG


Output_File:



User3 has changed









share|improve this question















So basically what I want to do is compare two file by line by column 2. How could I accomplish this?



File_1.txt:



User1 US
User2 US
User3 US


File_2.txt:



User1 US
User2 US
User3 NG


Output_File:



User3 has changed






command-line text-processing






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 30 at 5:11









muru

135k19289490




135k19289490










asked Aug 25 '14 at 14:58









Roboman1723

86851528




86851528








  • 11




    Use diff "File_1.txt" "File_2.txt"
    – Pandya
    Aug 25 '14 at 15:00












  • Also visit : askubuntu.com/q/12473
    – Pandya
    Aug 26 '14 at 1:56














  • 11




    Use diff "File_1.txt" "File_2.txt"
    – Pandya
    Aug 25 '14 at 15:00












  • Also visit : askubuntu.com/q/12473
    – Pandya
    Aug 26 '14 at 1:56








11




11




Use diff "File_1.txt" "File_2.txt"
– Pandya
Aug 25 '14 at 15:00






Use diff "File_1.txt" "File_2.txt"
– Pandya
Aug 25 '14 at 15:00














Also visit : askubuntu.com/q/12473
– Pandya
Aug 26 '14 at 1:56




Also visit : askubuntu.com/q/12473
– Pandya
Aug 26 '14 at 1:56










11 Answers
11






active

oldest

votes

















up vote
77
down vote



accepted










Look into the diff command. It's a good tool, and you can read all about it by typing man diff into your terminal.



The command you'll want to do is diff File_1.txt File_2.txt which will output the difference between the two and should look something like this:



enter image description here



A quick note on reading the output from the third command: The 'arrows' (< and >) refer to what the value of the line is in the left file (<) vs the right file (>), with the left file being the one you entered first on the command line, in this case File_1.txt



Additionally you might notice the 4th command is diff ... | tee Output_File this pipes the results from diff into a tee, which then puts that output into a file, so that you can save it for later if you don't want to view it all on the console right that second.






share|improve this answer























  • Can this do other files (such as images)? Or is it limited to just documents?
    – Gregory Opera
    Aug 27 '14 at 13:10






  • 2




    As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
    – Mitch
    Aug 27 '14 at 13:26










  • Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
    – Lazar Ljubenović
    Nov 27 at 17:02


















up vote
33
down vote













Or you can use Meld Diff




Meld helps you compare files, directories, and version controlled
projects. It provides two- and three-way comparison of both files and
directories, and has support for many popular version control systems.




Install by running:



sudo apt-get install meld


Your example:



enter image description here



Compare directory:



enter image description here



Example with full of text:



enter image description here






share|improve this answer






























    up vote
    15
    down vote













    You can use vimdiff.



    Example:



    vimdiff  file1  file2





    share|improve this answer























    • this one has colors
      – Jake Toronto
      Jun 19 at 14:58


















    up vote
    8
    down vote













    Meld is a really great tool. But you can also use diffuse to visually compare two files:



    diffuse file1.txt file2.txt


    enter image description here






    share|improve this answer




























      up vote
      8
      down vote













      FWIW, I rather like what I get with side-by-side output from diff



      diff -y -W 120 File_1.txt File_2.txt


      would give something like:



      User1 US                            User1 US
      User2 US User2 US
      User3 US | User3 NG





      share|improve this answer




























        up vote
        8
        down vote













        You can use the command cmp:



        cmp -b "File_1.txt" "File_2.txt"


        output would be



        a b differ: byte 25, line 3 is 125 U 116 N





        share|improve this answer




























          up vote
          7
          down vote













          Litteraly sticking to the question (file1, file2, outputfile with "has changed" message) the script below works.



          Copy the script into an empty file, save it as compare.py, make it executable, run it by the command:



          /path/to/compare.py <file1> <file2> <outputfile>


          The script:



          #!/usr/bin/env python

          import sys
          file1 = sys.argv[1]; file2 = sys.argv[2]; outfile = sys.argv[3]

          def readfile(file):
          with open(file) as compare:
          return [item.replace("n", "").split(" ") for item in compare.readlines()]

          data1 = readfile(file1); data2 = readfile(file2)
          mismatch = [item[0] for item in data1 if not item in data2]

          with open(outfile, "wt") as out:
          for line in mismatch:
          out.write(line+" has changed"+"n")


          With a few extra lines, you can make it either print to an outputfile, or to the terminal, depending on if the outputfile is defined:



          To print to a file:



          /path/to/compare.py <file1> <file2> <outputfile>


          To print to the terminal window:



          /path/to/compare.py <file1> <file2> 


          The script:



          #!/usr/bin/env python

          import sys

          file1 = sys.argv[1]; file2 = sys.argv[2]
          try:
          outfile = sys.argv[3]
          except IndexError:
          outfile = None

          def readfile(file):
          with open(file) as compare:
          return [item.replace("n", "").split(" ") for item in compare.readlines()]

          data1 = readfile(file1); data2 = readfile(file2)
          mismatch = [item[0] for item in data1 if not item in data2]

          if outfile != None:
          with open(outfile, "wt") as out:
          for line in mismatch:
          out.write(line+" has changed"+"n")
          else:
          for line in mismatch:
          print line+" has changed"





          share|improve this answer






























            up vote
            4
            down vote













            An easy way is to use colordiff, which behaves like diff but colorizes its output. This is very helpful for reading diffs. Using your example,



            $ colordiff -u File_1.txt File_2.txt
            --- File_1.txt 2016-12-24 17:59:17.409490554 -0500
            +++ File_2.txt 2016-12-24 18:00:06.666719659 -0500
            @@ -1,3 +1,3 @@
            User1 US
            User2 US
            -User3 US
            +User3 NG


            where the u option gives a unified diff. This is how the colorized diff looks like:



            enter image description here



            Install colordiff by running sudo apt-get install colordiff.






            share|improve this answer



















            • 1




              If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
              – thomasrutter
              Nov 30 at 5:37


















            up vote
            2
            down vote













            Additional answer



            If there's no need to know what parts of the files differ, you can use checksum of the file. There's many ways to do that, using md5sum or sha256sum. Basically , each of them outputs a string to which a file contents hash. If the two files are the same, their hash will be the same as well. This is often used when you download software, such as Ubuntu installation iso images. They're often used for verifying integrity of a downloaded content.



            Consider script below, where you can give two files as arguments, and the file will tell you if they are the same or not.



            #!/bin/bash

            # Check if both files exist
            if ! [ -e "$1" ];
            then
            printf "%s doesn't existn" "$1"
            exit 2
            elif ! [ -e "$2" ]
            then
            printf "%s doesn't existn" "$2"
            exit 2
            fi

            # Get checksums of eithe file
            file1_sha=$( sha256sum "$1" | awk '{print $1}')
            file2_sha=$( sha256sum "$2" | awk '{print $1}')

            # Compare the checksums
            if [ "x$file1_sha" = "x$file2_sha" ]
            then
            printf "Files %s and %s are the samen" "$1" "$2"
            exit 0
            else
            printf "Files %s and %s are differentn" "$1" "$2"
            exit 1
            fi


            Sample run:



            $ ./compare_files.sh /etc/passwd ./passwd_copy.txt                                                                
            Files /etc/passwd and ./passwd_copy.txt are the same
            $ echo $?
            0
            $ ./compare_files.sh /etc/passwd /etc/default/grub
            Files /etc/passwd and /etc/default/grub are different
            $ echo $?
            1


            Older answer



            In addition there is comm command, which compares two sorted files, and gives output in 3 colums : column 1 for items unique to file #1, column 2 for items unique to file #2, and column 3 for items present in both files.



            To suppress either column you can use switches -1, -2 , and -3. Using -3 will show the lines that differ.



            Bellow you can see the screenshot of the command in action.



            enter image description here



            There is just one requirement - the files must be sorted for them to be compared properly. sort command can be used for that purpose. Bellow is another screenshot , where files are sorted and then compared. Lines starting on the left bellong to File_1 only , lines starting on column 2 belong to File_2 only



            enter image description here






            share|improve this answer























            • @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
              – Sergiy Kolodyazhnyy
              May 23 '15 at 9:17


















            up vote
            1
            down vote













            Install git and use



            $ git diff filename1 filename2


            And you will get output in nice colored format



            Git installation



            $ apt-get update
            $ apt-get install git-core





            share|improve this answer




























              up vote
              1
              down vote













              colcmp.sh



              Compares name/value pairs in 2 files in the format name valuen. Writes the name to Output_file if changed. Requires bash v4+ for associative arrays.



              Usage



              $ ./colcmp.sh File_1.txt File_2.txt
              User3 changed from 'US' to 'NG'
              no change: User1,User2


              Output_File



              $ cat Output_File
              User3 has changed


              Source (colcmp.sh)



              cmp -s "$1" "$2"
              case "$?" in
              0)
              echo "" > Output_File
              echo "files are identical"
              ;;
              1)
              echo "" > Output_File
              cp "$1" ~/.colcmp.array1.tmp.sh
              sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh
              sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array1.tmp.sh
              sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.array1.tmp.sh
              chmod 755 ~/.colcmp.array1.tmp.sh
              declare -A A1
              source ~/.colcmp.array1.tmp.sh

              cp "$2" ~/.colcmp.array2.tmp.sh
              sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
              sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
              sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
              chmod 755 ~/.colcmp.array2.tmp.sh
              declare -A A2
              source ~/.colcmp.array2.tmp.sh

              USERSWHODIDNOTCHANGE=
              for i in "${!A1[@]}"; do
              if [ "${A2[$i]+x}" = "" ]; then
              echo "$i was removed"
              echo "$i has changed" > Output_File
              fi
              done
              for i in "${!A2[@]}"; do
              if [ "${A1[$i]+x}" = "" ]; then
              echo "$i was added as '${A2[$i]}'"
              echo "$i has changed" > Output_File
              elif [ "${A1[$i]}" != "${A2[$i]}" ]; then
              echo "$i changed from '${A1[$i]}' to '${A2[$i]}'"
              echo "$i has changed" > Output_File
              else
              if [ x$USERSWHODIDNOTCHANGE != x ]; then
              USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
              fi
              USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"
              fi
              done
              if [ x$USERSWHODIDNOTCHANGE != x ]; then
              echo "no change: $USERSWHODIDNOTCHANGE"
              fi
              ;;
              *)
              echo "error: file not found, access denied, etc..."
              echo "usage: ./colcmp.sh File_1.txt File_2.txt"
              ;;
              esac




              Explanation



              Breakdown of the code and what it means, to the best of my understanding. I welcome edits and suggestions.



              Basic File Compare



              cmp -s "$1" "$2"
              case "$?" in
              0)
              # match
              ;;
              1)
              # compare
              ;;
              *)
              # error
              ;;
              esac


              cmp will set the value of $? as follows:




              • 0 = files match

              • 1 = files differ

              • 2 = error


              I chose to use a case..esac statement to evalute $? because the value of $? changes after every command, including test ([).



              Alternatively I could have used a variable to hold the value of $?:



              cmp -s "$1" "$2"
              CMPRESULT=$?
              if [ $CMPRESULT -eq 0 ]; then
              # match
              elif [ $CMPRESULT -eq 1 ]; then
              # compare
              else
              # error
              fi


              Above does the same thing as the case statement. IDK which I like better.



              Clear the Output



                      echo "" > Output_File


              Above clears the output file so if no users changed, the output file will be empty.



              I do this inside the case statements so that the Output_file remains unchanged on error.



              Copy User File to Shell Script



                      cp "$1" ~/.colcmp.arrays.tmp.sh


              Above copies File_1.txt to the current user's home dir.



              For example, if the current user is john, the above would be the same as cp "File_1.txt" /home/john/.colcmp.arrays.tmp.sh



              Escape Special Characters



              Basically, I'm paranoid. I know that these characters could have special meaning or execute an external program when run in a script as part of variable assignment:




              • ` - back-tick - executes a program and the output as if the output were part of your script

              • $ - dollar sign - usually prefixes a variable

              • ${} - allows for more complex variable substitution

              • $() - idk what this does but i think it can execute code


              What I don't know is how much I don't know about bash. I don't know what other characters might have special meaning, but I want to escape them all with a backslash:



                      sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh


              sed can do a lot more than regular expression pattern matching. The script pattern "s/(find)/(replace)/" specifically performs the pattern match.



              "s/(find)/(replace)/(modifiers)"




              • (find) = ([^A-Za-z0-9 ])


                • () = capture group 1

                • = match a character from a specific list of characters

                • [^] = match any character NOT in a specific list of characters

                • [^A-Za-z0-9 ] = match any character that is NOT a letter, digit or space




              in english: capture any punctuation or special character as caputure group 1 (\1)




              • (replace) = \\\1


                • \\ = literal character (\) i.e. a backslash

                • \1 = capture group 1




              in english: prefix all special characters with a backslash




              • (modifiers) = g


                • g = globally replace




              in english: if more than one match is found on the same line, replace them all



              Comment Out the Entire Script



                      sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.arrays.tmp.sh


              Above uses a regular expression to prefix every line of ~/.colcmp.arrays.tmp.sh with a bash comment character (#). I do this because later I intend to execute ~/.colcmp.arrays.tmp.sh using the source command and because I don't know for sure the whole format of File_1.txt.



              I don't want to accidentally execute arbitrary code. I don't think anyone does.



              "s/(find)/(replace)/"




              • (find) = ^(.*)$


                • ^ = beginning of a line

                • () = capture group 1

                • .* = anything

                • $ = end of line




              in english: capture each line as caputure group 1 (\1)




              • (replace) = #\1


                • # = literal character (#) i.e. a pound symbol or hash

                • \1 = capture group 1




              in english: replace each line with a pound symbol followed by the line that was replaced



              Convert User Value to A1[User]="value"



                      sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.arrays.tmp.sh


              Above is the core of this script.




              • convert this: #User1 US


                • to this: A1[User1]="US"

                • or this: A2[User1]="US" (for the 2nd file)




              "s/(find)/(replace)/"




              • (find) = ^#\s*(\S+)\s+(\S.?)\s$


                • ^ = beginning of a line

                • # = literal character (#) i.e. a pound symbol or hash

                • \s* - zero or more whitespace characters

                • () = capture group 1

                • \S+ - one or more NON-whitespace characters

                • \s+ - one or more whitespace characters

                • () = capture group 2

                • \S - exactly one NON-whitespace character

                • .*? = anything, non-greedy

                • \s* - zero or more whitespace characters

                • $ = end of line




              in english:




              • require but ignore leading comment characters (#)

              • ignore leading whitespace

              • capture the first word as caputure group 1 (\1)

              • require a space (or tab, or whitespace)


                • that will be replaced with an equals sign because

                • it's not part of any capture group, and because

                • the (replace) pattern puts an equals sign between capture group 1 and capture group 2



              • capture the rest of the line as capture group 2



              • (replace) = A1\[\1\]="\2"




                • A1\[ - literal characters A1[ to start array assignment in an array called A1

                • \1 = capture group 1 - which does not include the leading hash (#) and does not include leading whitespace - in this case capture group 1 is being used to set the name of the name/value pair in the bash associative array.

                • \]=" = literal characters ]="



                  • ] = close array assignment e.g. A1[User1]="US"


                  • = = assignment operator e.g. variable=value


                  • " = quote value to capture spaces ... although now that i think about it, it would have been easier to let the code above that backslashes everything to also backslash space characters.



                • \1 = capture group 2 - in this case, the value of the name/value pair

                • " = closing quote value to capture spaces




              in english: replace each line in the format #name value with an array assignment operator in the format A1[name]="value"



              Make Executable



                      chmod 755 ~/.colcmp.arrays.tmp.sh


              Above uses chmod to make the array script file executable.



              I'm not sure if this is necessary.



              Declare Associative Array (bash v4+)



                      declare -A A1


              The capital -A indicates that the variables declared will be associative arrays.



              This is why the script requires bash v4 or greater.



              Execute our Array Variable Assignment Script



                      source ~/.colcmp.arrays.tmp.sh


              We have already:




              • converted our file from lines of User value to lines of A1[User]="value",

              • made it executable (maybe), and

              • declared A1 as an associative array...


              Above we source the script to run it in the current shell. We do this so we can keep the variable values that get set by the script. If you execute the script directly, it spawns a new shell, and the variable values are lost when the new shell exits, or at least that's my understanding.



              This Should Be a Function



                      cp "$2" ~/.colcmp.array2.tmp.sh
              sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
              sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
              sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
              chmod 755 ~/.colcmp.array2.tmp.sh
              declare -A A2
              source ~/.colcmp.array2.tmp.sh


              We do the same thing for $1 and A1 that we do for $2 and A2. It really should be a function. I think at this point this script is confusing enough and it works, so I'm not gonna fix it.



              Detect Users Removed



                      for i in "${!A1[@]}"; do
              # check for users removed
              done


              Above loops through associative array keys



                          if [ "${A2[$i]+x}" = "" ]; then


              Above uses variable substitution to detect the difference between a value that is unset vs a variable that has been explicitly set to a zero length string.



              Apparently, there are a lot of ways to see if a variable has been set. I chose the one with the most votes.



                              echo "$i has changed" > Output_File


              Above adds the user $i to the Output_File



              Detect Users Added or Changed



                      USERSWHODIDNOTCHANGE=


              Above clears a variable so we can keep track of users that did not change.



                      for i in "${!A2[@]}"; do
              # detect users added, changed and not changed
              done


              Above loops through associative array keys



                          if ! [ "${A1[$i]+x}" != "" ]; then


              Above uses variable substitution to see if a variable has been set.



                              echo "$i was added as '${A2[$i]}'"


              Because $i is the array key (user name) $A2[$i] should return the value associated with the current user from File_2.txt.



              For example, if $i is User1, the above reads as ${A2[User1]}



                              echo "$i has changed" > Output_File


              Above adds the user $i to the Output_File



                          elif [ "${A1[$i]}" != "${A2[$i]}" ]; then


              Because $i is the array key (user name) $A1[$i] should return the value associated with the current user from File_1.txt, and $A2[$i] should return the value from File_2.txt.



              Above compares the associated values for user $i from both files..



                              echo "$i has changed" > Output_File


              Above adds the user $i to the Output_File



                              if [ x$USERSWHODIDNOTCHANGE != x ]; then
              USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
              fi
              USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"


              Above creates a comma separated list of users who did not change. Note there are no spaces in the list, or else the next check would need to be quoted.



                      if [ x$USERSWHODIDNOTCHANGE != x ]; then
              echo "no change: $USERSWHODIDNOTCHANGE"
              fi


              Above reports the value of $USERSWHODIDNOTCHANGE but only if there is a value in $USERSWHODIDNOTCHANGE. The way this is written, $USERSWHODIDNOTCHANGE cannot contain any spaces. If it does need spaces, above could be rewritten as follows:



                      if [ "$USERSWHODIDNOTCHANGE" != "" ]; then
              echo "no change: $USERSWHODIDNOTCHANGE"
              fi





              share|improve this answer





















                Your Answer








                StackExchange.ready(function() {
                var channelOptions = {
                tags: "".split(" "),
                id: "89"
                };
                initTagRenderer("".split(" "), "".split(" "), channelOptions);

                StackExchange.using("externalEditor", function() {
                // Have to fire editor after snippets, if snippets enabled
                if (StackExchange.settings.snippets.snippetsEnabled) {
                StackExchange.using("snippets", function() {
                createEditor();
                });
                }
                else {
                createEditor();
                }
                });

                function createEditor() {
                StackExchange.prepareEditor({
                heartbeatType: 'answer',
                convertImagesToLinks: true,
                noModals: true,
                showLowRepImageUploadWarning: true,
                reputationToPostImages: 10,
                bindNavPrevention: true,
                postfix: "",
                imageUploader: {
                brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                allowUrls: true
                },
                onDemand: true,
                discardSelector: ".discard-answer"
                ,immediatelyShowMarkdownHelp:true
                });


                }
                });














                draft saved

                draft discarded


















                StackExchange.ready(
                function () {
                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f515900%2fhow-to-compare-two-files%23new-answer', 'question_page');
                }
                );

                Post as a guest















                Required, but never shown

























                11 Answers
                11






                active

                oldest

                votes








                11 Answers
                11






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes








                up vote
                77
                down vote



                accepted










                Look into the diff command. It's a good tool, and you can read all about it by typing man diff into your terminal.



                The command you'll want to do is diff File_1.txt File_2.txt which will output the difference between the two and should look something like this:



                enter image description here



                A quick note on reading the output from the third command: The 'arrows' (< and >) refer to what the value of the line is in the left file (<) vs the right file (>), with the left file being the one you entered first on the command line, in this case File_1.txt



                Additionally you might notice the 4th command is diff ... | tee Output_File this pipes the results from diff into a tee, which then puts that output into a file, so that you can save it for later if you don't want to view it all on the console right that second.






                share|improve this answer























                • Can this do other files (such as images)? Or is it limited to just documents?
                  – Gregory Opera
                  Aug 27 '14 at 13:10






                • 2




                  As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
                  – Mitch
                  Aug 27 '14 at 13:26










                • Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
                  – Lazar Ljubenović
                  Nov 27 at 17:02















                up vote
                77
                down vote



                accepted










                Look into the diff command. It's a good tool, and you can read all about it by typing man diff into your terminal.



                The command you'll want to do is diff File_1.txt File_2.txt which will output the difference between the two and should look something like this:



                enter image description here



                A quick note on reading the output from the third command: The 'arrows' (< and >) refer to what the value of the line is in the left file (<) vs the right file (>), with the left file being the one you entered first on the command line, in this case File_1.txt



                Additionally you might notice the 4th command is diff ... | tee Output_File this pipes the results from diff into a tee, which then puts that output into a file, so that you can save it for later if you don't want to view it all on the console right that second.






                share|improve this answer























                • Can this do other files (such as images)? Or is it limited to just documents?
                  – Gregory Opera
                  Aug 27 '14 at 13:10






                • 2




                  As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
                  – Mitch
                  Aug 27 '14 at 13:26










                • Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
                  – Lazar Ljubenović
                  Nov 27 at 17:02













                up vote
                77
                down vote



                accepted







                up vote
                77
                down vote



                accepted






                Look into the diff command. It's a good tool, and you can read all about it by typing man diff into your terminal.



                The command you'll want to do is diff File_1.txt File_2.txt which will output the difference between the two and should look something like this:



                enter image description here



                A quick note on reading the output from the third command: The 'arrows' (< and >) refer to what the value of the line is in the left file (<) vs the right file (>), with the left file being the one you entered first on the command line, in this case File_1.txt



                Additionally you might notice the 4th command is diff ... | tee Output_File this pipes the results from diff into a tee, which then puts that output into a file, so that you can save it for later if you don't want to view it all on the console right that second.






                share|improve this answer














                Look into the diff command. It's a good tool, and you can read all about it by typing man diff into your terminal.



                The command you'll want to do is diff File_1.txt File_2.txt which will output the difference between the two and should look something like this:



                enter image description here



                A quick note on reading the output from the third command: The 'arrows' (< and >) refer to what the value of the line is in the left file (<) vs the right file (>), with the left file being the one you entered first on the command line, in this case File_1.txt



                Additionally you might notice the 4th command is diff ... | tee Output_File this pipes the results from diff into a tee, which then puts that output into a file, so that you can save it for later if you don't want to view it all on the console right that second.







                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Oct 24 '17 at 17:21

























                answered Aug 25 '14 at 15:03









                Mitch

                3,0961531




                3,0961531












                • Can this do other files (such as images)? Or is it limited to just documents?
                  – Gregory Opera
                  Aug 27 '14 at 13:10






                • 2




                  As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
                  – Mitch
                  Aug 27 '14 at 13:26










                • Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
                  – Lazar Ljubenović
                  Nov 27 at 17:02


















                • Can this do other files (such as images)? Or is it limited to just documents?
                  – Gregory Opera
                  Aug 27 '14 at 13:10






                • 2




                  As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
                  – Mitch
                  Aug 27 '14 at 13:26










                • Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
                  – Lazar Ljubenović
                  Nov 27 at 17:02
















                Can this do other files (such as images)? Or is it limited to just documents?
                – Gregory Opera
                Aug 27 '14 at 13:10




                Can this do other files (such as images)? Or is it limited to just documents?
                – Gregory Opera
                Aug 27 '14 at 13:10




                2




                2




                As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
                – Mitch
                Aug 27 '14 at 13:26




                As far as I know, it's limited to text files. Code will work, as it's essentially text, but any binary files (which pictures are) will just get junk out. You CAN compare to see if they're identical by doing: diff file1 file2 -s. Here's an example: imgur.com/ShrQx9x
                – Mitch
                Aug 27 '14 at 13:26












                Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
                – Lazar Ljubenović
                Nov 27 at 17:02




                Is there a way to colorize the output? I'd like keeping it CLI-only, but with some more... human touch.
                – Lazar Ljubenović
                Nov 27 at 17:02












                up vote
                33
                down vote













                Or you can use Meld Diff




                Meld helps you compare files, directories, and version controlled
                projects. It provides two- and three-way comparison of both files and
                directories, and has support for many popular version control systems.




                Install by running:



                sudo apt-get install meld


                Your example:



                enter image description here



                Compare directory:



                enter image description here



                Example with full of text:



                enter image description here






                share|improve this answer



























                  up vote
                  33
                  down vote













                  Or you can use Meld Diff




                  Meld helps you compare files, directories, and version controlled
                  projects. It provides two- and three-way comparison of both files and
                  directories, and has support for many popular version control systems.




                  Install by running:



                  sudo apt-get install meld


                  Your example:



                  enter image description here



                  Compare directory:



                  enter image description here



                  Example with full of text:



                  enter image description here






                  share|improve this answer

























                    up vote
                    33
                    down vote










                    up vote
                    33
                    down vote









                    Or you can use Meld Diff




                    Meld helps you compare files, directories, and version controlled
                    projects. It provides two- and three-way comparison of both files and
                    directories, and has support for many popular version control systems.




                    Install by running:



                    sudo apt-get install meld


                    Your example:



                    enter image description here



                    Compare directory:



                    enter image description here



                    Example with full of text:



                    enter image description here






                    share|improve this answer














                    Or you can use Meld Diff




                    Meld helps you compare files, directories, and version controlled
                    projects. It provides two- and three-way comparison of both files and
                    directories, and has support for many popular version control systems.




                    Install by running:



                    sudo apt-get install meld


                    Your example:



                    enter image description here



                    Compare directory:



                    enter image description here



                    Example with full of text:



                    enter image description here







                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Aug 26 '14 at 12:26

























                    answered Aug 25 '14 at 15:57









                    Achu

                    15.7k136298




                    15.7k136298






















                        up vote
                        15
                        down vote













                        You can use vimdiff.



                        Example:



                        vimdiff  file1  file2





                        share|improve this answer























                        • this one has colors
                          – Jake Toronto
                          Jun 19 at 14:58















                        up vote
                        15
                        down vote













                        You can use vimdiff.



                        Example:



                        vimdiff  file1  file2





                        share|improve this answer























                        • this one has colors
                          – Jake Toronto
                          Jun 19 at 14:58













                        up vote
                        15
                        down vote










                        up vote
                        15
                        down vote









                        You can use vimdiff.



                        Example:



                        vimdiff  file1  file2





                        share|improve this answer














                        You can use vimdiff.



                        Example:



                        vimdiff  file1  file2






                        share|improve this answer














                        share|improve this answer



                        share|improve this answer








                        edited Feb 17 '15 at 14:17









                        fedorqui

                        6,04611032




                        6,04611032










                        answered Aug 28 '14 at 5:14









                        Mr. S

                        1512




                        1512












                        • this one has colors
                          – Jake Toronto
                          Jun 19 at 14:58


















                        • this one has colors
                          – Jake Toronto
                          Jun 19 at 14:58
















                        this one has colors
                        – Jake Toronto
                        Jun 19 at 14:58




                        this one has colors
                        – Jake Toronto
                        Jun 19 at 14:58










                        up vote
                        8
                        down vote













                        Meld is a really great tool. But you can also use diffuse to visually compare two files:



                        diffuse file1.txt file2.txt


                        enter image description here






                        share|improve this answer

























                          up vote
                          8
                          down vote













                          Meld is a really great tool. But you can also use diffuse to visually compare two files:



                          diffuse file1.txt file2.txt


                          enter image description here






                          share|improve this answer























                            up vote
                            8
                            down vote










                            up vote
                            8
                            down vote









                            Meld is a really great tool. But you can also use diffuse to visually compare two files:



                            diffuse file1.txt file2.txt


                            enter image description here






                            share|improve this answer












                            Meld is a really great tool. But you can also use diffuse to visually compare two files:



                            diffuse file1.txt file2.txt


                            enter image description here







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Aug 26 '14 at 6:54









                            Meysam

                            2841518




                            2841518






















                                up vote
                                8
                                down vote













                                FWIW, I rather like what I get with side-by-side output from diff



                                diff -y -W 120 File_1.txt File_2.txt


                                would give something like:



                                User1 US                            User1 US
                                User2 US User2 US
                                User3 US | User3 NG





                                share|improve this answer

























                                  up vote
                                  8
                                  down vote













                                  FWIW, I rather like what I get with side-by-side output from diff



                                  diff -y -W 120 File_1.txt File_2.txt


                                  would give something like:



                                  User1 US                            User1 US
                                  User2 US User2 US
                                  User3 US | User3 NG





                                  share|improve this answer























                                    up vote
                                    8
                                    down vote










                                    up vote
                                    8
                                    down vote









                                    FWIW, I rather like what I get with side-by-side output from diff



                                    diff -y -W 120 File_1.txt File_2.txt


                                    would give something like:



                                    User1 US                            User1 US
                                    User2 US User2 US
                                    User3 US | User3 NG





                                    share|improve this answer












                                    FWIW, I rather like what I get with side-by-side output from diff



                                    diff -y -W 120 File_1.txt File_2.txt


                                    would give something like:



                                    User1 US                            User1 US
                                    User2 US User2 US
                                    User3 US | User3 NG






                                    share|improve this answer












                                    share|improve this answer



                                    share|improve this answer










                                    answered May 22 '15 at 19:37









                                    Mike Reardon

                                    8111




                                    8111






















                                        up vote
                                        8
                                        down vote













                                        You can use the command cmp:



                                        cmp -b "File_1.txt" "File_2.txt"


                                        output would be



                                        a b differ: byte 25, line 3 is 125 U 116 N





                                        share|improve this answer

























                                          up vote
                                          8
                                          down vote













                                          You can use the command cmp:



                                          cmp -b "File_1.txt" "File_2.txt"


                                          output would be



                                          a b differ: byte 25, line 3 is 125 U 116 N





                                          share|improve this answer























                                            up vote
                                            8
                                            down vote










                                            up vote
                                            8
                                            down vote









                                            You can use the command cmp:



                                            cmp -b "File_1.txt" "File_2.txt"


                                            output would be



                                            a b differ: byte 25, line 3 is 125 U 116 N





                                            share|improve this answer












                                            You can use the command cmp:



                                            cmp -b "File_1.txt" "File_2.txt"


                                            output would be



                                            a b differ: byte 25, line 3 is 125 U 116 N






                                            share|improve this answer












                                            share|improve this answer



                                            share|improve this answer










                                            answered Jun 17 '15 at 10:58









                                            Maythux

                                            50.1k32164214




                                            50.1k32164214






















                                                up vote
                                                7
                                                down vote













                                                Litteraly sticking to the question (file1, file2, outputfile with "has changed" message) the script below works.



                                                Copy the script into an empty file, save it as compare.py, make it executable, run it by the command:



                                                /path/to/compare.py <file1> <file2> <outputfile>


                                                The script:



                                                #!/usr/bin/env python

                                                import sys
                                                file1 = sys.argv[1]; file2 = sys.argv[2]; outfile = sys.argv[3]

                                                def readfile(file):
                                                with open(file) as compare:
                                                return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                data1 = readfile(file1); data2 = readfile(file2)
                                                mismatch = [item[0] for item in data1 if not item in data2]

                                                with open(outfile, "wt") as out:
                                                for line in mismatch:
                                                out.write(line+" has changed"+"n")


                                                With a few extra lines, you can make it either print to an outputfile, or to the terminal, depending on if the outputfile is defined:



                                                To print to a file:



                                                /path/to/compare.py <file1> <file2> <outputfile>


                                                To print to the terminal window:



                                                /path/to/compare.py <file1> <file2> 


                                                The script:



                                                #!/usr/bin/env python

                                                import sys

                                                file1 = sys.argv[1]; file2 = sys.argv[2]
                                                try:
                                                outfile = sys.argv[3]
                                                except IndexError:
                                                outfile = None

                                                def readfile(file):
                                                with open(file) as compare:
                                                return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                data1 = readfile(file1); data2 = readfile(file2)
                                                mismatch = [item[0] for item in data1 if not item in data2]

                                                if outfile != None:
                                                with open(outfile, "wt") as out:
                                                for line in mismatch:
                                                out.write(line+" has changed"+"n")
                                                else:
                                                for line in mismatch:
                                                print line+" has changed"





                                                share|improve this answer



























                                                  up vote
                                                  7
                                                  down vote













                                                  Litteraly sticking to the question (file1, file2, outputfile with "has changed" message) the script below works.



                                                  Copy the script into an empty file, save it as compare.py, make it executable, run it by the command:



                                                  /path/to/compare.py <file1> <file2> <outputfile>


                                                  The script:



                                                  #!/usr/bin/env python

                                                  import sys
                                                  file1 = sys.argv[1]; file2 = sys.argv[2]; outfile = sys.argv[3]

                                                  def readfile(file):
                                                  with open(file) as compare:
                                                  return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                  data1 = readfile(file1); data2 = readfile(file2)
                                                  mismatch = [item[0] for item in data1 if not item in data2]

                                                  with open(outfile, "wt") as out:
                                                  for line in mismatch:
                                                  out.write(line+" has changed"+"n")


                                                  With a few extra lines, you can make it either print to an outputfile, or to the terminal, depending on if the outputfile is defined:



                                                  To print to a file:



                                                  /path/to/compare.py <file1> <file2> <outputfile>


                                                  To print to the terminal window:



                                                  /path/to/compare.py <file1> <file2> 


                                                  The script:



                                                  #!/usr/bin/env python

                                                  import sys

                                                  file1 = sys.argv[1]; file2 = sys.argv[2]
                                                  try:
                                                  outfile = sys.argv[3]
                                                  except IndexError:
                                                  outfile = None

                                                  def readfile(file):
                                                  with open(file) as compare:
                                                  return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                  data1 = readfile(file1); data2 = readfile(file2)
                                                  mismatch = [item[0] for item in data1 if not item in data2]

                                                  if outfile != None:
                                                  with open(outfile, "wt") as out:
                                                  for line in mismatch:
                                                  out.write(line+" has changed"+"n")
                                                  else:
                                                  for line in mismatch:
                                                  print line+" has changed"





                                                  share|improve this answer

























                                                    up vote
                                                    7
                                                    down vote










                                                    up vote
                                                    7
                                                    down vote









                                                    Litteraly sticking to the question (file1, file2, outputfile with "has changed" message) the script below works.



                                                    Copy the script into an empty file, save it as compare.py, make it executable, run it by the command:



                                                    /path/to/compare.py <file1> <file2> <outputfile>


                                                    The script:



                                                    #!/usr/bin/env python

                                                    import sys
                                                    file1 = sys.argv[1]; file2 = sys.argv[2]; outfile = sys.argv[3]

                                                    def readfile(file):
                                                    with open(file) as compare:
                                                    return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                    data1 = readfile(file1); data2 = readfile(file2)
                                                    mismatch = [item[0] for item in data1 if not item in data2]

                                                    with open(outfile, "wt") as out:
                                                    for line in mismatch:
                                                    out.write(line+" has changed"+"n")


                                                    With a few extra lines, you can make it either print to an outputfile, or to the terminal, depending on if the outputfile is defined:



                                                    To print to a file:



                                                    /path/to/compare.py <file1> <file2> <outputfile>


                                                    To print to the terminal window:



                                                    /path/to/compare.py <file1> <file2> 


                                                    The script:



                                                    #!/usr/bin/env python

                                                    import sys

                                                    file1 = sys.argv[1]; file2 = sys.argv[2]
                                                    try:
                                                    outfile = sys.argv[3]
                                                    except IndexError:
                                                    outfile = None

                                                    def readfile(file):
                                                    with open(file) as compare:
                                                    return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                    data1 = readfile(file1); data2 = readfile(file2)
                                                    mismatch = [item[0] for item in data1 if not item in data2]

                                                    if outfile != None:
                                                    with open(outfile, "wt") as out:
                                                    for line in mismatch:
                                                    out.write(line+" has changed"+"n")
                                                    else:
                                                    for line in mismatch:
                                                    print line+" has changed"





                                                    share|improve this answer














                                                    Litteraly sticking to the question (file1, file2, outputfile with "has changed" message) the script below works.



                                                    Copy the script into an empty file, save it as compare.py, make it executable, run it by the command:



                                                    /path/to/compare.py <file1> <file2> <outputfile>


                                                    The script:



                                                    #!/usr/bin/env python

                                                    import sys
                                                    file1 = sys.argv[1]; file2 = sys.argv[2]; outfile = sys.argv[3]

                                                    def readfile(file):
                                                    with open(file) as compare:
                                                    return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                    data1 = readfile(file1); data2 = readfile(file2)
                                                    mismatch = [item[0] for item in data1 if not item in data2]

                                                    with open(outfile, "wt") as out:
                                                    for line in mismatch:
                                                    out.write(line+" has changed"+"n")


                                                    With a few extra lines, you can make it either print to an outputfile, or to the terminal, depending on if the outputfile is defined:



                                                    To print to a file:



                                                    /path/to/compare.py <file1> <file2> <outputfile>


                                                    To print to the terminal window:



                                                    /path/to/compare.py <file1> <file2> 


                                                    The script:



                                                    #!/usr/bin/env python

                                                    import sys

                                                    file1 = sys.argv[1]; file2 = sys.argv[2]
                                                    try:
                                                    outfile = sys.argv[3]
                                                    except IndexError:
                                                    outfile = None

                                                    def readfile(file):
                                                    with open(file) as compare:
                                                    return [item.replace("n", "").split(" ") for item in compare.readlines()]

                                                    data1 = readfile(file1); data2 = readfile(file2)
                                                    mismatch = [item[0] for item in data1 if not item in data2]

                                                    if outfile != None:
                                                    with open(outfile, "wt") as out:
                                                    for line in mismatch:
                                                    out.write(line+" has changed"+"n")
                                                    else:
                                                    for line in mismatch:
                                                    print line+" has changed"






                                                    share|improve this answer














                                                    share|improve this answer



                                                    share|improve this answer








                                                    edited Aug 26 '14 at 7:09

























                                                    answered Aug 25 '14 at 18:34









                                                    Jacob Vlijm

                                                    63.2k9122216




                                                    63.2k9122216






















                                                        up vote
                                                        4
                                                        down vote













                                                        An easy way is to use colordiff, which behaves like diff but colorizes its output. This is very helpful for reading diffs. Using your example,



                                                        $ colordiff -u File_1.txt File_2.txt
                                                        --- File_1.txt 2016-12-24 17:59:17.409490554 -0500
                                                        +++ File_2.txt 2016-12-24 18:00:06.666719659 -0500
                                                        @@ -1,3 +1,3 @@
                                                        User1 US
                                                        User2 US
                                                        -User3 US
                                                        +User3 NG


                                                        where the u option gives a unified diff. This is how the colorized diff looks like:



                                                        enter image description here



                                                        Install colordiff by running sudo apt-get install colordiff.






                                                        share|improve this answer



















                                                        • 1




                                                          If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
                                                          – thomasrutter
                                                          Nov 30 at 5:37















                                                        up vote
                                                        4
                                                        down vote













                                                        An easy way is to use colordiff, which behaves like diff but colorizes its output. This is very helpful for reading diffs. Using your example,



                                                        $ colordiff -u File_1.txt File_2.txt
                                                        --- File_1.txt 2016-12-24 17:59:17.409490554 -0500
                                                        +++ File_2.txt 2016-12-24 18:00:06.666719659 -0500
                                                        @@ -1,3 +1,3 @@
                                                        User1 US
                                                        User2 US
                                                        -User3 US
                                                        +User3 NG


                                                        where the u option gives a unified diff. This is how the colorized diff looks like:



                                                        enter image description here



                                                        Install colordiff by running sudo apt-get install colordiff.






                                                        share|improve this answer



















                                                        • 1




                                                          If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
                                                          – thomasrutter
                                                          Nov 30 at 5:37













                                                        up vote
                                                        4
                                                        down vote










                                                        up vote
                                                        4
                                                        down vote









                                                        An easy way is to use colordiff, which behaves like diff but colorizes its output. This is very helpful for reading diffs. Using your example,



                                                        $ colordiff -u File_1.txt File_2.txt
                                                        --- File_1.txt 2016-12-24 17:59:17.409490554 -0500
                                                        +++ File_2.txt 2016-12-24 18:00:06.666719659 -0500
                                                        @@ -1,3 +1,3 @@
                                                        User1 US
                                                        User2 US
                                                        -User3 US
                                                        +User3 NG


                                                        where the u option gives a unified diff. This is how the colorized diff looks like:



                                                        enter image description here



                                                        Install colordiff by running sudo apt-get install colordiff.






                                                        share|improve this answer














                                                        An easy way is to use colordiff, which behaves like diff but colorizes its output. This is very helpful for reading diffs. Using your example,



                                                        $ colordiff -u File_1.txt File_2.txt
                                                        --- File_1.txt 2016-12-24 17:59:17.409490554 -0500
                                                        +++ File_2.txt 2016-12-24 18:00:06.666719659 -0500
                                                        @@ -1,3 +1,3 @@
                                                        User1 US
                                                        User2 US
                                                        -User3 US
                                                        +User3 NG


                                                        where the u option gives a unified diff. This is how the colorized diff looks like:



                                                        enter image description here



                                                        Install colordiff by running sudo apt-get install colordiff.







                                                        share|improve this answer














                                                        share|improve this answer



                                                        share|improve this answer








                                                        edited Dec 24 '16 at 23:32

























                                                        answered Dec 24 '16 at 23:05









                                                        edwinksl

                                                        16.5k115385




                                                        16.5k115385








                                                        • 1




                                                          If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
                                                          – thomasrutter
                                                          Nov 30 at 5:37














                                                        • 1




                                                          If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
                                                          – thomasrutter
                                                          Nov 30 at 5:37








                                                        1




                                                        1




                                                        If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
                                                        – thomasrutter
                                                        Nov 30 at 5:37




                                                        If you want colors I find the diff built into vim to actually be easy to use, as in the answer by Mr.S
                                                        – thomasrutter
                                                        Nov 30 at 5:37










                                                        up vote
                                                        2
                                                        down vote













                                                        Additional answer



                                                        If there's no need to know what parts of the files differ, you can use checksum of the file. There's many ways to do that, using md5sum or sha256sum. Basically , each of them outputs a string to which a file contents hash. If the two files are the same, their hash will be the same as well. This is often used when you download software, such as Ubuntu installation iso images. They're often used for verifying integrity of a downloaded content.



                                                        Consider script below, where you can give two files as arguments, and the file will tell you if they are the same or not.



                                                        #!/bin/bash

                                                        # Check if both files exist
                                                        if ! [ -e "$1" ];
                                                        then
                                                        printf "%s doesn't existn" "$1"
                                                        exit 2
                                                        elif ! [ -e "$2" ]
                                                        then
                                                        printf "%s doesn't existn" "$2"
                                                        exit 2
                                                        fi

                                                        # Get checksums of eithe file
                                                        file1_sha=$( sha256sum "$1" | awk '{print $1}')
                                                        file2_sha=$( sha256sum "$2" | awk '{print $1}')

                                                        # Compare the checksums
                                                        if [ "x$file1_sha" = "x$file2_sha" ]
                                                        then
                                                        printf "Files %s and %s are the samen" "$1" "$2"
                                                        exit 0
                                                        else
                                                        printf "Files %s and %s are differentn" "$1" "$2"
                                                        exit 1
                                                        fi


                                                        Sample run:



                                                        $ ./compare_files.sh /etc/passwd ./passwd_copy.txt                                                                
                                                        Files /etc/passwd and ./passwd_copy.txt are the same
                                                        $ echo $?
                                                        0
                                                        $ ./compare_files.sh /etc/passwd /etc/default/grub
                                                        Files /etc/passwd and /etc/default/grub are different
                                                        $ echo $?
                                                        1


                                                        Older answer



                                                        In addition there is comm command, which compares two sorted files, and gives output in 3 colums : column 1 for items unique to file #1, column 2 for items unique to file #2, and column 3 for items present in both files.



                                                        To suppress either column you can use switches -1, -2 , and -3. Using -3 will show the lines that differ.



                                                        Bellow you can see the screenshot of the command in action.



                                                        enter image description here



                                                        There is just one requirement - the files must be sorted for them to be compared properly. sort command can be used for that purpose. Bellow is another screenshot , where files are sorted and then compared. Lines starting on the left bellong to File_1 only , lines starting on column 2 belong to File_2 only



                                                        enter image description here






                                                        share|improve this answer























                                                        • @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
                                                          – Sergiy Kolodyazhnyy
                                                          May 23 '15 at 9:17















                                                        up vote
                                                        2
                                                        down vote













                                                        Additional answer



                                                        If there's no need to know what parts of the files differ, you can use checksum of the file. There's many ways to do that, using md5sum or sha256sum. Basically , each of them outputs a string to which a file contents hash. If the two files are the same, their hash will be the same as well. This is often used when you download software, such as Ubuntu installation iso images. They're often used for verifying integrity of a downloaded content.



                                                        Consider script below, where you can give two files as arguments, and the file will tell you if they are the same or not.



                                                        #!/bin/bash

                                                        # Check if both files exist
                                                        if ! [ -e "$1" ];
                                                        then
                                                        printf "%s doesn't existn" "$1"
                                                        exit 2
                                                        elif ! [ -e "$2" ]
                                                        then
                                                        printf "%s doesn't existn" "$2"
                                                        exit 2
                                                        fi

                                                        # Get checksums of eithe file
                                                        file1_sha=$( sha256sum "$1" | awk '{print $1}')
                                                        file2_sha=$( sha256sum "$2" | awk '{print $1}')

                                                        # Compare the checksums
                                                        if [ "x$file1_sha" = "x$file2_sha" ]
                                                        then
                                                        printf "Files %s and %s are the samen" "$1" "$2"
                                                        exit 0
                                                        else
                                                        printf "Files %s and %s are differentn" "$1" "$2"
                                                        exit 1
                                                        fi


                                                        Sample run:



                                                        $ ./compare_files.sh /etc/passwd ./passwd_copy.txt                                                                
                                                        Files /etc/passwd and ./passwd_copy.txt are the same
                                                        $ echo $?
                                                        0
                                                        $ ./compare_files.sh /etc/passwd /etc/default/grub
                                                        Files /etc/passwd and /etc/default/grub are different
                                                        $ echo $?
                                                        1


                                                        Older answer



                                                        In addition there is comm command, which compares two sorted files, and gives output in 3 colums : column 1 for items unique to file #1, column 2 for items unique to file #2, and column 3 for items present in both files.



                                                        To suppress either column you can use switches -1, -2 , and -3. Using -3 will show the lines that differ.



                                                        Bellow you can see the screenshot of the command in action.



                                                        enter image description here



                                                        There is just one requirement - the files must be sorted for them to be compared properly. sort command can be used for that purpose. Bellow is another screenshot , where files are sorted and then compared. Lines starting on the left bellong to File_1 only , lines starting on column 2 belong to File_2 only



                                                        enter image description here






                                                        share|improve this answer























                                                        • @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
                                                          – Sergiy Kolodyazhnyy
                                                          May 23 '15 at 9:17













                                                        up vote
                                                        2
                                                        down vote










                                                        up vote
                                                        2
                                                        down vote









                                                        Additional answer



                                                        If there's no need to know what parts of the files differ, you can use checksum of the file. There's many ways to do that, using md5sum or sha256sum. Basically , each of them outputs a string to which a file contents hash. If the two files are the same, their hash will be the same as well. This is often used when you download software, such as Ubuntu installation iso images. They're often used for verifying integrity of a downloaded content.



                                                        Consider script below, where you can give two files as arguments, and the file will tell you if they are the same or not.



                                                        #!/bin/bash

                                                        # Check if both files exist
                                                        if ! [ -e "$1" ];
                                                        then
                                                        printf "%s doesn't existn" "$1"
                                                        exit 2
                                                        elif ! [ -e "$2" ]
                                                        then
                                                        printf "%s doesn't existn" "$2"
                                                        exit 2
                                                        fi

                                                        # Get checksums of eithe file
                                                        file1_sha=$( sha256sum "$1" | awk '{print $1}')
                                                        file2_sha=$( sha256sum "$2" | awk '{print $1}')

                                                        # Compare the checksums
                                                        if [ "x$file1_sha" = "x$file2_sha" ]
                                                        then
                                                        printf "Files %s and %s are the samen" "$1" "$2"
                                                        exit 0
                                                        else
                                                        printf "Files %s and %s are differentn" "$1" "$2"
                                                        exit 1
                                                        fi


                                                        Sample run:



                                                        $ ./compare_files.sh /etc/passwd ./passwd_copy.txt                                                                
                                                        Files /etc/passwd and ./passwd_copy.txt are the same
                                                        $ echo $?
                                                        0
                                                        $ ./compare_files.sh /etc/passwd /etc/default/grub
                                                        Files /etc/passwd and /etc/default/grub are different
                                                        $ echo $?
                                                        1


                                                        Older answer



                                                        In addition there is comm command, which compares two sorted files, and gives output in 3 colums : column 1 for items unique to file #1, column 2 for items unique to file #2, and column 3 for items present in both files.



                                                        To suppress either column you can use switches -1, -2 , and -3. Using -3 will show the lines that differ.



                                                        Bellow you can see the screenshot of the command in action.



                                                        enter image description here



                                                        There is just one requirement - the files must be sorted for them to be compared properly. sort command can be used for that purpose. Bellow is another screenshot , where files are sorted and then compared. Lines starting on the left bellong to File_1 only , lines starting on column 2 belong to File_2 only



                                                        enter image description here






                                                        share|improve this answer














                                                        Additional answer



                                                        If there's no need to know what parts of the files differ, you can use checksum of the file. There's many ways to do that, using md5sum or sha256sum. Basically , each of them outputs a string to which a file contents hash. If the two files are the same, their hash will be the same as well. This is often used when you download software, such as Ubuntu installation iso images. They're often used for verifying integrity of a downloaded content.



                                                        Consider script below, where you can give two files as arguments, and the file will tell you if they are the same or not.



                                                        #!/bin/bash

                                                        # Check if both files exist
                                                        if ! [ -e "$1" ];
                                                        then
                                                        printf "%s doesn't existn" "$1"
                                                        exit 2
                                                        elif ! [ -e "$2" ]
                                                        then
                                                        printf "%s doesn't existn" "$2"
                                                        exit 2
                                                        fi

                                                        # Get checksums of eithe file
                                                        file1_sha=$( sha256sum "$1" | awk '{print $1}')
                                                        file2_sha=$( sha256sum "$2" | awk '{print $1}')

                                                        # Compare the checksums
                                                        if [ "x$file1_sha" = "x$file2_sha" ]
                                                        then
                                                        printf "Files %s and %s are the samen" "$1" "$2"
                                                        exit 0
                                                        else
                                                        printf "Files %s and %s are differentn" "$1" "$2"
                                                        exit 1
                                                        fi


                                                        Sample run:



                                                        $ ./compare_files.sh /etc/passwd ./passwd_copy.txt                                                                
                                                        Files /etc/passwd and ./passwd_copy.txt are the same
                                                        $ echo $?
                                                        0
                                                        $ ./compare_files.sh /etc/passwd /etc/default/grub
                                                        Files /etc/passwd and /etc/default/grub are different
                                                        $ echo $?
                                                        1


                                                        Older answer



                                                        In addition there is comm command, which compares two sorted files, and gives output in 3 colums : column 1 for items unique to file #1, column 2 for items unique to file #2, and column 3 for items present in both files.



                                                        To suppress either column you can use switches -1, -2 , and -3. Using -3 will show the lines that differ.



                                                        Bellow you can see the screenshot of the command in action.



                                                        enter image description here



                                                        There is just one requirement - the files must be sorted for them to be compared properly. sort command can be used for that purpose. Bellow is another screenshot , where files are sorted and then compared. Lines starting on the left bellong to File_1 only , lines starting on column 2 belong to File_2 only



                                                        enter image description here







                                                        share|improve this answer














                                                        share|improve this answer



                                                        share|improve this answer








                                                        edited Dec 24 '16 at 22:22

























                                                        answered May 22 '15 at 19:51









                                                        Sergiy Kolodyazhnyy

                                                        68.9k9143303




                                                        68.9k9143303












                                                        • @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
                                                          – Sergiy Kolodyazhnyy
                                                          May 23 '15 at 9:17


















                                                        • @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
                                                          – Sergiy Kolodyazhnyy
                                                          May 23 '15 at 9:17
















                                                        @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
                                                        – Sergiy Kolodyazhnyy
                                                        May 23 '15 at 9:17




                                                        @DavidFoerster it's kinda hard to be editing on mobile :) Done now, though
                                                        – Sergiy Kolodyazhnyy
                                                        May 23 '15 at 9:17










                                                        up vote
                                                        1
                                                        down vote













                                                        Install git and use



                                                        $ git diff filename1 filename2


                                                        And you will get output in nice colored format



                                                        Git installation



                                                        $ apt-get update
                                                        $ apt-get install git-core





                                                        share|improve this answer

























                                                          up vote
                                                          1
                                                          down vote













                                                          Install git and use



                                                          $ git diff filename1 filename2


                                                          And you will get output in nice colored format



                                                          Git installation



                                                          $ apt-get update
                                                          $ apt-get install git-core





                                                          share|improve this answer























                                                            up vote
                                                            1
                                                            down vote










                                                            up vote
                                                            1
                                                            down vote









                                                            Install git and use



                                                            $ git diff filename1 filename2


                                                            And you will get output in nice colored format



                                                            Git installation



                                                            $ apt-get update
                                                            $ apt-get install git-core





                                                            share|improve this answer












                                                            Install git and use



                                                            $ git diff filename1 filename2


                                                            And you will get output in nice colored format



                                                            Git installation



                                                            $ apt-get update
                                                            $ apt-get install git-core






                                                            share|improve this answer












                                                            share|improve this answer



                                                            share|improve this answer










                                                            answered Nov 9 at 14:24









                                                            Eric Korolev

                                                            658




                                                            658






















                                                                up vote
                                                                1
                                                                down vote













                                                                colcmp.sh



                                                                Compares name/value pairs in 2 files in the format name valuen. Writes the name to Output_file if changed. Requires bash v4+ for associative arrays.



                                                                Usage



                                                                $ ./colcmp.sh File_1.txt File_2.txt
                                                                User3 changed from 'US' to 'NG'
                                                                no change: User1,User2


                                                                Output_File



                                                                $ cat Output_File
                                                                User3 has changed


                                                                Source (colcmp.sh)



                                                                cmp -s "$1" "$2"
                                                                case "$?" in
                                                                0)
                                                                echo "" > Output_File
                                                                echo "files are identical"
                                                                ;;
                                                                1)
                                                                echo "" > Output_File
                                                                cp "$1" ~/.colcmp.array1.tmp.sh
                                                                sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh
                                                                sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array1.tmp.sh
                                                                sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.array1.tmp.sh
                                                                chmod 755 ~/.colcmp.array1.tmp.sh
                                                                declare -A A1
                                                                source ~/.colcmp.array1.tmp.sh

                                                                cp "$2" ~/.colcmp.array2.tmp.sh
                                                                sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                chmod 755 ~/.colcmp.array2.tmp.sh
                                                                declare -A A2
                                                                source ~/.colcmp.array2.tmp.sh

                                                                USERSWHODIDNOTCHANGE=
                                                                for i in "${!A1[@]}"; do
                                                                if [ "${A2[$i]+x}" = "" ]; then
                                                                echo "$i was removed"
                                                                echo "$i has changed" > Output_File
                                                                fi
                                                                done
                                                                for i in "${!A2[@]}"; do
                                                                if [ "${A1[$i]+x}" = "" ]; then
                                                                echo "$i was added as '${A2[$i]}'"
                                                                echo "$i has changed" > Output_File
                                                                elif [ "${A1[$i]}" != "${A2[$i]}" ]; then
                                                                echo "$i changed from '${A1[$i]}' to '${A2[$i]}'"
                                                                echo "$i has changed" > Output_File
                                                                else
                                                                if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                fi
                                                                USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"
                                                                fi
                                                                done
                                                                if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                echo "no change: $USERSWHODIDNOTCHANGE"
                                                                fi
                                                                ;;
                                                                *)
                                                                echo "error: file not found, access denied, etc..."
                                                                echo "usage: ./colcmp.sh File_1.txt File_2.txt"
                                                                ;;
                                                                esac




                                                                Explanation



                                                                Breakdown of the code and what it means, to the best of my understanding. I welcome edits and suggestions.



                                                                Basic File Compare



                                                                cmp -s "$1" "$2"
                                                                case "$?" in
                                                                0)
                                                                # match
                                                                ;;
                                                                1)
                                                                # compare
                                                                ;;
                                                                *)
                                                                # error
                                                                ;;
                                                                esac


                                                                cmp will set the value of $? as follows:




                                                                • 0 = files match

                                                                • 1 = files differ

                                                                • 2 = error


                                                                I chose to use a case..esac statement to evalute $? because the value of $? changes after every command, including test ([).



                                                                Alternatively I could have used a variable to hold the value of $?:



                                                                cmp -s "$1" "$2"
                                                                CMPRESULT=$?
                                                                if [ $CMPRESULT -eq 0 ]; then
                                                                # match
                                                                elif [ $CMPRESULT -eq 1 ]; then
                                                                # compare
                                                                else
                                                                # error
                                                                fi


                                                                Above does the same thing as the case statement. IDK which I like better.



                                                                Clear the Output



                                                                        echo "" > Output_File


                                                                Above clears the output file so if no users changed, the output file will be empty.



                                                                I do this inside the case statements so that the Output_file remains unchanged on error.



                                                                Copy User File to Shell Script



                                                                        cp "$1" ~/.colcmp.arrays.tmp.sh


                                                                Above copies File_1.txt to the current user's home dir.



                                                                For example, if the current user is john, the above would be the same as cp "File_1.txt" /home/john/.colcmp.arrays.tmp.sh



                                                                Escape Special Characters



                                                                Basically, I'm paranoid. I know that these characters could have special meaning or execute an external program when run in a script as part of variable assignment:




                                                                • ` - back-tick - executes a program and the output as if the output were part of your script

                                                                • $ - dollar sign - usually prefixes a variable

                                                                • ${} - allows for more complex variable substitution

                                                                • $() - idk what this does but i think it can execute code


                                                                What I don't know is how much I don't know about bash. I don't know what other characters might have special meaning, but I want to escape them all with a backslash:



                                                                        sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh


                                                                sed can do a lot more than regular expression pattern matching. The script pattern "s/(find)/(replace)/" specifically performs the pattern match.



                                                                "s/(find)/(replace)/(modifiers)"




                                                                • (find) = ([^A-Za-z0-9 ])


                                                                  • () = capture group 1

                                                                  • = match a character from a specific list of characters

                                                                  • [^] = match any character NOT in a specific list of characters

                                                                  • [^A-Za-z0-9 ] = match any character that is NOT a letter, digit or space




                                                                in english: capture any punctuation or special character as caputure group 1 (\1)




                                                                • (replace) = \\\1


                                                                  • \\ = literal character (\) i.e. a backslash

                                                                  • \1 = capture group 1




                                                                in english: prefix all special characters with a backslash




                                                                • (modifiers) = g


                                                                  • g = globally replace




                                                                in english: if more than one match is found on the same line, replace them all



                                                                Comment Out the Entire Script



                                                                        sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.arrays.tmp.sh


                                                                Above uses a regular expression to prefix every line of ~/.colcmp.arrays.tmp.sh with a bash comment character (#). I do this because later I intend to execute ~/.colcmp.arrays.tmp.sh using the source command and because I don't know for sure the whole format of File_1.txt.



                                                                I don't want to accidentally execute arbitrary code. I don't think anyone does.



                                                                "s/(find)/(replace)/"




                                                                • (find) = ^(.*)$


                                                                  • ^ = beginning of a line

                                                                  • () = capture group 1

                                                                  • .* = anything

                                                                  • $ = end of line




                                                                in english: capture each line as caputure group 1 (\1)




                                                                • (replace) = #\1


                                                                  • # = literal character (#) i.e. a pound symbol or hash

                                                                  • \1 = capture group 1




                                                                in english: replace each line with a pound symbol followed by the line that was replaced



                                                                Convert User Value to A1[User]="value"



                                                                        sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.arrays.tmp.sh


                                                                Above is the core of this script.




                                                                • convert this: #User1 US


                                                                  • to this: A1[User1]="US"

                                                                  • or this: A2[User1]="US" (for the 2nd file)




                                                                "s/(find)/(replace)/"




                                                                • (find) = ^#\s*(\S+)\s+(\S.?)\s$


                                                                  • ^ = beginning of a line

                                                                  • # = literal character (#) i.e. a pound symbol or hash

                                                                  • \s* - zero or more whitespace characters

                                                                  • () = capture group 1

                                                                  • \S+ - one or more NON-whitespace characters

                                                                  • \s+ - one or more whitespace characters

                                                                  • () = capture group 2

                                                                  • \S - exactly one NON-whitespace character

                                                                  • .*? = anything, non-greedy

                                                                  • \s* - zero or more whitespace characters

                                                                  • $ = end of line




                                                                in english:




                                                                • require but ignore leading comment characters (#)

                                                                • ignore leading whitespace

                                                                • capture the first word as caputure group 1 (\1)

                                                                • require a space (or tab, or whitespace)


                                                                  • that will be replaced with an equals sign because

                                                                  • it's not part of any capture group, and because

                                                                  • the (replace) pattern puts an equals sign between capture group 1 and capture group 2



                                                                • capture the rest of the line as capture group 2



                                                                • (replace) = A1\[\1\]="\2"




                                                                  • A1\[ - literal characters A1[ to start array assignment in an array called A1

                                                                  • \1 = capture group 1 - which does not include the leading hash (#) and does not include leading whitespace - in this case capture group 1 is being used to set the name of the name/value pair in the bash associative array.

                                                                  • \]=" = literal characters ]="



                                                                    • ] = close array assignment e.g. A1[User1]="US"


                                                                    • = = assignment operator e.g. variable=value


                                                                    • " = quote value to capture spaces ... although now that i think about it, it would have been easier to let the code above that backslashes everything to also backslash space characters.



                                                                  • \1 = capture group 2 - in this case, the value of the name/value pair

                                                                  • " = closing quote value to capture spaces




                                                                in english: replace each line in the format #name value with an array assignment operator in the format A1[name]="value"



                                                                Make Executable



                                                                        chmod 755 ~/.colcmp.arrays.tmp.sh


                                                                Above uses chmod to make the array script file executable.



                                                                I'm not sure if this is necessary.



                                                                Declare Associative Array (bash v4+)



                                                                        declare -A A1


                                                                The capital -A indicates that the variables declared will be associative arrays.



                                                                This is why the script requires bash v4 or greater.



                                                                Execute our Array Variable Assignment Script



                                                                        source ~/.colcmp.arrays.tmp.sh


                                                                We have already:




                                                                • converted our file from lines of User value to lines of A1[User]="value",

                                                                • made it executable (maybe), and

                                                                • declared A1 as an associative array...


                                                                Above we source the script to run it in the current shell. We do this so we can keep the variable values that get set by the script. If you execute the script directly, it spawns a new shell, and the variable values are lost when the new shell exits, or at least that's my understanding.



                                                                This Should Be a Function



                                                                        cp "$2" ~/.colcmp.array2.tmp.sh
                                                                sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                chmod 755 ~/.colcmp.array2.tmp.sh
                                                                declare -A A2
                                                                source ~/.colcmp.array2.tmp.sh


                                                                We do the same thing for $1 and A1 that we do for $2 and A2. It really should be a function. I think at this point this script is confusing enough and it works, so I'm not gonna fix it.



                                                                Detect Users Removed



                                                                        for i in "${!A1[@]}"; do
                                                                # check for users removed
                                                                done


                                                                Above loops through associative array keys



                                                                            if [ "${A2[$i]+x}" = "" ]; then


                                                                Above uses variable substitution to detect the difference between a value that is unset vs a variable that has been explicitly set to a zero length string.



                                                                Apparently, there are a lot of ways to see if a variable has been set. I chose the one with the most votes.



                                                                                echo "$i has changed" > Output_File


                                                                Above adds the user $i to the Output_File



                                                                Detect Users Added or Changed



                                                                        USERSWHODIDNOTCHANGE=


                                                                Above clears a variable so we can keep track of users that did not change.



                                                                        for i in "${!A2[@]}"; do
                                                                # detect users added, changed and not changed
                                                                done


                                                                Above loops through associative array keys



                                                                            if ! [ "${A1[$i]+x}" != "" ]; then


                                                                Above uses variable substitution to see if a variable has been set.



                                                                                echo "$i was added as '${A2[$i]}'"


                                                                Because $i is the array key (user name) $A2[$i] should return the value associated with the current user from File_2.txt.



                                                                For example, if $i is User1, the above reads as ${A2[User1]}



                                                                                echo "$i has changed" > Output_File


                                                                Above adds the user $i to the Output_File



                                                                            elif [ "${A1[$i]}" != "${A2[$i]}" ]; then


                                                                Because $i is the array key (user name) $A1[$i] should return the value associated with the current user from File_1.txt, and $A2[$i] should return the value from File_2.txt.



                                                                Above compares the associated values for user $i from both files..



                                                                                echo "$i has changed" > Output_File


                                                                Above adds the user $i to the Output_File



                                                                                if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                fi
                                                                USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"


                                                                Above creates a comma separated list of users who did not change. Note there are no spaces in the list, or else the next check would need to be quoted.



                                                                        if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                echo "no change: $USERSWHODIDNOTCHANGE"
                                                                fi


                                                                Above reports the value of $USERSWHODIDNOTCHANGE but only if there is a value in $USERSWHODIDNOTCHANGE. The way this is written, $USERSWHODIDNOTCHANGE cannot contain any spaces. If it does need spaces, above could be rewritten as follows:



                                                                        if [ "$USERSWHODIDNOTCHANGE" != "" ]; then
                                                                echo "no change: $USERSWHODIDNOTCHANGE"
                                                                fi





                                                                share|improve this answer

























                                                                  up vote
                                                                  1
                                                                  down vote













                                                                  colcmp.sh



                                                                  Compares name/value pairs in 2 files in the format name valuen. Writes the name to Output_file if changed. Requires bash v4+ for associative arrays.



                                                                  Usage



                                                                  $ ./colcmp.sh File_1.txt File_2.txt
                                                                  User3 changed from 'US' to 'NG'
                                                                  no change: User1,User2


                                                                  Output_File



                                                                  $ cat Output_File
                                                                  User3 has changed


                                                                  Source (colcmp.sh)



                                                                  cmp -s "$1" "$2"
                                                                  case "$?" in
                                                                  0)
                                                                  echo "" > Output_File
                                                                  echo "files are identical"
                                                                  ;;
                                                                  1)
                                                                  echo "" > Output_File
                                                                  cp "$1" ~/.colcmp.array1.tmp.sh
                                                                  sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh
                                                                  sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array1.tmp.sh
                                                                  sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.array1.tmp.sh
                                                                  chmod 755 ~/.colcmp.array1.tmp.sh
                                                                  declare -A A1
                                                                  source ~/.colcmp.array1.tmp.sh

                                                                  cp "$2" ~/.colcmp.array2.tmp.sh
                                                                  sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                  sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                  sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                  chmod 755 ~/.colcmp.array2.tmp.sh
                                                                  declare -A A2
                                                                  source ~/.colcmp.array2.tmp.sh

                                                                  USERSWHODIDNOTCHANGE=
                                                                  for i in "${!A1[@]}"; do
                                                                  if [ "${A2[$i]+x}" = "" ]; then
                                                                  echo "$i was removed"
                                                                  echo "$i has changed" > Output_File
                                                                  fi
                                                                  done
                                                                  for i in "${!A2[@]}"; do
                                                                  if [ "${A1[$i]+x}" = "" ]; then
                                                                  echo "$i was added as '${A2[$i]}'"
                                                                  echo "$i has changed" > Output_File
                                                                  elif [ "${A1[$i]}" != "${A2[$i]}" ]; then
                                                                  echo "$i changed from '${A1[$i]}' to '${A2[$i]}'"
                                                                  echo "$i has changed" > Output_File
                                                                  else
                                                                  if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                  USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                  fi
                                                                  USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"
                                                                  fi
                                                                  done
                                                                  if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                  echo "no change: $USERSWHODIDNOTCHANGE"
                                                                  fi
                                                                  ;;
                                                                  *)
                                                                  echo "error: file not found, access denied, etc..."
                                                                  echo "usage: ./colcmp.sh File_1.txt File_2.txt"
                                                                  ;;
                                                                  esac




                                                                  Explanation



                                                                  Breakdown of the code and what it means, to the best of my understanding. I welcome edits and suggestions.



                                                                  Basic File Compare



                                                                  cmp -s "$1" "$2"
                                                                  case "$?" in
                                                                  0)
                                                                  # match
                                                                  ;;
                                                                  1)
                                                                  # compare
                                                                  ;;
                                                                  *)
                                                                  # error
                                                                  ;;
                                                                  esac


                                                                  cmp will set the value of $? as follows:




                                                                  • 0 = files match

                                                                  • 1 = files differ

                                                                  • 2 = error


                                                                  I chose to use a case..esac statement to evalute $? because the value of $? changes after every command, including test ([).



                                                                  Alternatively I could have used a variable to hold the value of $?:



                                                                  cmp -s "$1" "$2"
                                                                  CMPRESULT=$?
                                                                  if [ $CMPRESULT -eq 0 ]; then
                                                                  # match
                                                                  elif [ $CMPRESULT -eq 1 ]; then
                                                                  # compare
                                                                  else
                                                                  # error
                                                                  fi


                                                                  Above does the same thing as the case statement. IDK which I like better.



                                                                  Clear the Output



                                                                          echo "" > Output_File


                                                                  Above clears the output file so if no users changed, the output file will be empty.



                                                                  I do this inside the case statements so that the Output_file remains unchanged on error.



                                                                  Copy User File to Shell Script



                                                                          cp "$1" ~/.colcmp.arrays.tmp.sh


                                                                  Above copies File_1.txt to the current user's home dir.



                                                                  For example, if the current user is john, the above would be the same as cp "File_1.txt" /home/john/.colcmp.arrays.tmp.sh



                                                                  Escape Special Characters



                                                                  Basically, I'm paranoid. I know that these characters could have special meaning or execute an external program when run in a script as part of variable assignment:




                                                                  • ` - back-tick - executes a program and the output as if the output were part of your script

                                                                  • $ - dollar sign - usually prefixes a variable

                                                                  • ${} - allows for more complex variable substitution

                                                                  • $() - idk what this does but i think it can execute code


                                                                  What I don't know is how much I don't know about bash. I don't know what other characters might have special meaning, but I want to escape them all with a backslash:



                                                                          sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh


                                                                  sed can do a lot more than regular expression pattern matching. The script pattern "s/(find)/(replace)/" specifically performs the pattern match.



                                                                  "s/(find)/(replace)/(modifiers)"




                                                                  • (find) = ([^A-Za-z0-9 ])


                                                                    • () = capture group 1

                                                                    • = match a character from a specific list of characters

                                                                    • [^] = match any character NOT in a specific list of characters

                                                                    • [^A-Za-z0-9 ] = match any character that is NOT a letter, digit or space




                                                                  in english: capture any punctuation or special character as caputure group 1 (\1)




                                                                  • (replace) = \\\1


                                                                    • \\ = literal character (\) i.e. a backslash

                                                                    • \1 = capture group 1




                                                                  in english: prefix all special characters with a backslash




                                                                  • (modifiers) = g


                                                                    • g = globally replace




                                                                  in english: if more than one match is found on the same line, replace them all



                                                                  Comment Out the Entire Script



                                                                          sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.arrays.tmp.sh


                                                                  Above uses a regular expression to prefix every line of ~/.colcmp.arrays.tmp.sh with a bash comment character (#). I do this because later I intend to execute ~/.colcmp.arrays.tmp.sh using the source command and because I don't know for sure the whole format of File_1.txt.



                                                                  I don't want to accidentally execute arbitrary code. I don't think anyone does.



                                                                  "s/(find)/(replace)/"




                                                                  • (find) = ^(.*)$


                                                                    • ^ = beginning of a line

                                                                    • () = capture group 1

                                                                    • .* = anything

                                                                    • $ = end of line




                                                                  in english: capture each line as caputure group 1 (\1)




                                                                  • (replace) = #\1


                                                                    • # = literal character (#) i.e. a pound symbol or hash

                                                                    • \1 = capture group 1




                                                                  in english: replace each line with a pound symbol followed by the line that was replaced



                                                                  Convert User Value to A1[User]="value"



                                                                          sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.arrays.tmp.sh


                                                                  Above is the core of this script.




                                                                  • convert this: #User1 US


                                                                    • to this: A1[User1]="US"

                                                                    • or this: A2[User1]="US" (for the 2nd file)




                                                                  "s/(find)/(replace)/"




                                                                  • (find) = ^#\s*(\S+)\s+(\S.?)\s$


                                                                    • ^ = beginning of a line

                                                                    • # = literal character (#) i.e. a pound symbol or hash

                                                                    • \s* - zero or more whitespace characters

                                                                    • () = capture group 1

                                                                    • \S+ - one or more NON-whitespace characters

                                                                    • \s+ - one or more whitespace characters

                                                                    • () = capture group 2

                                                                    • \S - exactly one NON-whitespace character

                                                                    • .*? = anything, non-greedy

                                                                    • \s* - zero or more whitespace characters

                                                                    • $ = end of line




                                                                  in english:




                                                                  • require but ignore leading comment characters (#)

                                                                  • ignore leading whitespace

                                                                  • capture the first word as caputure group 1 (\1)

                                                                  • require a space (or tab, or whitespace)


                                                                    • that will be replaced with an equals sign because

                                                                    • it's not part of any capture group, and because

                                                                    • the (replace) pattern puts an equals sign between capture group 1 and capture group 2



                                                                  • capture the rest of the line as capture group 2



                                                                  • (replace) = A1\[\1\]="\2"




                                                                    • A1\[ - literal characters A1[ to start array assignment in an array called A1

                                                                    • \1 = capture group 1 - which does not include the leading hash (#) and does not include leading whitespace - in this case capture group 1 is being used to set the name of the name/value pair in the bash associative array.

                                                                    • \]=" = literal characters ]="



                                                                      • ] = close array assignment e.g. A1[User1]="US"


                                                                      • = = assignment operator e.g. variable=value


                                                                      • " = quote value to capture spaces ... although now that i think about it, it would have been easier to let the code above that backslashes everything to also backslash space characters.



                                                                    • \1 = capture group 2 - in this case, the value of the name/value pair

                                                                    • " = closing quote value to capture spaces




                                                                  in english: replace each line in the format #name value with an array assignment operator in the format A1[name]="value"



                                                                  Make Executable



                                                                          chmod 755 ~/.colcmp.arrays.tmp.sh


                                                                  Above uses chmod to make the array script file executable.



                                                                  I'm not sure if this is necessary.



                                                                  Declare Associative Array (bash v4+)



                                                                          declare -A A1


                                                                  The capital -A indicates that the variables declared will be associative arrays.



                                                                  This is why the script requires bash v4 or greater.



                                                                  Execute our Array Variable Assignment Script



                                                                          source ~/.colcmp.arrays.tmp.sh


                                                                  We have already:




                                                                  • converted our file from lines of User value to lines of A1[User]="value",

                                                                  • made it executable (maybe), and

                                                                  • declared A1 as an associative array...


                                                                  Above we source the script to run it in the current shell. We do this so we can keep the variable values that get set by the script. If you execute the script directly, it spawns a new shell, and the variable values are lost when the new shell exits, or at least that's my understanding.



                                                                  This Should Be a Function



                                                                          cp "$2" ~/.colcmp.array2.tmp.sh
                                                                  sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                  sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                  sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                  chmod 755 ~/.colcmp.array2.tmp.sh
                                                                  declare -A A2
                                                                  source ~/.colcmp.array2.tmp.sh


                                                                  We do the same thing for $1 and A1 that we do for $2 and A2. It really should be a function. I think at this point this script is confusing enough and it works, so I'm not gonna fix it.



                                                                  Detect Users Removed



                                                                          for i in "${!A1[@]}"; do
                                                                  # check for users removed
                                                                  done


                                                                  Above loops through associative array keys



                                                                              if [ "${A2[$i]+x}" = "" ]; then


                                                                  Above uses variable substitution to detect the difference between a value that is unset vs a variable that has been explicitly set to a zero length string.



                                                                  Apparently, there are a lot of ways to see if a variable has been set. I chose the one with the most votes.



                                                                                  echo "$i has changed" > Output_File


                                                                  Above adds the user $i to the Output_File



                                                                  Detect Users Added or Changed



                                                                          USERSWHODIDNOTCHANGE=


                                                                  Above clears a variable so we can keep track of users that did not change.



                                                                          for i in "${!A2[@]}"; do
                                                                  # detect users added, changed and not changed
                                                                  done


                                                                  Above loops through associative array keys



                                                                              if ! [ "${A1[$i]+x}" != "" ]; then


                                                                  Above uses variable substitution to see if a variable has been set.



                                                                                  echo "$i was added as '${A2[$i]}'"


                                                                  Because $i is the array key (user name) $A2[$i] should return the value associated with the current user from File_2.txt.



                                                                  For example, if $i is User1, the above reads as ${A2[User1]}



                                                                                  echo "$i has changed" > Output_File


                                                                  Above adds the user $i to the Output_File



                                                                              elif [ "${A1[$i]}" != "${A2[$i]}" ]; then


                                                                  Because $i is the array key (user name) $A1[$i] should return the value associated with the current user from File_1.txt, and $A2[$i] should return the value from File_2.txt.



                                                                  Above compares the associated values for user $i from both files..



                                                                                  echo "$i has changed" > Output_File


                                                                  Above adds the user $i to the Output_File



                                                                                  if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                  USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                  fi
                                                                  USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"


                                                                  Above creates a comma separated list of users who did not change. Note there are no spaces in the list, or else the next check would need to be quoted.



                                                                          if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                  echo "no change: $USERSWHODIDNOTCHANGE"
                                                                  fi


                                                                  Above reports the value of $USERSWHODIDNOTCHANGE but only if there is a value in $USERSWHODIDNOTCHANGE. The way this is written, $USERSWHODIDNOTCHANGE cannot contain any spaces. If it does need spaces, above could be rewritten as follows:



                                                                          if [ "$USERSWHODIDNOTCHANGE" != "" ]; then
                                                                  echo "no change: $USERSWHODIDNOTCHANGE"
                                                                  fi





                                                                  share|improve this answer























                                                                    up vote
                                                                    1
                                                                    down vote










                                                                    up vote
                                                                    1
                                                                    down vote









                                                                    colcmp.sh



                                                                    Compares name/value pairs in 2 files in the format name valuen. Writes the name to Output_file if changed. Requires bash v4+ for associative arrays.



                                                                    Usage



                                                                    $ ./colcmp.sh File_1.txt File_2.txt
                                                                    User3 changed from 'US' to 'NG'
                                                                    no change: User1,User2


                                                                    Output_File



                                                                    $ cat Output_File
                                                                    User3 has changed


                                                                    Source (colcmp.sh)



                                                                    cmp -s "$1" "$2"
                                                                    case "$?" in
                                                                    0)
                                                                    echo "" > Output_File
                                                                    echo "files are identical"
                                                                    ;;
                                                                    1)
                                                                    echo "" > Output_File
                                                                    cp "$1" ~/.colcmp.array1.tmp.sh
                                                                    sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh
                                                                    sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array1.tmp.sh
                                                                    sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.array1.tmp.sh
                                                                    chmod 755 ~/.colcmp.array1.tmp.sh
                                                                    declare -A A1
                                                                    source ~/.colcmp.array1.tmp.sh

                                                                    cp "$2" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                    chmod 755 ~/.colcmp.array2.tmp.sh
                                                                    declare -A A2
                                                                    source ~/.colcmp.array2.tmp.sh

                                                                    USERSWHODIDNOTCHANGE=
                                                                    for i in "${!A1[@]}"; do
                                                                    if [ "${A2[$i]+x}" = "" ]; then
                                                                    echo "$i was removed"
                                                                    echo "$i has changed" > Output_File
                                                                    fi
                                                                    done
                                                                    for i in "${!A2[@]}"; do
                                                                    if [ "${A1[$i]+x}" = "" ]; then
                                                                    echo "$i was added as '${A2[$i]}'"
                                                                    echo "$i has changed" > Output_File
                                                                    elif [ "${A1[$i]}" != "${A2[$i]}" ]; then
                                                                    echo "$i changed from '${A1[$i]}' to '${A2[$i]}'"
                                                                    echo "$i has changed" > Output_File
                                                                    else
                                                                    if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    done
                                                                    if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    echo "no change: $USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    ;;
                                                                    *)
                                                                    echo "error: file not found, access denied, etc..."
                                                                    echo "usage: ./colcmp.sh File_1.txt File_2.txt"
                                                                    ;;
                                                                    esac




                                                                    Explanation



                                                                    Breakdown of the code and what it means, to the best of my understanding. I welcome edits and suggestions.



                                                                    Basic File Compare



                                                                    cmp -s "$1" "$2"
                                                                    case "$?" in
                                                                    0)
                                                                    # match
                                                                    ;;
                                                                    1)
                                                                    # compare
                                                                    ;;
                                                                    *)
                                                                    # error
                                                                    ;;
                                                                    esac


                                                                    cmp will set the value of $? as follows:




                                                                    • 0 = files match

                                                                    • 1 = files differ

                                                                    • 2 = error


                                                                    I chose to use a case..esac statement to evalute $? because the value of $? changes after every command, including test ([).



                                                                    Alternatively I could have used a variable to hold the value of $?:



                                                                    cmp -s "$1" "$2"
                                                                    CMPRESULT=$?
                                                                    if [ $CMPRESULT -eq 0 ]; then
                                                                    # match
                                                                    elif [ $CMPRESULT -eq 1 ]; then
                                                                    # compare
                                                                    else
                                                                    # error
                                                                    fi


                                                                    Above does the same thing as the case statement. IDK which I like better.



                                                                    Clear the Output



                                                                            echo "" > Output_File


                                                                    Above clears the output file so if no users changed, the output file will be empty.



                                                                    I do this inside the case statements so that the Output_file remains unchanged on error.



                                                                    Copy User File to Shell Script



                                                                            cp "$1" ~/.colcmp.arrays.tmp.sh


                                                                    Above copies File_1.txt to the current user's home dir.



                                                                    For example, if the current user is john, the above would be the same as cp "File_1.txt" /home/john/.colcmp.arrays.tmp.sh



                                                                    Escape Special Characters



                                                                    Basically, I'm paranoid. I know that these characters could have special meaning or execute an external program when run in a script as part of variable assignment:




                                                                    • ` - back-tick - executes a program and the output as if the output were part of your script

                                                                    • $ - dollar sign - usually prefixes a variable

                                                                    • ${} - allows for more complex variable substitution

                                                                    • $() - idk what this does but i think it can execute code


                                                                    What I don't know is how much I don't know about bash. I don't know what other characters might have special meaning, but I want to escape them all with a backslash:



                                                                            sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh


                                                                    sed can do a lot more than regular expression pattern matching. The script pattern "s/(find)/(replace)/" specifically performs the pattern match.



                                                                    "s/(find)/(replace)/(modifiers)"




                                                                    • (find) = ([^A-Za-z0-9 ])


                                                                      • () = capture group 1

                                                                      • = match a character from a specific list of characters

                                                                      • [^] = match any character NOT in a specific list of characters

                                                                      • [^A-Za-z0-9 ] = match any character that is NOT a letter, digit or space




                                                                    in english: capture any punctuation or special character as caputure group 1 (\1)




                                                                    • (replace) = \\\1


                                                                      • \\ = literal character (\) i.e. a backslash

                                                                      • \1 = capture group 1




                                                                    in english: prefix all special characters with a backslash




                                                                    • (modifiers) = g


                                                                      • g = globally replace




                                                                    in english: if more than one match is found on the same line, replace them all



                                                                    Comment Out the Entire Script



                                                                            sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.arrays.tmp.sh


                                                                    Above uses a regular expression to prefix every line of ~/.colcmp.arrays.tmp.sh with a bash comment character (#). I do this because later I intend to execute ~/.colcmp.arrays.tmp.sh using the source command and because I don't know for sure the whole format of File_1.txt.



                                                                    I don't want to accidentally execute arbitrary code. I don't think anyone does.



                                                                    "s/(find)/(replace)/"




                                                                    • (find) = ^(.*)$


                                                                      • ^ = beginning of a line

                                                                      • () = capture group 1

                                                                      • .* = anything

                                                                      • $ = end of line




                                                                    in english: capture each line as caputure group 1 (\1)




                                                                    • (replace) = #\1


                                                                      • # = literal character (#) i.e. a pound symbol or hash

                                                                      • \1 = capture group 1




                                                                    in english: replace each line with a pound symbol followed by the line that was replaced



                                                                    Convert User Value to A1[User]="value"



                                                                            sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.arrays.tmp.sh


                                                                    Above is the core of this script.




                                                                    • convert this: #User1 US


                                                                      • to this: A1[User1]="US"

                                                                      • or this: A2[User1]="US" (for the 2nd file)




                                                                    "s/(find)/(replace)/"




                                                                    • (find) = ^#\s*(\S+)\s+(\S.?)\s$


                                                                      • ^ = beginning of a line

                                                                      • # = literal character (#) i.e. a pound symbol or hash

                                                                      • \s* - zero or more whitespace characters

                                                                      • () = capture group 1

                                                                      • \S+ - one or more NON-whitespace characters

                                                                      • \s+ - one or more whitespace characters

                                                                      • () = capture group 2

                                                                      • \S - exactly one NON-whitespace character

                                                                      • .*? = anything, non-greedy

                                                                      • \s* - zero or more whitespace characters

                                                                      • $ = end of line




                                                                    in english:




                                                                    • require but ignore leading comment characters (#)

                                                                    • ignore leading whitespace

                                                                    • capture the first word as caputure group 1 (\1)

                                                                    • require a space (or tab, or whitespace)


                                                                      • that will be replaced with an equals sign because

                                                                      • it's not part of any capture group, and because

                                                                      • the (replace) pattern puts an equals sign between capture group 1 and capture group 2



                                                                    • capture the rest of the line as capture group 2



                                                                    • (replace) = A1\[\1\]="\2"




                                                                      • A1\[ - literal characters A1[ to start array assignment in an array called A1

                                                                      • \1 = capture group 1 - which does not include the leading hash (#) and does not include leading whitespace - in this case capture group 1 is being used to set the name of the name/value pair in the bash associative array.

                                                                      • \]=" = literal characters ]="



                                                                        • ] = close array assignment e.g. A1[User1]="US"


                                                                        • = = assignment operator e.g. variable=value


                                                                        • " = quote value to capture spaces ... although now that i think about it, it would have been easier to let the code above that backslashes everything to also backslash space characters.



                                                                      • \1 = capture group 2 - in this case, the value of the name/value pair

                                                                      • " = closing quote value to capture spaces




                                                                    in english: replace each line in the format #name value with an array assignment operator in the format A1[name]="value"



                                                                    Make Executable



                                                                            chmod 755 ~/.colcmp.arrays.tmp.sh


                                                                    Above uses chmod to make the array script file executable.



                                                                    I'm not sure if this is necessary.



                                                                    Declare Associative Array (bash v4+)



                                                                            declare -A A1


                                                                    The capital -A indicates that the variables declared will be associative arrays.



                                                                    This is why the script requires bash v4 or greater.



                                                                    Execute our Array Variable Assignment Script



                                                                            source ~/.colcmp.arrays.tmp.sh


                                                                    We have already:




                                                                    • converted our file from lines of User value to lines of A1[User]="value",

                                                                    • made it executable (maybe), and

                                                                    • declared A1 as an associative array...


                                                                    Above we source the script to run it in the current shell. We do this so we can keep the variable values that get set by the script. If you execute the script directly, it spawns a new shell, and the variable values are lost when the new shell exits, or at least that's my understanding.



                                                                    This Should Be a Function



                                                                            cp "$2" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                    chmod 755 ~/.colcmp.array2.tmp.sh
                                                                    declare -A A2
                                                                    source ~/.colcmp.array2.tmp.sh


                                                                    We do the same thing for $1 and A1 that we do for $2 and A2. It really should be a function. I think at this point this script is confusing enough and it works, so I'm not gonna fix it.



                                                                    Detect Users Removed



                                                                            for i in "${!A1[@]}"; do
                                                                    # check for users removed
                                                                    done


                                                                    Above loops through associative array keys



                                                                                if [ "${A2[$i]+x}" = "" ]; then


                                                                    Above uses variable substitution to detect the difference between a value that is unset vs a variable that has been explicitly set to a zero length string.



                                                                    Apparently, there are a lot of ways to see if a variable has been set. I chose the one with the most votes.



                                                                                    echo "$i has changed" > Output_File


                                                                    Above adds the user $i to the Output_File



                                                                    Detect Users Added or Changed



                                                                            USERSWHODIDNOTCHANGE=


                                                                    Above clears a variable so we can keep track of users that did not change.



                                                                            for i in "${!A2[@]}"; do
                                                                    # detect users added, changed and not changed
                                                                    done


                                                                    Above loops through associative array keys



                                                                                if ! [ "${A1[$i]+x}" != "" ]; then


                                                                    Above uses variable substitution to see if a variable has been set.



                                                                                    echo "$i was added as '${A2[$i]}'"


                                                                    Because $i is the array key (user name) $A2[$i] should return the value associated with the current user from File_2.txt.



                                                                    For example, if $i is User1, the above reads as ${A2[User1]}



                                                                                    echo "$i has changed" > Output_File


                                                                    Above adds the user $i to the Output_File



                                                                                elif [ "${A1[$i]}" != "${A2[$i]}" ]; then


                                                                    Because $i is the array key (user name) $A1[$i] should return the value associated with the current user from File_1.txt, and $A2[$i] should return the value from File_2.txt.



                                                                    Above compares the associated values for user $i from both files..



                                                                                    echo "$i has changed" > Output_File


                                                                    Above adds the user $i to the Output_File



                                                                                    if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"


                                                                    Above creates a comma separated list of users who did not change. Note there are no spaces in the list, or else the next check would need to be quoted.



                                                                            if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    echo "no change: $USERSWHODIDNOTCHANGE"
                                                                    fi


                                                                    Above reports the value of $USERSWHODIDNOTCHANGE but only if there is a value in $USERSWHODIDNOTCHANGE. The way this is written, $USERSWHODIDNOTCHANGE cannot contain any spaces. If it does need spaces, above could be rewritten as follows:



                                                                            if [ "$USERSWHODIDNOTCHANGE" != "" ]; then
                                                                    echo "no change: $USERSWHODIDNOTCHANGE"
                                                                    fi





                                                                    share|improve this answer












                                                                    colcmp.sh



                                                                    Compares name/value pairs in 2 files in the format name valuen. Writes the name to Output_file if changed. Requires bash v4+ for associative arrays.



                                                                    Usage



                                                                    $ ./colcmp.sh File_1.txt File_2.txt
                                                                    User3 changed from 'US' to 'NG'
                                                                    no change: User1,User2


                                                                    Output_File



                                                                    $ cat Output_File
                                                                    User3 has changed


                                                                    Source (colcmp.sh)



                                                                    cmp -s "$1" "$2"
                                                                    case "$?" in
                                                                    0)
                                                                    echo "" > Output_File
                                                                    echo "files are identical"
                                                                    ;;
                                                                    1)
                                                                    echo "" > Output_File
                                                                    cp "$1" ~/.colcmp.array1.tmp.sh
                                                                    sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh
                                                                    sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array1.tmp.sh
                                                                    sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.array1.tmp.sh
                                                                    chmod 755 ~/.colcmp.array1.tmp.sh
                                                                    declare -A A1
                                                                    source ~/.colcmp.array1.tmp.sh

                                                                    cp "$2" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                    chmod 755 ~/.colcmp.array2.tmp.sh
                                                                    declare -A A2
                                                                    source ~/.colcmp.array2.tmp.sh

                                                                    USERSWHODIDNOTCHANGE=
                                                                    for i in "${!A1[@]}"; do
                                                                    if [ "${A2[$i]+x}" = "" ]; then
                                                                    echo "$i was removed"
                                                                    echo "$i has changed" > Output_File
                                                                    fi
                                                                    done
                                                                    for i in "${!A2[@]}"; do
                                                                    if [ "${A1[$i]+x}" = "" ]; then
                                                                    echo "$i was added as '${A2[$i]}'"
                                                                    echo "$i has changed" > Output_File
                                                                    elif [ "${A1[$i]}" != "${A2[$i]}" ]; then
                                                                    echo "$i changed from '${A1[$i]}' to '${A2[$i]}'"
                                                                    echo "$i has changed" > Output_File
                                                                    else
                                                                    if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    done
                                                                    if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    echo "no change: $USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    ;;
                                                                    *)
                                                                    echo "error: file not found, access denied, etc..."
                                                                    echo "usage: ./colcmp.sh File_1.txt File_2.txt"
                                                                    ;;
                                                                    esac




                                                                    Explanation



                                                                    Breakdown of the code and what it means, to the best of my understanding. I welcome edits and suggestions.



                                                                    Basic File Compare



                                                                    cmp -s "$1" "$2"
                                                                    case "$?" in
                                                                    0)
                                                                    # match
                                                                    ;;
                                                                    1)
                                                                    # compare
                                                                    ;;
                                                                    *)
                                                                    # error
                                                                    ;;
                                                                    esac


                                                                    cmp will set the value of $? as follows:




                                                                    • 0 = files match

                                                                    • 1 = files differ

                                                                    • 2 = error


                                                                    I chose to use a case..esac statement to evalute $? because the value of $? changes after every command, including test ([).



                                                                    Alternatively I could have used a variable to hold the value of $?:



                                                                    cmp -s "$1" "$2"
                                                                    CMPRESULT=$?
                                                                    if [ $CMPRESULT -eq 0 ]; then
                                                                    # match
                                                                    elif [ $CMPRESULT -eq 1 ]; then
                                                                    # compare
                                                                    else
                                                                    # error
                                                                    fi


                                                                    Above does the same thing as the case statement. IDK which I like better.



                                                                    Clear the Output



                                                                            echo "" > Output_File


                                                                    Above clears the output file so if no users changed, the output file will be empty.



                                                                    I do this inside the case statements so that the Output_file remains unchanged on error.



                                                                    Copy User File to Shell Script



                                                                            cp "$1" ~/.colcmp.arrays.tmp.sh


                                                                    Above copies File_1.txt to the current user's home dir.



                                                                    For example, if the current user is john, the above would be the same as cp "File_1.txt" /home/john/.colcmp.arrays.tmp.sh



                                                                    Escape Special Characters



                                                                    Basically, I'm paranoid. I know that these characters could have special meaning or execute an external program when run in a script as part of variable assignment:




                                                                    • ` - back-tick - executes a program and the output as if the output were part of your script

                                                                    • $ - dollar sign - usually prefixes a variable

                                                                    • ${} - allows for more complex variable substitution

                                                                    • $() - idk what this does but i think it can execute code


                                                                    What I don't know is how much I don't know about bash. I don't know what other characters might have special meaning, but I want to escape them all with a backslash:



                                                                            sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array1.tmp.sh


                                                                    sed can do a lot more than regular expression pattern matching. The script pattern "s/(find)/(replace)/" specifically performs the pattern match.



                                                                    "s/(find)/(replace)/(modifiers)"




                                                                    • (find) = ([^A-Za-z0-9 ])


                                                                      • () = capture group 1

                                                                      • = match a character from a specific list of characters

                                                                      • [^] = match any character NOT in a specific list of characters

                                                                      • [^A-Za-z0-9 ] = match any character that is NOT a letter, digit or space




                                                                    in english: capture any punctuation or special character as caputure group 1 (\1)




                                                                    • (replace) = \\\1


                                                                      • \\ = literal character (\) i.e. a backslash

                                                                      • \1 = capture group 1




                                                                    in english: prefix all special characters with a backslash




                                                                    • (modifiers) = g


                                                                      • g = globally replace




                                                                    in english: if more than one match is found on the same line, replace them all



                                                                    Comment Out the Entire Script



                                                                            sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.arrays.tmp.sh


                                                                    Above uses a regular expression to prefix every line of ~/.colcmp.arrays.tmp.sh with a bash comment character (#). I do this because later I intend to execute ~/.colcmp.arrays.tmp.sh using the source command and because I don't know for sure the whole format of File_1.txt.



                                                                    I don't want to accidentally execute arbitrary code. I don't think anyone does.



                                                                    "s/(find)/(replace)/"




                                                                    • (find) = ^(.*)$


                                                                      • ^ = beginning of a line

                                                                      • () = capture group 1

                                                                      • .* = anything

                                                                      • $ = end of line




                                                                    in english: capture each line as caputure group 1 (\1)




                                                                    • (replace) = #\1


                                                                      • # = literal character (#) i.e. a pound symbol or hash

                                                                      • \1 = capture group 1




                                                                    in english: replace each line with a pound symbol followed by the line that was replaced



                                                                    Convert User Value to A1[User]="value"



                                                                            sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A1\[\1\]="\2"/" ~/.colcmp.arrays.tmp.sh


                                                                    Above is the core of this script.




                                                                    • convert this: #User1 US


                                                                      • to this: A1[User1]="US"

                                                                      • or this: A2[User1]="US" (for the 2nd file)




                                                                    "s/(find)/(replace)/"




                                                                    • (find) = ^#\s*(\S+)\s+(\S.?)\s$


                                                                      • ^ = beginning of a line

                                                                      • # = literal character (#) i.e. a pound symbol or hash

                                                                      • \s* - zero or more whitespace characters

                                                                      • () = capture group 1

                                                                      • \S+ - one or more NON-whitespace characters

                                                                      • \s+ - one or more whitespace characters

                                                                      • () = capture group 2

                                                                      • \S - exactly one NON-whitespace character

                                                                      • .*? = anything, non-greedy

                                                                      • \s* - zero or more whitespace characters

                                                                      • $ = end of line




                                                                    in english:




                                                                    • require but ignore leading comment characters (#)

                                                                    • ignore leading whitespace

                                                                    • capture the first word as caputure group 1 (\1)

                                                                    • require a space (or tab, or whitespace)


                                                                      • that will be replaced with an equals sign because

                                                                      • it's not part of any capture group, and because

                                                                      • the (replace) pattern puts an equals sign between capture group 1 and capture group 2



                                                                    • capture the rest of the line as capture group 2



                                                                    • (replace) = A1\[\1\]="\2"




                                                                      • A1\[ - literal characters A1[ to start array assignment in an array called A1

                                                                      • \1 = capture group 1 - which does not include the leading hash (#) and does not include leading whitespace - in this case capture group 1 is being used to set the name of the name/value pair in the bash associative array.

                                                                      • \]=" = literal characters ]="



                                                                        • ] = close array assignment e.g. A1[User1]="US"


                                                                        • = = assignment operator e.g. variable=value


                                                                        • " = quote value to capture spaces ... although now that i think about it, it would have been easier to let the code above that backslashes everything to also backslash space characters.



                                                                      • \1 = capture group 2 - in this case, the value of the name/value pair

                                                                      • " = closing quote value to capture spaces




                                                                    in english: replace each line in the format #name value with an array assignment operator in the format A1[name]="value"



                                                                    Make Executable



                                                                            chmod 755 ~/.colcmp.arrays.tmp.sh


                                                                    Above uses chmod to make the array script file executable.



                                                                    I'm not sure if this is necessary.



                                                                    Declare Associative Array (bash v4+)



                                                                            declare -A A1


                                                                    The capital -A indicates that the variables declared will be associative arrays.



                                                                    This is why the script requires bash v4 or greater.



                                                                    Execute our Array Variable Assignment Script



                                                                            source ~/.colcmp.arrays.tmp.sh


                                                                    We have already:




                                                                    • converted our file from lines of User value to lines of A1[User]="value",

                                                                    • made it executable (maybe), and

                                                                    • declared A1 as an associative array...


                                                                    Above we source the script to run it in the current shell. We do this so we can keep the variable values that get set by the script. If you execute the script directly, it spawns a new shell, and the variable values are lost when the new shell exits, or at least that's my understanding.



                                                                    This Should Be a Function



                                                                            cp "$2" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/([^A-Za-z0-9 ])/\\\1/g" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^(.*)$/#\1/" ~/.colcmp.array2.tmp.sh
                                                                    sed -i -E "s/^#\s*(\S+)\s+(\S.*?)\s*$/A2\[\1\]="\2"/" ~/.colcmp.array2.tmp.sh
                                                                    chmod 755 ~/.colcmp.array2.tmp.sh
                                                                    declare -A A2
                                                                    source ~/.colcmp.array2.tmp.sh


                                                                    We do the same thing for $1 and A1 that we do for $2 and A2. It really should be a function. I think at this point this script is confusing enough and it works, so I'm not gonna fix it.



                                                                    Detect Users Removed



                                                                            for i in "${!A1[@]}"; do
                                                                    # check for users removed
                                                                    done


                                                                    Above loops through associative array keys



                                                                                if [ "${A2[$i]+x}" = "" ]; then


                                                                    Above uses variable substitution to detect the difference between a value that is unset vs a variable that has been explicitly set to a zero length string.



                                                                    Apparently, there are a lot of ways to see if a variable has been set. I chose the one with the most votes.



                                                                                    echo "$i has changed" > Output_File


                                                                    Above adds the user $i to the Output_File



                                                                    Detect Users Added or Changed



                                                                            USERSWHODIDNOTCHANGE=


                                                                    Above clears a variable so we can keep track of users that did not change.



                                                                            for i in "${!A2[@]}"; do
                                                                    # detect users added, changed and not changed
                                                                    done


                                                                    Above loops through associative array keys



                                                                                if ! [ "${A1[$i]+x}" != "" ]; then


                                                                    Above uses variable substitution to see if a variable has been set.



                                                                                    echo "$i was added as '${A2[$i]}'"


                                                                    Because $i is the array key (user name) $A2[$i] should return the value associated with the current user from File_2.txt.



                                                                    For example, if $i is User1, the above reads as ${A2[User1]}



                                                                                    echo "$i has changed" > Output_File


                                                                    Above adds the user $i to the Output_File



                                                                                elif [ "${A1[$i]}" != "${A2[$i]}" ]; then


                                                                    Because $i is the array key (user name) $A1[$i] should return the value associated with the current user from File_1.txt, and $A2[$i] should return the value from File_2.txt.



                                                                    Above compares the associated values for user $i from both files..



                                                                                    echo "$i has changed" > Output_File


                                                                    Above adds the user $i to the Output_File



                                                                                    if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    USERSWHODIDNOTCHANGE=",$USERSWHODIDNOTCHANGE"
                                                                    fi
                                                                    USERSWHODIDNOTCHANGE="$i$USERSWHODIDNOTCHANGE"


                                                                    Above creates a comma separated list of users who did not change. Note there are no spaces in the list, or else the next check would need to be quoted.



                                                                            if [ x$USERSWHODIDNOTCHANGE != x ]; then
                                                                    echo "no change: $USERSWHODIDNOTCHANGE"
                                                                    fi


                                                                    Above reports the value of $USERSWHODIDNOTCHANGE but only if there is a value in $USERSWHODIDNOTCHANGE. The way this is written, $USERSWHODIDNOTCHANGE cannot contain any spaces. If it does need spaces, above could be rewritten as follows:



                                                                            if [ "$USERSWHODIDNOTCHANGE" != "" ]; then
                                                                    echo "no change: $USERSWHODIDNOTCHANGE"
                                                                    fi






                                                                    share|improve this answer












                                                                    share|improve this answer



                                                                    share|improve this answer










                                                                    answered Nov 30 at 1:33









                                                                    Jonathan

                                                                    1136




                                                                    1136






























                                                                        draft saved

                                                                        draft discarded




















































                                                                        Thanks for contributing an answer to Ask Ubuntu!


                                                                        • Please be sure to answer the question. Provide details and share your research!

                                                                        But avoid



                                                                        • Asking for help, clarification, or responding to other answers.

                                                                        • Making statements based on opinion; back them up with references or personal experience.


                                                                        To learn more, see our tips on writing great answers.





                                                                        Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                                                                        Please pay close attention to the following guidance:


                                                                        • Please be sure to answer the question. Provide details and share your research!

                                                                        But avoid



                                                                        • Asking for help, clarification, or responding to other answers.

                                                                        • Making statements based on opinion; back them up with references or personal experience.


                                                                        To learn more, see our tips on writing great answers.




                                                                        draft saved


                                                                        draft discarded














                                                                        StackExchange.ready(
                                                                        function () {
                                                                        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f515900%2fhow-to-compare-two-files%23new-answer', 'question_page');
                                                                        }
                                                                        );

                                                                        Post as a guest















                                                                        Required, but never shown





















































                                                                        Required, but never shown














                                                                        Required, but never shown












                                                                        Required, but never shown







                                                                        Required, but never shown

































                                                                        Required, but never shown














                                                                        Required, but never shown












                                                                        Required, but never shown







                                                                        Required, but never shown







                                                                        Popular posts from this blog

                                                                        Ellipse (mathématiques)

                                                                        Quarter-circle Tiles

                                                                        Mont Emei