Super Advisor
Ratzie
Posts: 852
Registered: ‎01-10-2002
Message 1 of 7 (545 Views)

sh script - find string in two different files and compare

I think I am going to have a hard time explaining this one, so sorry in advance...

 

I have a file that contains multiple entries:

.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
 
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

 

This goes on and on...

I have another file that is almost identical. It may or may not contain the same UPDATE/<number> entries.

What I want to do is take each UPDATE/<number> value,

look it up in the 2nd file, and if it exists, compare the ACCOUNT to see if it is different.

 

I can get the TN part:

grep UPDATE * | awk -F/ '{print $2}' | sed 's/,,,//g' | sort -u > file

 

Then do a:

for tn in `cat file`
do

grep $tn second.file

...

 

But, I have no idea how to capture the ACCOUNT information from one file, and compare to second...

Appreciate the help.

Honored Contributor
Patrick Wallek
Posts: 13,752
Registered: ‎06-21-2000
Message 2 of 7 (532 Views)

Re: sh script - find string in two different files and compare

What do you think of this:

 

# cat file1
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

# cat file2
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR


# cat script
#!/usr/bin/sh

for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file2 | awk -F \/ '{print $2}')
if (( ${FILE1ACCT} == ${FILE2ACCT} )) ; then
   echo "The Account numbers are the same in FILE1 and FILE2 for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; FILE2 ACCT# = ${FILE2ACCT}"
   echo ""
else
   echo "The Account numbers are DIFFERENT in FILE1 and FILE2 for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; FILE2 ACCT# = ${FILE2ACCT}"
   echo ""
fi
done

 And here's what it looks like when the script is run:

 

# ./script
The Account numbers are the same in FILE1 and FILE2 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; FILE2 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and FILE2 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; FILE2 ACCT# = 92728912

 The key is the 'sed -n' statement above.

 

It searches each file for the UPDATE # obtained from file1 (hopefully there is never more than 1 occurrence of any particular update number in a file) and prints the 2nd line below the match, which is the ACCOUNT line, for both file1 and file2.  This assumes that the Account number is always 2 lines below the Update number.
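
If the ACCOUNT line is not always exactly 2 lines below the UPDATE line, the fixed "n;n;p" offset picks up the wrong line. A minimal sketch of one way around that, assuming the record layout shown above (UPDATE line, then an ACCOUNT line somewhere before .EOR). The name account_for is a hypothetical helper, not something from the scripts in this thread. It also compares the whole update number, so a shorter number that is only a substring of another one does not match:

```shell
# account_for UPDATE FILE   (hypothetical helper name)
# Print the ACCOUNT number of the record whose UPDATE line carries exactly
# the given number. Assumes records look like the samples in this thread.
account_for() {
    awk -F'[/,]' -v want="$1" '
        # UPDATE line: remember whether this is the record we are looking for
        $1 == "UPDATE" { in_rec = (($2 "") == (want "")) }
        # .EOR closes the record
        $1 == ".EOR"   { in_rec = 0 }
        # first ACCOUNT line inside the wanted record: print it and stop
        in_rec && $1 == "ACCOUNT" { print $2; exit }
    ' "$2"
}
```

In the loop above it could stand in for the sed/awk pair, for example FILE1ACCT=$(account_for "${UPDATE}" file1). It prints nothing when the update number is not in the file.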

Super Advisor
Ratzie
Posts: 852
Registered: ‎01-10-2002
Message 3 of 7 (527 Views)

Re: sh script - find string in two different files and compare

I will try, but the file2 part is tripping me up: I need to search a directory that has multiple files in it for the TN, then pull the account from whichever file contains it and check.
Honored Contributor
Patrick Wallek
Posts: 13,752
Registered: ‎06-21-2000
Message 4 of 7 (525 Views)

Re: sh script - find string in two different files and compare

Is the FILE1 file in the same directory as the other files you need to check?

Honored Contributor
Patrick Wallek
Posts: 13,752
Registered: ‎06-21-2000
Message 5 of 7 (523 Views)

Re: sh script - find string in two different files and compare

OK, file1 is the same as above and is in the /root/pw directory.

 

I have created 2 other files called file3 and file4 in the /root/pw/test directory.

 

Here are the files, the script and the results:

 

# pwd
/root/pw

# cat test/file3
.BEGIN
UPDATE/1234567890,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR


# cat test/file4
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/2345678901,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR


# cat script
#!/usr/bin/sh

for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
UPDATEFILE=$(grep -l ${UPDATE} /root/pw/test/*)
FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" ${UPDATEFILE} | awk -F \/ '{print $2}')
if (( ${FILE1ACCT} == ${FILE2ACCT} )) ; then
   echo "The Account numbers are the same in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
   echo ""
else
   echo "The Account numbers are DIFFERENT in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
   echo ""
fi
done


# ./script
The Account numbers are the same in FILE1 and /root/pw/test/file4 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; /root/pw/test/file4 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and /root/pw/test/file3 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; /root/pw/test/file3 ACCT# = 92728912

 The 'grep -l' in the script searches through the files in /root/pw/test and returns the filename of the file with the same UPDATE number.  The sed statement for FILE2ACCT then looks for the ACCT# in the file returned by the 'grep -l' command.
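
One thing to watch: 'grep -l' prints every file that matches, so if the same UPDATE number shows up in more than one file, ${UPDATEFILE} holds several names and the sed line returns several account numbers. A rough sketch of handling that case by checking each matching file in turn. The function names files_with_update and compare_update and the one-line output format are made up for this example, and it still assumes the ACCOUNT line is 2 lines below the UPDATE line:

```shell
# files_with_update UPDATE DIR   (hypothetical helper name)
# List every file in DIR with a line that starts "UPDATE/<number>,".
# Anchoring the pattern avoids hits on account numbers or longer numbers.
files_with_update() {
    grep -l "^UPDATE/$1," "$2"/* 2>/dev/null
}

# compare_update UPDATE MASTER DIR   (hypothetical helper name)
# Compare the account in MASTER with the account in each matching file.
# Note: variables are global, as usual in plain sh functions.
compare_update() {
    master_acct=$(sed -n "/^UPDATE\/$1,/{n;n;p;}" "$2" | awk -F/ '{print $2}')
    files_with_update "$1" "$3" | while IFS= read -r f
    do
        acct=$(sed -n "/^UPDATE\/$1,/{n;n;p;}" "$f" | awk -F/ '{print $2}')
        # string comparison, so an empty value does not break the test
        if [ "$master_acct" = "$acct" ]; then
            echo "same $1 $f $master_acct $acct"
        else
            echo "DIFFERENT $1 $f $master_acct $acct"
        fi
    done
}
```

Called as compare_update "${UPDATE}" file1 /root/pw/test inside the for loop, it prints one line per matching file and nothing at all when no file contains the update number.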

Honored Contributor
Patrick Wallek
Posts: 13,752
Registered: ‎06-21-2000
Message 6 of 7 (522 Views)

Re: sh script - find string in two different files and compare

I have just added a check so that if an UPDATE # from file1 is NOT found in any file in the /root/pw/test directory, the script simply continues on.  My previous version just hung in that case, because sed was called without a file name and waited for input on stdin.

 

NEW FILE1

# cat file1
.BEGIN
UPDATE/4567890123,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR



NEW SCRIPT

# cat script
#!/usr/bin/sh

# Walk through every update number found in file1
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
# The account number sits 2 lines below the matching UPDATE line
FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
# Name of the file under /root/pw/test that holds this update number, if any
UPDATEFILE=$(grep -l "${UPDATE}" /root/pw/test/*)
# Only compare when a file was found; otherwise move on to the next number
if [[ ${UPDATEFILE} != "" ]] ; then
   FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" ${UPDATEFILE} | awk -F \/ '{print $2}')
   if (( ${FILE1ACCT} == ${FILE2ACCT} )) ; then
      echo "The Account numbers are the same in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
      echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
      echo ""
   else
      echo "The Account numbers are DIFFERENT in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
      echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
      echo ""
   fi
fi
done


# ./script
The Account numbers are the same in FILE1 and /root/pw/test/file4 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; /root/pw/test/file4 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and /root/pw/test/file3 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; /root/pw/test/file3 ACCT# = 92728912
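
The script above skips update numbers that are not found without saying so. If a list of those would be useful, here is a small sketch that prints them. The function name missing_updates is invented for this example, and it assumes the same "UPDATE/<number>,,," line layout as the sample files:

```shell
# missing_updates MASTER DIR   (hypothetical helper name)
# Print each update number from MASTER that no file in DIR contains.
missing_updates() {
    # pull the number out of every UPDATE line, one per line, de-duplicated
    awk -F'[/,]' '$1 == "UPDATE" { print $2 }' "$1" | sort -u |
    while IFS= read -r tn
    do
        # -q: only the exit status matters here, no output wanted
        if ! grep -q "^UPDATE/$tn," "$2"/* 2>/dev/null; then
            echo "$tn"
        fi
    done
}
```

With the file1, file3 and file4 shown in this thread, missing_updates file1 /root/pw/test should print only 4567890123, since that is the one update number in file1 that appears in neither file3 nor file4.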

 

Acclaimed Contributor
Dennis Handly
Posts: 24,957
Registered: ‎03-06-2006
Message 7 of 7 (506 Views)

Re: sh script - find string in two different files and compare


Here is something a little easier to understand that also performs well, since it uses a hash and reads each file only once:

 

awk -v master=file1 '
# finds the number after "/" and before any ","
function crack_number(field) {
   i = split(field, fields, "[/,]")
#   print "found", i, "fields:", fields[2]
   return fields[2] ""  # make sure it is a string
}
BEGIN {
# create a map from update # to account #
while ((getline < master) > 0) {
   if ($1 ~ "UPDATE") {
      update = crack_number($1)
      continue
   }
   if ($1 ~ "ACCOUNT") {
      account = crack_number($1)
#      print update "|" account
      map[update] = account
      continue
   }
}
close(master)
}
/UPDATE/ {
   update = crack_number($1)
   next
}
/ACCOUNT/ {
   account = crack_number($1)
   if (update == "") {
      print "No update # for account", account
      next
   }
   account_m = map[update]
   if (account_m == "") {
#      print "update number", update, "in", FILENAME, "skipped"
      update = ""
      next
   }
   if (account == account_m) {
      print "The Account numbers are the same in FILE1 and", FILENAME, "for update number", update
      print "Update # =", update "; FILE1 ACCT# =", account_m, "; FILE2 ACCT# =", account
   } else {
      print "The Account numbers are DIFFERENT in FILE1 and", FILENAME, "for update number", update
      print "Update # =", update "; FILE1 ACCT# =", account_m, "; FILE2 ACCT# =", account
   }
   print ""
   update = ""
}' file3 file4
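
For comparison, a shorter sketch of the same single-pass idea that reports only the mismatches. It treats the first file on the command line as the master instead of reading it with getline, and it assumes the same record layout as above. The wrapper name compare_accounts and the one-line output format (update, master account, other account, file name) are made up for this example:

```shell
# compare_accounts MASTER FILE...   (hypothetical wrapper name)
# Print "update master_account other_account filename" for every record
# whose account differs from the one MASTER has for that update number.
compare_accounts() {
    awk -F'[/,]' '
        $1 == "UPDATE" { update = $2 ""; next }   # remember number as a string
        $1 == ".EOR"   { update = ""; next }      # record ended, forget it
        $1 != "ACCOUNT" || update == "" { next }  # only ACCOUNT lines matter
        FILENAME == ARGV[1] { map[update] = $2; next }  # master: build the map
        (update in map) && (map[update] "") != ($2 "") {
            print update, map[update], $2, FILENAME
        }
    ' "$@"
}
```

A call such as compare_accounts file1 file3 file4 prints nothing when every shared update number has the same account in both places. Update numbers that are missing from the master are skipped silently.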
