Re: Need help (367 Views)
Occasional Contributor
suchitra
Posts: 4
Registered: ‎03-21-2006
Message 1 of 15 (367 Views)

Need help

I need to compare the columns of two files and write a status (added, modified or deleted) to an output file, along with the records from both files.
For example:

Suppose my first file contains:
file1
==========
p1|y|500
p2|n|500

file2
======
p1| n| 500
p3 | y|501

Now my output file should look like this:

output file
=========
p1 | n| 500 |c b'cas it has been changed from y to n in the 1st file
p2| n |500 | d b'cas p2 is deleted in the 2nd file
p3 | y| 501 | a b'cas p3 is added in the 2nd file

Could you please help me out? I want either a shell script or an awk command.


Honored Contributor
Muthukumar_5
Posts: 4,030
Registered: ‎06-09-2004
Message 2 of 15 (367 Views)

Re: Need help

Use this:

awk -F"\|" '{ var=$0;var1=$1;var2=$2;var3=$3;getline < "file2";split($0,a,"|"); if ( a[1] == var1 ) { if ( a[2] != var2 ) { print var"## c bcas "var2" is changed to "a[2];} if ( a[3] != var3 ) { print var"## c bcas "var3" is changed in 2nd file";} }
else { print var"## d bcas "var1" is deleted in 2nd file";print $0" ## a bcas "a[1]" is added in 2nd file";}}' file1

--
Muthu
Easy to suggest when don't know about the problem!
Honored Contributor
Steve Steel
Posts: 2,909
Registered: ‎02-13-2000
Message 3 of 15 (367 Views)

Re: Need help

Hi

Look at the comm command

comm - select or reject lines common to two sorted files

And www.shelldorado.com


Steve Steel
If you want truly to understand something, try to change it. (Kurt Lewin)
Honored Contributor
Peter Godron
Posts: 4,470
Registered: ‎02-13-2002
Message 4 of 15 (367 Views)

Re: Need help

Suchitra,
my initial solution:

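#!/usr/bin/sh
# Note: join and comm expect both files to be sorted on the key (field 1).
echo "Changed"
# Keys present in both files whose other fields differ; print the file2 values
join -t '|' -o 1.1,1.2,1.3,2.2,2.3 file1 file2 | awk -F'|' '($2 != $4) || ($3 != $5) { print $1 "|" $4 "|" $5 "|c" }'
echo "Deleted"
cut -f1 -d '|' file1 > file1.bck
cut -f1 -d '|' file2 > file2.bck
# Keys only in file1: look up each one so several deleted keys are handled
comm -23 file1.bck file2.bck | while read key; do grep "^$key|" file1; done | sed 's/$/|d/'
echo "Added"
# Keys only in file2
comm -13 file1.bck file2.bck | while read key; do grep "^$key|" file2; done | sed 's/$/|a/'
rm file1.bck file2.bck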
Honored Contributor
Peter Godron
Posts: 4,470
Registered: ‎02-13-2002
Message 5 of 15 (367 Views)

Re: Need help

Muthukumar,
very smooth script!
The first "print var" could be replaced by:
print a[1]"|"a[2]"|"a[3]
to pick up the file2 values, rather than file1.
Honored Contributor
Senthil Kumar .A_1
Posts: 654
Registered: ‎11-05-2003
Message 6 of 15 (367 Views)

Re: Need help

Hi,

I have attached a script that uses "comm" command for your situation.

Regards,
Senthil Kumar .A
Let your effort be such, the very words to define it, by a layman - would sound like a "POETRY" ;)
Honored Contributor
Muthukumar_5
Posts: 4,030
Registered: ‎06-09-2004
Message 7 of 15 (367 Views)

Re: Need help

Peter,

To avoid printing the whole line as a[1]"|"a[2]"|"a[3], I stored it in a separate variable. That helps to simplify the script ;)

--
Muthu
Easy to suggest when don't know about the problem!
Honored Contributor
Peter Godron
Posts: 4,470
Registered: ‎02-13-2002
Message 8 of 15 (367 Views)

Re: Need help

Suchitra,
something else to keep in mind is that any script using comm would only work on sorted files.
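
As a rough illustration of that point (not taken from any post in this thread), sorting the key column before calling comm could look like the sketch below. The function name classify_keys, the demo file names and the use of mktemp are assumptions made for the example, and it reports keys only, not whole records.

```shell
#!/bin/sh
# classify_keys OLD NEW: list keys (field 1) that were deleted or added.
# comm needs sorted input, so the key lists are sorted first.
classify_keys() {
    old_keys=$(mktemp)
    new_keys=$(mktemp)
    cut -d'|' -f1 "$1" | sort > "$old_keys"
    cut -d'|' -f1 "$2" | sort > "$new_keys"
    comm -23 "$old_keys" "$new_keys" | sed 's/$/|d/'   # keys only in OLD
    comm -13 "$old_keys" "$new_keys" | sed 's/$/|a/'   # keys only in NEW
    rm -f "$old_keys" "$new_keys"
}

# Demo with the sample data from the first post, deliberately unsorted
demo_dir=$(mktemp -d)
printf 'p2|n|500\np1|y|500\n' > "$demo_dir/old.txt"
printf 'p3|y|501\np1|n|500\n' > "$demo_dir/new.txt"
classify_keys "$demo_dir/old.txt" "$demo_dir/new.txt"
rm -rf "$demo_dir"
```

With the sample data this should report p2 as deleted and p3 as added, regardless of the order of the input lines.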

Also, are there any duplicate keys like:
p1|y|500
p1|n|300
.
.

Honored Contributor
Muthukumar_5
Posts: 4,030
Registered: ‎06-09-2004
Message 9 of 15 (367 Views)

Re: Need help

Using perl:

#!/usr/bin/perl

open FD1,"file1" or die "Open Error: $!";
open FD2,"file2" or die "Open Error: $!";

# Read both files into arrays, one line per element
@arr1=<FD1>;
@arr2=<FD2>;

for ($i=0;$i<@arr1;$i++)
{
# Strip the newline before splitting so the last field is clean
chomp($arr1[$i]);
chomp($arr2[$i]);

@pat1=split (/\|/,$arr1[$i]);
@pat2=split (/\|/,$arr2[$i]);

if ( $pat1[0] eq $pat2[0] )
{
if ( $pat1[1] ne $pat2[1] )
{

print "$arr1[$i] | c b'cas it has been changed from $pat1[1] to $pat2[1] in field 2 in 2nd file\n";
}
if ( $pat1[2] ne $pat2[2] )
{
print "$arr1[$i] | c b'cas it has been changed from $pat1[2] to $pat2[2] in field 3 in 2nd file\n";
}

}
else
{
print "$arr1[$i] | d b'cas $pat1[0] is deleted in 2nd file\n";
print "$arr2[$i] | a b'cas $pat1[0] is added in 2nd file\n";
}

}

# END

--
Muthu
Easy to suggest when don't know about the problem!
Honored Contributor
Peter Godron
Posts: 4,470
Registered: ‎02-13-2002
Message 10 of 15 (367 Views)

Re: Need help

Suchitra,
do these answers solve your problem?
Can you please have a look at:
http://forums1.itrc.hp.com/service/forums/helptips.do?#28
and then update the record.
Honored Contributor
Hein van den Heuvel
Posts: 6,585
Registered: ‎05-19-2003
Message 11 of 15 (367 Views)

Re: Need help

Hmmm, Muthu... I fail to see how you can solve the problem described with the simple array comparison you suggest. It seems clear to me that any solution needs to focus on the first, 'key' field.
How else can one decide whether a new record appeared in the same place where an old record was deleted?

Anyway...

For a large file, the data probably needs to be pre-sorted, and the two files read simultaneously, comparing key values to keep them in sync.
I presented one example of this in:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=999120
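
A minimal sketch of that sorted, in-step comparison, using sort and join rather than the code behind the link above (which is not reproduced here). The function name merge_status and the "-" placeholder for a missing record are assumptions of this sketch; it also assumes three fields per record and that "-" never occurs as a real field value.

```shell
#!/bin/sh
# merge_status OLD NEW: sort both files on the key, then let join walk
# them in step. Unpaired records get "-" in the fields of the missing side.
merge_status() {
    s1=$(mktemp)
    s2=$(mktemp)
    sort -t'|' -k1,1 "$1" > "$s1"
    sort -t'|' -k1,1 "$2" > "$s2"
    join -t'|' -a1 -a2 -e '-' -o 0,1.2,1.3,2.2,2.3 "$s1" "$s2" |
    awk -F'|' -v OFS='|' '
        $4 == "-" && $5 == "-" { print $1, $2, $3, "d"; next }  # only in OLD
        $2 == "-" && $3 == "-" { print $1, $4, $5, "a"; next }  # only in NEW
        $2 != $4 || $3 != $5   { print $1, $4, $5, "c" }        # changed
    '
    rm -f "$s1" "$s2"
}

# Demo with the sample data from the first post
demo_dir=$(mktemp -d)
printf 'p1|y|500\np2|n|500\n' > "$demo_dir/file1"
printf 'p1|n|500\np3|y|501\n' > "$demo_dir/file2"
merge_status "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

Because join only ever holds the current line of each sorted file, this shape should scale to files that are too large to keep in memory; unchanged records are simply not printed.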


For small files you can just 'slurp' them into a perl associative array, and report based on keys.

Here is an example that only compares the first non-key field. It is easily adapted to compare other fields, or just everything except the key.

------ compare.pl ---------
$file = shift;
open (FILE, "<$file") or die "Failed to open first file: $file.";
while (<FILE>) {
chomp;
($key,$flag,$num) = split (/\|/, $_);
# print "$. $key,$flag,$num\n"; # debug: show each record as it is read
$f1_flag{$key} = $flag;
$f1_num{$key} = $num;
}

$file = shift;
open (FILE, "<$file") or die "Failed to open second file: $file.";
while (<FILE>) {
chomp;
($key,$flag,$num) = split (/\|/, $_);
# print "$. $key,$flag,$num\n"; # debug: show each record as it is read
$f2_flag{$key} = $flag;
$f2_num{$key} = $num;
}

for $key (sort keys %f2_flag) {
$change = "=";
if (defined $f1_flag{$key}) {
$change = "c" if ($f1_flag{$key} ne $f2_flag{$key});
delete $f1_flag{$key};
} else {
$change = "a";
}
print "$key|$f2_flag{$key}|$f2_num{$key}| $change\n"
}

for $key (sort keys %f1_flag) {
print "$key|$f1_flag{$key}|$f1_num{$key}| d\n"
}

---- usage example ----

C:\Temp>type file1.tmp
p1|y|500
p2|n|500
p5|y|500
C:\Temp>type file2.tmp
p1|n|500
p3|y|501
p5|y|500
C:\Temp>perl tmp.pl file1.tmp file2.tmp
p1|n|500 | c
p3|y|501 | a
p5|y|500 | =
p2|n|500 | d


If the input is treated as a 'key' plus everything else, then it simplifies somewhat:

------- compare_2.pl ----------

$file = shift;
open (FILE, "<$file") or die "Failed to open first file: $file.";
while (<FILE>) {
chomp;
$f1{$`} = $' if /\|/;
}

$file = shift;
open (FILE, "<$file") or die "Failed to open second file: $file.";
while (<FILE>) {
chomp;
$f2{$`} = $' if /\|/;
}

for $key (sort keys %f2) {
$change = "=";
if (defined $f1{$key}) {
$change = "c" if ($f1{$key} ne $f2{$key});
delete $f1{$key};
} else {
$change = "a";
}
print "$key|$f2{$key}| $change\n"
}

for $key (sort keys %f1) {
print "$key|$f1{$key}| d\n"
}
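
Since the original question asked for a shell script or an awk command, the same key-plus-everything-else idea can also be sketched in awk. This is an illustrative equivalent, not part of the perl above: the function name compare_by_key is made up for the example, it assumes the first file is not empty (the NR == FNR test relies on that) and that fields carry no stray whitespace, and it skips unchanged records instead of printing "=".

```shell
#!/bin/sh
# compare_by_key OLD NEW: load OLD into an awk array keyed on field 1,
# then classify each NEW record; keys never seen in NEW are deletions.
compare_by_key() {
    awk -F'|' '
        NR == FNR { old[$1] = $0; next }        # first file: remember records
        !($1 in old) { print $0 "|a"; next }    # key not seen before: added
        {
            if (old[$1] != $0) print $0 "|c"    # same key, different data
            delete old[$1]                      # mark the key as handled
        }
        END { for (k in old) print old[k] "|d" }  # never matched: deleted
    ' "$1" "$2" | sort
}

# Demo with the sample data from the first post
demo_dir=$(mktemp -d)
printf 'p1|y|500\np2|n|500\n' > "$demo_dir/file1"
printf 'p1|n|500\np3|y|501\n' > "$demo_dir/file2"
compare_by_key "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

The final sort is only there to make the output order predictable, since awk does not guarantee the order of "for (k in old)".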

Honored Contributor
Hein van den Heuvel
Posts: 6,585
Registered: ‎05-19-2003
Message 12 of 15 (367 Views)

Re: Need help

Mind you, Muthu's script may be great IF the data follows strict patterns. But I do not think it works as requested. For example with a single deleted line:

C:\Temp>type file1.tmp
p1|y|500
p2|n|500
p3|y|500
p4|y|500
p5|y|500
C:\Temp>type file2.tmp
p1|n|500
p3|y|500
p4|y|500
p5|y|500
C:\Temp>perl test.pl
p2|n|500 | d b'cas p2 is deleted in 2nd file
p3|y|500 | a b'cas p2 is added in 2nd file
p3|y|500 | d b'cas p3 is deleted in 2nd file
p4|y|500 | a b'cas p3 is added in 2nd file
p4|y|500 | d b'cas p4 is deleted in 2nd file
p5|y|500 | a b'cas p4 is added in 2nd file
p5|y|500 | d b'cas p5 is deleted in 2nd file
| a b'cas p5 is added in 2nd file

Using the script I suggest:

p1|n|500 | c
p3|y|500 | =
p4|y|500 | =
p5|y|500 | =
p2|n|500 | d

Of course, my method will fail if the record order, rather than the column 1 value, is what matters.

If the "=" lines are not desirable, then change the code to make the print conditional:
print "$key|$f2{$key}| $change\n" unless $change eq "=";

or rearrange the core loop a bit. For example:

for $key (sort keys %f2) {
$change = "c";
if ($x = $f1{$key}) {
delete $f1{$key};
next if ($x eq $f2{$key});
} else {
$change = "a";
}
print "$key|$f2{$key}| $change\n"
}


Result for last input example:

C:\Temp>perl compare.pl file1.tmp file2.tmp
p1|n|500 | c
p2|n|500 | d


Ok, lunch break over, back to real work...
:-)

Hein.



Honored Contributor
Sandman!
Posts: 2,220
Registered: ‎01-13-2005
Message 13 of 15 (367 Views)

Re: Need help

Hi Suchitra,

I've pasted a shell script below that parses and filters the input files according to your criteria:

========================myparser.sh========================
#!/bin/sh

set -a

InFile1=f1
InFile2=f2
#
SortFile1=f1s
SortFile2=f2s
#
OutFile=outfile

# Zero out the outputfile
# before input processing
cat /dev/null > $OutFile

# Sort both file 1 and 2 on the first field
# using the vertical bar as field separator
sort -t"|" -k1,1 $InFile1 > $SortFile1
sort -t"|" -k1,1 $InFile2 > $SortFile2

# Filter out lines common to both files
# and print out those that have changed
join -t"|" $SortFile1 $SortFile2 | awk -F"|" '
BEGIN {OFS="|"}
{if($2!=$4)print $1,$4,$NF,"c b'\''cas "$1" changed from "$2" to "$4" in 2nd file"}' >>$OutFile

# Print out all the unmatched lines in
# sorted file 1 and flag them as deleted
join -t"|" -v1 $SortFile1 $SortFile2 | awk -F"|" '
BEGIN{OFS="|"} {print $0,"d b'\''cas "$1" is deleted in 2nd file"}' >>$OutFile

# Print out all the unmatched lines in
# sorted file 2 and flag them as added
join -t"|" -v2 $SortFile1 $SortFile2 | awk -F"|" '
BEGIN{OFS="|"} {print $0,"a b'\''cas "$1" is added in 2nd file"}' >>$OutFile
===========================================================

Copy the code into a file name of your choice, customize the environment (InFile, SortFile etc.) to your system, make it executable and run it at the command line.

hope it helps!
Honored Contributor
Sandman!
Posts: 2,220
Registered: ‎01-13-2005
Message 14 of 15 (367 Views)

Re: Need help

Matter of fact the code might be easier to understand (as well as copy 'n paste) if attached instead of pasted...so click on the attachment for the shell script.

cheers!
Occasional Contributor
suchitra
Posts: 4
Registered: ‎03-21-2006
Message 15 of 15 (367 Views)

Re: Need help

Thanks a lot guys ... I got the solution I wanted. It was a great help from all of you guys.
Thanks a lot.