Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 1 of 14 (2,172 Views)
Accepted Solution

awk difference (RE) between HP-UX and Linux

Hello,

I found the following awk difference between HP-UX and Linux (SLES 11).

I want to find an entry with an RE:

file:

;;DB_DEF#field1#field2#FS#junk#junk#junk#junk

 

HPUX OK:

awk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (+DB_TOKEN) { print $1,$2,$3,$4 }' file 

 

LINUX:

It finds all the other entries in the file, but not this exact entry.

I think my RE is OK? Is it not a standard RE?

When I change "(+DB_TOKEN)" to "DB_TOKEN", it works on Linux.

 

regards

Frequent Advisor
Nighwish
Posts: 45
Registered: ‎06-27-2011
Message 2 of 14 (2,169 Views)

Hi

 

awk syntax differs between HP-UX and Linux; there is nothing unusual in this behavior.

 

 

Regards.

Honored Contributor
Bill Hassell
Posts: 14,225
Registered: ‎05-29-2000
Message 3 of 14 (2,159 Views)

There are several versions of awk.  Many years ago, HP replaced standard awk with nawk but left the name the same. And then there's gawk -- which may be named awk too.

 

Here are some useful references. The first explains a lot of the design differences, the second is a great cheat sheet.

 

http://www.catonmat.net/blog/awk-nawk-and-gawk-cheat-sheet/
http://www.catonmat.net/download/awk.cheat.sheet.pdf

 

Advisor
BowlesCR
Posts: 20
Registered: ‎08-10-2011
Message 4 of 14 (2,155 Views)

All good explanations. I just wanted to throw in that you can download gawk from the Porting and Archive Centre (it installs to a different location than the system awk) and that should be nearly if not completely identical to the way it works in Linux
Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 5 of 14 (2,151 Views)

>awk syntax differs between HP-UX and Linux; there is nothing unusual in this behavior.

In my case, which behavior is correct: HP-UX or Linux?

Because of the thread below, I use a lot of awk REs. The last part of that thread has good examples from James and Dennis:

replace a string with "/" in a variable

 

regards

Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 6 of 14 (2,147 Views)

I think the Linux awk is gawk, so I also tested gawk on HP-UX. According to the information in this thread, the HP-UX awk is nawk.

 

Info about Version

 

LINUX: Version
awk -W version
GNU Awk 3.1.8

HPUX: Version
gawk -W version
GNU Awk 3.1.5

 

awk -W version is not supported by HP-UX awk.
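Since the version flag differs between implementations, a small probe can try the common spellings in turn. This is only a sketch: the function name awk_version is made up here, and it assumes that an awk which rejects a flag exits with a non-zero status instead of printing a version.

```shell
# Print the first line of the version banner of the given awk (default: awk).
# Tries the GNU-style flag first, then the "-W version" form used in this thread.
awk_version() {
  cmd=${1:-awk}
  out=$("$cmd" --version </dev/null 2>/dev/null) ||
    out=$("$cmd" -W version </dev/null 2>/dev/null) ||
    out="unknown (neither --version nor -W version was accepted)"
  printf '%s\n' "$out" | head -n 1
}

awk_version awk
```

An awk installed elsewhere can be checked by passing its path, for example awk_version /usr/local/bin/gawk.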

 

Tests on LINUX and HPUX with different REs (+ or .*):


LINUX: OK
DB_TOKEN=DB_DEF
awk -F'#' '$1 ~ /^.*'"${DB_TOKEN}"'$/ { print $1,$2,$3,$4 }' file

awk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (DB_TOKEN) { print $1,$2,$3,$4 }' file
awk -v DB_TOKEN="DB_DEF" -F# '$1 ~ DB_TOKEN { print $1,$2,$3,$4 }' file

LINUX: NOT OK
awk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (+DB_TOKEN) { print $1,$2,$3,$4 }' file

HPUX: OK
DB_TOKEN=DB_DEF
awk -F'#' '$1 ~ /^.*'"${DB_TOKEN}"'$/ { print $1,$2,$3,$4 }' file

/usr/local/bin/gawk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (DB_TOKEN) { print $1,$2,$3,$4 }' file
/usr/local/bin/gawk -v DB_TOKEN="DB_DEF" -F# '$1 ~ DB_TOKEN   { print $1,$2,$3,$4 }' file

HPUX: NOT OK
/usr/local/bin/gawk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (+DB_TOKEN) { print $1,$2,$3,$4 }' file

 

regards

Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 7 of 14 (2,145 Views)

I found a form that LINUX and HPUX agree on:

 

awk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (".+"DB_TOKEN) { print $1,$2,$3,$4 }' file

 

Is this OK?
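For reference, the right-hand side of ~ can be any string expression, so a dynamic ERE is assembled by plain concatenation. A minimal sketch follows; the trailing "$" anchor, the index() variant and the second sample line are additions for illustration and are not part of the original command.

```shell
# Hypothetical sample data: only the first line comes from this thread.
tmp=$(mktemp)
printf '%s\n' ';;DB_DEF#field1#field2#FS#junk' 'DB_DEF#no#leading#chars#junk' > "$tmp"

# Dynamic ERE: at least one character, then the token, anchored at the end of $1.
re_out=$(awk -v DB_TOKEN="DB_DEF" -F'#' '$1 ~ (".+" DB_TOKEN "$") { print $1,$2,$3,$4 }' "$tmp")

# Literal alternative: index() > 1 means the first occurrence of the token starts
# after at least one character; nothing in DB_TOKEN is treated as a regex operator.
idx_out=$(awk -v DB_TOKEN="DB_DEF" -F'#' 'index($1, DB_TOKEN) > 1 { print $1,$2,$3,$4 }' "$tmp")

printf '%s\n%s\n' "$re_out" "$idx_out"
rm -f "$tmp"
```

Both commands select only the ";;DB_DEF" line here. The index() form may be the safer choice if a token could ever contain characters such as "." or "+".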

 

But the ERE options of gawk, such as r{n,m} with POSIX, cannot be used with HP-UX awk :-((

 

Info:

 

[abc...]   character list, matches any of the characters abc....
[^abc...]  negated character list, matches any character except abc....
r1|r2      alternation: matches either r1 or r2.
r1r2       concatenation: matches r1, and then r2.
r+         matches one or more r's.
r*         matches zero or more r's.
r?         matches zero or one r's.
(r)        grouping: matches r.
r{n}
r{n,m}     One or two numbers inside braces denote an interval expression. If there is one
           number in the braces, the preceding regular expression r is repeated n times.
           If there are two numbers separated by a comma, r is repeated n to m times. If
           there is one number followed by a comma, then r is repeated at least n times.
           Interval expressions are only available if either --posix or --re-interval is
           specified on the command line.
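Where interval expressions are not available, r{n,m} can be spelled out with the plain operators from the list above: n mandatory copies of (r) followed by m-n optional ones. A minimal sketch; the helper name rep(), the wrapper interval_demo and the sample input are made up for illustration.

```shell
# Reads lines on stdin and reports whether each one consists of 2 to 4 digits.
interval_demo() {
  awk '
    # rep(r, n, m) expands r{n,m} into n copies of (r) plus m-n copies of (r)?
    function rep(r, n, m,    s, i) {
      s = ""
      for (i = 1; i <= n; i++) s = s "(" r ")"
      for (i = n + 1; i <= m; i++) s = s "(" r ")?"
      return s
    }
    BEGIN { re = "^" rep("[0-9]", 2, 4) "$" }   # stands in for ^[0-9]{2,4}$
    { print $0, ($0 ~ re ? "match" : "no match") }
  '
}

printf '%s\n' 7 42 2011 12345 | interval_demo
```

The generated pattern is ^([0-9])([0-9])([0-9])?([0-9])?$, so 42 and 2011 match while 7 and 12345 do not.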


Honored Contributor
Bill Hassell
Posts: 14,225
Registered: ‎05-29-2000
Message 8 of 14 (2,133 Views)

>...but the options of ERE of gawk isn't possible to use for awk HP-UX like r{n,m}  with POSIX

 

The POSIX shell is no different than ksh or bash. Braces (and parentheses and semicolons, etc.) have special meaning to the shell and must therefore be excluded from shell processing. There is no problem at all if the awk statements are in an awk script, but on the command line, you must use single quotes (apostrophes) to turn off shell processing.
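A minimal sketch of the two options described above: the same program quoted on the command line, and stored in a script file run with -f. The temporary file names and the sample data are hypothetical.

```shell
tmp=$(mktemp)
prog=$(mktemp)
printf '%s\n' ';;DB_DEF#field1#field2#FS#junk' > "$tmp"

# Option 1: single quotes keep the shell away from $1, braces, parentheses and semicolons.
cli_out=$(awk -F'#' '$1 ~ /DB_DEF$/ { n++; print $2 } END { print n }' "$tmp")

# Option 2: the program lives in a file, so it needs no shell quoting at all.
cat > "$prog" <<'EOF'
$1 ~ /DB_DEF$/ { n++; print $2 }
END { print n }
EOF
file_out=$(awk -F'#' -f "$prog" "$tmp")

printf '%s\n' "$cli_out" "$file_out"
rm -f "$tmp" "$prog"
```

Both runs print the second field of the matching line and then the match count.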

Acclaimed Contributor
Dennis Handly
Posts: 25,291
Registered: ‎03-06-2006
Message 9 of 14 (2,130 Views)

>awk -v DB_TOKEN="DB_DEF" -F# '$1 ~ (".+" DB_TOKEN) { print $1,$2,$3,$4 }' file

 

I assume this is required since you need to do string concatenation and you need that "." before the "+".

 

>but the options of ERE of gawk isn't possible to use for awk HP-UX like r{n,m}  with POSIX

 

Do you have an example where it fails?

 

 

Acclaimed Contributor
Dennis Handly
Posts: 25,291
Registered: ‎03-06-2006
Message 10 of 14 (2,112 Views)

>I think my RE is OK?  Is it not a standard RE?

 

No, this is a bogus ERE, in that it most likely won't do anything useful.

 

>>I assume this is required since you need to do string concatenation and you need that "." before the "+".

 

Yes.  This is the problem. Error recovery is different between the two versions of awk.

 

It appears HP-UX's version is broken.  The POSIX standard says that converting a string to a number should work like atof(3).  Unfortunately, it doesn't state clearly that a bogus string gives you 0.

 

You can see this if you change awk to add:

BEGIN { print "ERE:", (+DB_TOKEN) }

 

HP-UX awk seems to treat the unary "+" as a no-op and prints: DB_DEF

gawk honors the unary "+", converts the bogus string to a number and prints: 0

 

So if you want your ERE to skip one or more chars, you need: (".+" DB_TOKEN)
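The effect can be reproduced with a small test file. This is a minimal sketch that assumes an awk with the gawk behavior described above (unary "+" turns a non-numeric string into 0). Only the DB_DEF line comes from this thread; the other sample lines are made up.

```shell
# Hypothetical sample data: one line from the thread plus two invented ones.
tmp=$(mktemp)
printf '%s\n' \
  ';;DB_DEF#field1#field2#FS#junk' \
  ';;ENTRY10#a#b#c#junk' \
  ';;OTHER#x#y#z#junk' > "$tmp"

# Unary "+" forces a numeric conversion; "DB_DEF" has no numeric prefix, so the result is 0.
num=$(awk -v DB_TOKEN="DB_DEF" 'BEGIN { print (+DB_TOKEN) }')

# With 0 on the right-hand side, the test becomes "$1 ~ 0": any $1 containing the character "0".
bogus=$(awk -v DB_TOKEN="DB_DEF" -F'#' '$1 ~ (+DB_TOKEN) { print $1 }' "$tmp")

# String concatenation builds the intended ERE ".+DB_DEF".
good=$(awk -v DB_TOKEN="DB_DEF" -F'#' '$1 ~ (".+" DB_TOKEN) { print $1 }' "$tmp")

printf 'numeric value: %s\nbogus match: %s\nintended match: %s\n' "$num" "$bogus" "$good"
rm -f "$tmp"
```

With the (+DB_TOKEN) form only the ";;ENTRY10" line is selected, which would explain why the original command found other entries but not the DB_DEF one.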

Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 11 of 14 (2,102 Views)

Hello Dennis,

Thank you very much for your useful and helpful information.

>Do you have an example where it fails?
>but the options of ERE of gawk isn't possible to use for awk HP-UX like r{n,m}  with POSIX

When you try it with HP-UX awk and with Linux awk / HP-UX gawk, you get syntax errors!

>No, this is a bogus ERE, in that it most likely won't do anything useful.

So it works, but I have no guarantee that this ERE will keep working in the future, right?

>You can see this if you change awk to add:
>BEGIN { print "ERE:", (+DB_TOKEN) }

I tested it, thanks.

My last question (I spoke with some colleagues):

Is it better to use "perl" with "ERE" across different platforms?

regards

Acclaimed Contributor
Dennis Handly
Posts: 25,291
Registered: ‎03-06-2006
Message 12 of 14 (2,087 Views)

>when you try to use HP-UX awk and LINUX awk / HP-UX gawk, then you get syntax errors!

 

Do you have an example of that?


>so it works, but I have no guarantee that this ERE will work in the future, right?

 

No, it doesn't really work.  Your ERE is bogus.  A leading "+" doesn't make sense for an ERE.

 

>is better to use "perl" with "ERE" for different platforms?

 

In this case, if you have a valid ERE, it should work in both cases.

But you may have a valid point, perl may be more portable and doesn't have the HP-UX awk limitations.

And matching has more choices.

Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 13 of 14 (2,082 Views)

you will get examples next week

 

regards

Valued Contributor
support_billa
Posts: 192
Registered: ‎06-27-2011
Message 14 of 14 (2,040 Views)

hello,

Here is my solution using perl (instead of awk):

 

DB_TOKEN=DB_DEF perl -lan -F"#" -e'if($F[0] =~ /^\S+$ENV{DB_TOKEN}/) {print "$F[1] $F[4]"}' file

or

perl -F"#" -lane '$s=shift @F;if($s=~ /^\S+'"$DB_TOKEN"'/) {print "@F"}' file

 

With a little help from perl examples.

regards
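For comparison, the environment-variable approach of the first perl command has an awk counterpart in the ENVIRON array, which POSIX awk defines. A minimal sketch, assuming the local awk provides ENVIRON (gawk does; for other awks this is worth checking first). The sample file is hypothetical.

```shell
tmp=$(mktemp)
printf '%s\n' ';;DB_DEF#field1#field2#FS#junk' ';;OTHER#x#y#z#junk' > "$tmp"

# DB_TOKEN is passed through the environment, as with $ENV{DB_TOKEN} in perl.
# "[^ \t]+" approximates perl's \S+ (one or more non-blank characters).
env_out=$(DB_TOKEN=DB_DEF awk -F'#' '$1 ~ ("^[^ \t]+" ENVIRON["DB_TOKEN"]) { print $2, $5 }' "$tmp")

printf '%s\n' "$env_out"
rm -f "$tmp"
```

As in the perl version, this prints the second and fifth fields of the matching line.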
