To quote or not to quote: shell scripting ? (229 Views)
Occasional Visitor
Jim Kolberg
Posts: 4
Registered: 08-23-2004
Message 1 of 8 (229 Views)
Accepted Solution

To quote or not to quote: shell scripting ?

A function I have been using for months broke today. It worked everywhere except in my home directory. The following script, which echoes "ABC" into tr, faithfully reproduces the problem and reduces it to the simplest form. I finally discovered that the existence of a file simply named "p" causes it to fail.

Is tr pulling its input params from the contents of the directory? Is there any way to enforce "strict" quoting so I can find any code that isn't using quotes but should be?

This is the 2nd production script in about 2 weeks that failed and the fix was to put quotes around a value that hadn't been quoted.

#This works as expected
$ echo ABC | tr [:upper:] [:lower:]
abc

#existence of p causes error
$ touch p
$ echo ABC | tr [:upper:] [:lower:]
tr: String2 contains an invalid character class.

$ rm p
$ echo ABC | tr [:upper:] [:lower:]
abc

#No idea what this one is doing
$ touch o
$ echo ABC | tr [:upper:] [:lower:]
oBC

$ rm o
$ echo ABC | tr [:upper:] [:lower:]
abc

#Doesn't work, but no error.
$ touch e
$ echo ABC | tr [:upper:] [:lower:]
ABC

#Return to a known working condition
$ rm e
$ echo ABC | tr [:upper:] [:lower:]
abc

#quotes fix the problem
$ touch p
$ echo ABC | tr "[:upper:]" "[:lower:]"
abc

upper and lower are both defined in my C.src locale.
$ locale
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_MESSAGES="C"
LC_ALL=
Honored Contributor
curt larson_1
Posts: 764
Registered: 08-23-2002
Message 2 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

Is tr pulling its input params from the contents of the directory?

Well, no. Right from the tr man page: "If necessary, string1 and string2 can be quoted to avoid pattern matching by the shell." And you'll notice that all the examples on the man page use quotes.

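[:class:] is a pattern that the shell expands if it is not quoted: unquoted, it matches any single-character filename whose name is one of the characters between the brackets.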
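
To see which one-character names the unquoted [:upper:] and [:lower:] can match, here is a minimal sketch. It uses case, which applies the same pattern rules as filename expansion, so no files are needed. The helper name glob_hits and the list of test names are made up for the illustration.

```shell
# glob_hits is a hypothetical helper: it prints every name from the list
# below that the given shell pattern matches.
glob_hits() {
    pattern=$1
    hits=""
    for name in u p e r l o w a A :; do
        # $pattern is deliberately unquoted so that it acts as a pattern
        case $name in
            $pattern) hits="$hits$name " ;;
        esac
    done
    # Strip the trailing space before printing
    echo "${hits% }"
}

upper_hits=$(glob_hits '[:upper:]')
lower_hits=$(glob_hits '[:lower:]')

echo "[:upper:] matches: $upper_hits"
echo "[:lower:] matches: $lower_hits"
```

Only the letters that spell the class name (plus the colon) match; an actual uppercase A does not, which shows the shell is not treating the brackets as a character class at all.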


Is there any way to enforce "strict" quoting so I can find any code that isn't using quotes but should be?

No, there isn't. There is no substitute for making the script do what you want it to do rather than what you scripted.
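
One partial safety net, which is not strict quoting and is offered only as a sketch: POSIX shells have set -f (noglob), which turns filename expansion off altogether, so an unquoted [:upper:] is passed through unchanged. It does nothing about word splitting of unquoted variables, and a script that relies on wildcards such as *.log would need set +f around those lines. The scratch directory and variable names are made up for the demonstration, and mktemp -d is assumed to be available.

```shell
# Scratch directory containing only the troublesome file "p"
scratch=$(mktemp -d)
touch "$scratch/p"

# Default behaviour: the unquoted pattern matches the file p
with_glob=$(cd "$scratch" && echo [:upper:])

# With set -f the shell skips filename expansion, so the pattern stays literal
# (set -f is issued inside the subshell and does not leak out)
without_glob=$(cd "$scratch" && set -f && echo [:upper:])

echo "globbing on:  $with_glob"
echo "globbing off: $without_glob"

# Clean up the scratch directory
rm -rf "$scratch"
```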
Honored Contributor
Stuart Browne
Posts: 3,032
Registered: 02-20-2002
Message 3 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

My thought on the matter is simple.

Always quote.

echo "ABC" | tr "[:upper:]" "[:lower:]"

You never know when some entropy will cause you grief like this. Always make sure the strings are explicit, and quoting makes sure of this.

Without it, both the []'s and :'s can be mis-interpreted in a shell. Not to mention characters like ?, $ and *.

The other advantage comes with filenames that may or may not have spaces in them.
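
A small sketch of the spaces case, assuming the default IFS. The helper count_args and the filename are made up for the illustration.

```shell
# count_args is a hypothetical helper that reports how many arguments it got
count_args() {
    echo $#
}

file="my report.txt"   # made-up filename containing a space

# Unquoted: the shell splits the value at the space
unquoted=$(count_args $file)

# Quoted: the value is passed as a single argument
quoted=$(count_args "$file")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

A command such as rm or cp given the unquoted variable would look for two files, "my" and "report.txt", instead of the one that was meant.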

And let's face it: what are you saving by not quoting? In a large routine, you might use 1000 quote symbols. Woo. A k. How many terabytes is your array?

It's something I've never understood about some people's attitude to scripting. Just 'cause they don't *have* to put a quote, they leave it out, even though there are advantages to putting it in.

Now, as to why 'p' causes issues and 'e' doesn't (for the record, 'u', 'l', 'o' and 'w' cause issues as well), I'm picking it's because these 5 letters are in only one of the two sets of letters, and not both.

As for why the shell is using filenames, I'm unsure, but the 'GNU' tr does the same thing, on a completely different OS, platform and shell. ('tr' peculiarity?).
One long-haired git at your service...
Honored Contributor
Muthukumar_5
Posts: 4,030
Registered: 06-09-2004
Message 4 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

The environment variable settings for the touch and tr commands are the same. They use C for all locale categories unless separate settings are given.

We have to quote the class / string arguments to tr. It takes a little effort, but otherwise errors like this appear.

All examples in the man pages are enclosed in "". We cannot expect the same behaviour without them.

But the different behaviour for p and e is puzzling. The development team has to look into this one.

Note: Linux gives an error at once if you don't use "" around the class / string arguments.

If you want to look into it further, collect full details on the behaviour of tr with the files that touch creates.

Test report script
--------------------
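#!/usr/bin/ksh
# Run in an empty directory. Each one-letter file is tested on its own.
ARR="a b c d e f g h i j k l m n o p q r s t u v w x y z"
set -A TEST $ARR

index=0

while [[ $index -lt ${#TEST[*]} ]]
do

touch "${TEST[$index]}"
# Check locale settings
# locale

# Show which file is present, then run tr with its arguments
# deliberately unquoted to reproduce the problem
echo "file ${TEST[$index]}:"
echo ABC | tr [:upper:] [:lower:]
# Check locale settings
# locale

# Remove the file so it does not affect the next letter
rm "${TEST[$index]}"

let index=index+1
done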

Modify it yourself to automate the test and find the problem (uncomment locale to see the locale settings).

Regards
Muthu
Easy to suggest when don't know about the problem!
Occasional Visitor
Jim Kolberg
Posts: 4
Registered: 08-23-2004
Message 5 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

Thanks for all the replies. No need to investigate further. I guess I just developed a lazy quoting habit and never got bitten by it before.
Honored Contributor
john korterman
Posts: 1,117
Registered: 11-15-2000
Message 6 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

Hi,

the problem is of course, as stated, the missing quotes. Without quotes the shell performs filename expansion in the current directory, expanding whatever it can. You can check what happens this way:


cd to an empty directory and check that your first command works as expected.


$ echo ABC | tr [:upper:] [:lower:]
abc



Now set the mode for showing which commands are actually executed:

$ set -x
$


The actual execution of your first command looks like this:

$ echo ABC | tr [:upper:] [:lower:]
+ echo ABC
+ tr [:upper:] [:lower:]
abc
$

Not a big surprise. When unprotected by quotes, the character class definitions are treated by the shell as filename patterns, but it takes a file with a matching one-character name in the current directory to trigger the expansion. With no match, the pattern is passed on unchanged.


$ touch e
+ touch e


$ echo ABC | tr [:upper:] [:lower:]
+ echo ABC
+ tr e e
ABC
$

Both [:upper:] and [:lower:] are expanded because the letter e is common to both *filenames*. You will see the same behaviour for a file named r.


$ rm e
+ rm e
$


A file named p affects only [:upper:], because only that pattern contains a p, e.g.:

$ touch p
+ touch p
$


$ echo ABC | tr [:upper:] [:lower:]
+ echo ABC
+ tr p [:lower:]
tr: String2 contains an invalid character class.
$

Not always easy to foresee how the shell behaves!
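
The same check can be done without tr or set -x. The sketch below prints the arguments tr would receive for a few one-letter files, each created alone in a scratch directory. The helper name show_args is made up, mktemp -d is assumed to be available, and the shell is assumed to leave unmatched patterns unchanged (the default).

```shell
scratch=$(mktemp -d)

# show_args is a hypothetical helper: it creates one file named $1 in an
# otherwise empty directory and prints the arguments tr would receive.
show_args() {
    (
        cd "$scratch" || exit 1
        touch "$1"
        # Unquoted on purpose: the shell expands these before the command runs
        echo [:upper:] [:lower:]
        rm "$1"
    )
}

args_e=$(show_args e)   # e appears in both "upper" and "lower"
args_p=$(show_args p)   # p appears only in "upper"
args_o=$(show_args o)   # o appears only in "lower"
args_x=$(show_args x)   # x appears in neither

echo "file e: tr $args_e"
echo "file p: tr $args_p"
echo "file o: tr $args_o"
echo "file x: tr $args_x"

# Clean up the scratch directory
rm -rf "$scratch"
```

The o case shows tr being handed the single character o as string2, which is consistent with the oBC output in the first message.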


regards,

John K.



it would be nice if you always got a second chance
Occasional Visitor
Jim Kolberg
Posts: 4
Registered: 08-23-2004
Message 7 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

Now I understand why. Thanks for the excellent example. Smacking my forehead for not thinking of using set -x.

10 points for the quote: "Not always easy to foresee how the shell behaves!" Spoken like a Jedi master!
Occasional Visitor
Jim Kolberg
Posts: 4
Registered: 08-23-2004
Message 8 of 8 (229 Views)

Re: To quote or not to quote: shell scripting ?

The discussion above has adequately covered the topic.