Subdomains With Hyphens

I’ve been running a lightweight web crawler for a while just to look for interesting things. Recently I’ve noticed several web sites with hyphens at the beginning or end (or both) of their subdomain names/labels. The first time I saw it, I chalked it up to a link error, but after noticing it a few times it warranted investigation.

 

Here’s an example with both: http://-brainfreeze-.blogspot.com/

 

Obviously Blogspot allows that, but I got to wondering whether it’s technically legitimate, and if so, what software accepts it or barfs on it? So let’s look at the mess of RFCs that I found… RFC-608, RFC-952, RFC-1033, RFC-1034, RFC-1123, RFC-1738 and RFC-1912 all have their hands in the host/domain name cookie jar (did I miss any?).

 

RFC-1912, which seems to be the latest, says (emphasis mine):


Allowable characters in a label for a host name are only ASCII letters, digits, and the `-' character. Labels may not be all numbers, but may have a leading digit (e.g., 3com.com). Labels must end and begin only with a letter or digit.
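
As a rough illustration, the rule quoted above can be written as a small Python check. This is only a sketch of that one paragraph of RFC-1912: the names `is_valid_label` and `invalid_labels` are made up for this example, and it deliberately ignores other constraints (such as label length limits) that the quote does not mention.

```python
import re
from typing import List

# Letters, digits and '-' only; the first and last character must be a
# letter or digit (so a leading or trailing hyphen is rejected).
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_valid_label(label: str) -> bool:
    """Check one label against the RFC-1912 sentence quoted above."""
    if _LABEL_RE.fullmatch(label) is None:
        return False
    # "Labels may not be all numbers, but may have a leading digit."
    return not label.isdigit()


def invalid_labels(hostname: str) -> List[str]:
    """Return the labels of a hostname that fail the check."""
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if not is_valid_label(label)]


if __name__ == "__main__":
    for name in ("-brainfreeze-.blogspot.com", "3com.com"):
        print(name, invalid_labels(name))
```

Under this reading, the only offending label in the example host is the first one; `blogspot` and `com` pass.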

 

So… my interpretation is that “-brainfreeze-” and others like it violate this RFC. But we all know how this really works: it only matters what the browsers support. So what do they do?

 

Not surprisingly, IE, Firefox (Mac & Windows), Safari (Mac) and Opera (Mac) all open the web site without a problem. Safari on the iPhone, however, fails with the error “Safari can't open the page because it can't find the server.”


 


So that leads us outside the browser…

 

Some quick mail tests show that Outlook, Thunderbird and Gmail will all accept them in a “To” field. Apache seems to have no issues with them, either.

 

Plesk, a popular web server management package, disallows them, saying the subdomain field has an “improper value.”

 

For programming languages, Perl’s LWP module won’t load them (“bad hostname”), and Ruby’s Hpricot library won’t either (“the scheme http does not accept registry part”). PHP with include/require fails (“php_network_getaddresses: getaddrinfo failed: Name or service not known”). Python’s urllib2 also spits up on it (“IOError: (-2, 'Name or service not known')”).


 


However, these errors may not come from any particular language or program; they may instead stem from the underlying name resolution libraries. It’s important to note that nslookup on Windows, and nslookup/dig on OS X and CentOS 5.2, don’t have any problems with these names (on the same hosts those languages were tried on). I’ll try to look at the resolution libraries for some of those soon and post a follow-up.
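
One way to tell the two kinds of failure apart is to run the URL parsing step and the name resolution step separately and see which one complains. The sketch below does that with Python 3's standard library (`urllib.parse` and `socket`), not the urllib2 module tested above. The function name `diagnose` and its `resolver` parameter are invented for this example, and the result will depend on the resolver library of the machine it runs on.

```python
import socket
from urllib.parse import urlsplit


def diagnose(url, resolver=socket.getaddrinfo):
    """Report whether a URL fails at parsing or at name resolution.

    Returns a (stage, detail) tuple where stage is "parse", "resolve"
    or "ok". The resolver argument exists so another lookup function
    can be swapped in; by default the system resolver is used.
    """
    try:
        host = urlsplit(url).hostname
    except ValueError as exc:
        return ("parse", str(exc))
    if not host:
        return ("parse", "no hostname found in URL")
    try:
        resolver(host, 80)
    except (socket.gaierror, UnicodeError) as exc:
        # gaierror: the system resolver refused or could not find the name.
        # UnicodeError: Python rejected the name before the lookup started.
        return ("resolve", str(exc))
    return ("ok", host)


if __name__ == "__main__":
    print(diagnose("http://-brainfreeze-.blogspot.com/"))
```

If the stage comes back as "resolve" while dig or nslookup on the same host succeeds, that points at the resolver library rather than at the language's URL handling.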

For web sites, Archive.org's Wayback Machine gives a vague error if you try to look up our friend "-brainfreeze-" (it differs from the error for a nonexistent host), but Google's cache of it works. TinyURL doesn't have any issues either.

So is this a security issue? Probably not directly, but it may be a problem when running tools against names that fit this pattern. What happens if your proxy or filter fails to parse the name? Which tools rely on a “broken” ping or name lookup before they do something? Which tools use urllib2 or LWP under the hood? And which hostname parsing routines will simply declare the name invalid and return an error? Any one of those “minor” issues could cause a compliance failure, if not a real security issue.
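
To make the filter scenario concrete, here is a minimal sketch of a domain blocklist check that does not depend on the hostname being RFC-conformant. It matches on the raw dot-separated labels and treats anything it cannot parse as blocked, instead of letting it through. The function name `is_blocked` and the blocklist format (a set of lowercase domain names) are assumptions made for this example, not a description of how any particular proxy or filter works.

```python
from urllib.parse import urlsplit


def is_blocked(url, blocked_domains):
    """Return True if the URL's host is a blocked domain or a subdomain of one.

    blocked_domains is a set of lowercase domain names. Labels are
    compared as-is, with no validity check, so a host such as
    "-brainfreeze-.blogspot.com" is still matched against "blogspot.com".
    """
    try:
        host = (urlsplit(url).hostname or "").rstrip(".")
    except ValueError:
        return True  # fail closed on URLs that cannot be parsed
    if not host:
        return True  # fail closed when no host can be extracted
    labels = host.split(".")
    # Test the full host and every parent domain against the blocklist.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked_domains:
            return True
    return False


if __name__ == "__main__":
    print(is_blocked("http://-brainfreeze-.blogspot.com/", {"blogspot.com"}))
```

The design choice here is to fail closed: a name the filter cannot make sense of is blocked rather than skipped, which avoids the bypass case at the cost of possible false positives.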

 

So to recap, what failed in testing?


- Perl LWP
- Ruby Hpricot
- Python urllib2
- PHP include/require
- Plesk
- Safari (iPhone)
- The Wayback Machine

 

Find others? Post in the comments, and give http://-brainfreeze-.blogspot.com/ some love for being a good test site.
