A few days ago a user reported an issue with the comment form on my blog. Apparently he has an email address using one of the new TLDs (top-level domains) that are cropping up, specifically "directory". I decided to do some testing to see how well ColdFusion supports these new TLDs.
First off, it was a bit difficult to find out what has been added recently, but I did find a Wikipedia page with everything listed: ICANN-era generic top-level domains. I knew new TLDs were coming, but my god, I had no idea how many and how... weird some of them were. I mean, I guess it is kind of cool that "blue" is a TLD. But... ok, whatever.
I decided to write a quick test script that would use isValid against some of these new TLDs. I wasn't going to try to type them all, just a sample. Here is the script.
<cfscript>
// A sample of old and new TLDs to run through isValid
tlds = "com,edu,directory,guru,gift,jobs,international,museum,name,sexy,social,tel,travel,ceo,cheap";
for (i = 1; i <= listLen(tlds); i++) {
    emailToTest = "foo@foo.#listGetAt(tlds, i)#";
    writeOutput("Email: #emailToTest# isValid? #isValid('email', emailToTest)#<br>");
}
</cfscript>
As you can see, it's just a simple list of TLDs. I iterate over them, create a test email address, and run isValid against it. Here are the results:
Email: foo@foo.com isValid? YES
Email: foo@foo.edu isValid? YES
Email: foo@foo.directory isValid? NO
Email: foo@foo.guru isValid? YES
Email: foo@foo.gift isValid? YES
Email: foo@foo.jobs isValid? YES
Email: foo@foo.international isValid? NO
Email: foo@foo.museum isValid? YES
Email: foo@foo.name isValid? YES
Email: foo@foo.sexy isValid? YES
Email: foo@foo.social isValid? YES
Email: foo@foo.tel isValid? YES
Email: foo@foo.travel isValid? YES
Email: foo@foo.ceo isValid? YES
Email: foo@foo.cheap isValid? YES
So most of them passed, but two, directory and international, did not. I couldn't figure out why until I noticed that both were a bit long. Then it clicked: ColdFusion was simply checking the length of the TLD. As a test, I tried "abcdefg" (seven characters) as a TLD and it worked. As soon as I tried "abcdefgh" (eight characters), it failed. So isValid appears to reject any TLD longer than seven characters. I'm going to report this as a bug.
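If you want to check where the cutoff falls on your own server, here is a rough sketch of the same test done in a loop. The made-up TLDs are just the letter "a" repeated, and the 2 to 12 range is an arbitrary choice on my part. I'd expect the result to depend on your ColdFusion version, so treat it as a probe, not a spec.
<cfscript>
// Build fake TLDs of increasing length and see where isValid starts saying no.
// The "aaa..." TLDs are placeholders, not real domains.
for (n = 2; n <= 12; n++) {
    tld = repeatString("a", n);
    emailToTest = "foo@foo.#tld#";
    writeOutput("TLD length #n#: #emailToTest# isValid? #isValid('email', emailToTest)#<br>");
}
</cfscript>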
As it stands, this blog uses a UDF to check for email validity. (The code began back in ColdFusion 6, so I've got a lot of skeletons in my code closet.) The UDF relies on a regular expression whose TLD check amounts to "either 2-3 characters or in this hard-coded list." Here is the code now:
function isEmail(str) {
    // Pattern match first, then cap the part before the "@" at 64 characters and the part after it at 255.
    return (REFindNoCase("^['_a-z0-9-]+(\.['_a-z0-9-]+)*(\+['_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.(([a-z]{2,3})|(aero|asia|biz|cat|coop|info|museum|name|jobs|post|pro|tel|travel|mobi))$", arguments.str)
        AND len(listGetAt(arguments.str, 1, "@")) LTE 64
        AND len(listGetAt(arguments.str, 2, "@")) LTE 255) IS 1;
}
My thinking is that I'll just modify that first clause in the TLD section to allow for 2 to 30 characters, with 30 being pretty arbitrary. I'm open to suggestions!
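Here is roughly what I have in mind, as a sketch and not the final version. The name isEmailRelaxed is made up for this example so it can sit next to the existing UDF. Once the first clause allows 2 to 30 letters, every entry in the hard-coded list already matches it, so I dropped the list.
<cfscript>
// Hypothetical variant of isEmail: accepts any alphabetic TLD of 2 to 30 characters.
// The upper bound of 30 is arbitrary.
function isEmailRelaxed(str) {
    var pattern = "^['_a-z0-9-]+(\.['_a-z0-9-]+)*(\+['_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,30}$";
    // Check the pattern first so the listGetAt calls below always see exactly one "@".
    if (NOT REFindNoCase(pattern, arguments.str)) return false;
    // Same length limits as before: 64 characters before the "@", 255 after it.
    return len(listGetAt(arguments.str, 1, "@")) LTE 64
        AND len(listGetAt(arguments.str, 2, "@")) LTE 255;
}

// The two addresses that isValid rejected in the test above
writeOutput("directory: #isEmailRelaxed('foo@foo.directory')#<br>");
writeOutput("international: #isEmailRelaxed('foo@foo.international')#<br>");
</cfscript>
If 30 feels too arbitrary, one alternative is 63, since DNS caps any single label, TLDs included, at 63 characters. Also note that [a-z] only covers plain alphabetic TLDs. Internationalized TLDs in their encoded "xn--" form contain hyphens and digits, so they would still fail this check.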