A reader asked me how they could use regex to find all the link labels in a string. Not the links, but the labels for the links. It is relatively easy to grab all the matches for a regex in ColdFusion 8. Consider the following code block:
<cfsavecontent variable="s">
This is some text. It is true that <a href="http://www.cnn.com">Harry Potter</a> is a good
magician, but the real <a href="http://www.raymondcamden.com">question</a> is how he would stand up
against Godzilla. That is what I want to <a href="http://www.adobe.com">see</a> - a Harry Potter vs Godzilla
grudge match. Harry has his wand, Godzilla has his <a href="http://www.cfsilence.com">breath</a>, it would
be <i>so</i> cool.
</cfsavecontent>
<cfset matches = reMatch("<[aA].*?>.*?</[aA]>",s)>
<cfdump var="#matches#">
I create a string with a few links in it. I then use the new reMatch function to grab all the matches. My regex says: find all HTML links. It isn't perfect (for example, it won't match a closing A tag that has an extra space in it), but you get the picture. This results in an array containing all four links, each with its full anchor tag.
But you will notice that the HTML tags are still there, wrapped around each label. How can we get rid of them? I simply looped over the array and did a second pass that strips out anything that looks like a tag:
<cfset links = arrayNew(1)>
<cfloop index="a" array="#matches#">
<cfset arrayAppend(links, rereplace(a, "<.*?>","","all"))>
</cfloop>
<cfdump var="#links#">
This gives you an array of just the labels: Harry Potter, question, see, and breath.
p.s. Running on ColdFusion 7? Try the reFindAll UDF as a replacement for reMatch.
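p.p.s. If the caveat about the closing tag bothers you, here is a rough sketch of how the two passes could be wrapped up with a slightly stricter pattern. The function name getLinkLabels is just something I made up for illustration, and the pattern is one possible tightening, not the only one: it requires whitespace (or nothing) after the "a" so tags like abbr are skipped, and it allows whitespace before the closing bracket. It assumes ColdFusion 8, since it relies on reMatchNoCase and the array form of cfloop.

```cfml
<!--- Hypothetical helper (not part of the original example). --->
<cffunction name="getLinkLabels" returntype="array" output="false">
	<cfargument name="text" type="string" required="true">
	<cfset var labels = arrayNew(1)>
	<cfset var found = arrayNew(1)>
	<cfset var link = "">

	<!--- First pass: grab each complete link.
	      (\s[^>]*)? means the tag name must be exactly "a",
	      and \s* lets a closing tag like "</a >" match too. --->
	<cfset found = reMatchNoCase("<a(\s[^>]*)?>.*?</a\s*>", arguments.text)>

	<!--- Second pass: strip every tag from each match, keeping only the label. --->
	<cfloop index="link" array="#found#">
		<cfset arrayAppend(labels, trim(reReplace(link, "<[^>]*>", "", "all")))>
	</cfloop>

	<cfreturn labels>
</cffunction>

<!--- Usage with the string s built earlier. --->
<cfset links = getLinkLabels(s)>
<cfdump var="#links#">
```

Because the second pass removes every tag inside the match, a label that contains its own markup (bold text inside a link, say) comes back as plain text. It is still regex against HTML, so treat it as good enough for simple strings rather than a real parser.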