CFTHREAD - When to join?

May 18, 2009 coldfusion


Earlier today I wrote a quick blog entry (CFTHREAD, Names, and Commas) about a bug I had with cfthread names. I mentioned that commas were not allowed in the thread name and that this was probably because the JOIN action accepts a comma-delimited list of thread names to join. Tony asked me why someone would use the JOIN action at all.

First off, what happens when you create a thread and don't do anything else?


<cfthread name="find more cowbell">
	<cfset sleep(10000)>
	<cflog file="tdemo" text="All done, baby.">
</cfthread>

<cfdump var="#cfthread#">

In this demo I create a thread. It sleeps for 10 seconds and then writes to a log file. Outside of the thread I dump the cfthread scope. (Everyone knows that exists, right? It gives you metadata about threads created during the request.) Running this, the dump shows that even though the page has ended, the thread is still running.

If you check your log files a bit later, you will see that the thread eventually did end and write to the log. In essence, this process is a "Fire and Forget" thread. You start the slow process and don't need to worry about waiting for it to end. A real world example could be starting a slow running stored procedure that performs a database backup.

But what about cases where you do need to wait for a result? Imagine an RSS aggregator. You want to hit N RSS feeds and take each result and add it to a large query. (Oh, I've got a CFC for that if you want something like that.) In this case, you want each thread to handle doing the slow process, but you want to wait for them all to finish before proceeding.

Consider this modified example:


<cfset threadlist = "">
<cfloop index="x" from="1" to="10">
	<cfset name = "find more cowbell #x#">
	<cfset threadlist = listAppend(threadlist, name)>
	<cfthread name="#name#">
		<cfset sleep(10000)>
		<cflog file="tdemo" text="All done with #thread.name#, baby.">
	</cfthread>
</cfloop>

<cfthread action="join" name="#threadlist#" />

<cfdump var="#cfthread#">

In this example, I've added a loop so that I can create 10 threads. Notice that I store each name in a list. This lets me run the join action at the end. Now if you run the file you will notice it takes about 10 seconds to run. That is the JOIN line waiting for the threads to end. Because they run in parallel, I don't have to wait 100 seconds, but just 10. Obviously most real world applications won't have a nice precise timeframe like that. If you look at the dump now, you can see that all of the threads completed.
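
Since real threads rarely finish on a neat schedule, it may help to cap how long the join waits. Here is a minimal sketch, assuming the timeout attribute of the join action (in milliseconds) and the status key in each thread's metadata. The thread names and the 5 second cap are just for illustration.

```cfml
<cfset threadlist = "">
<cfloop index="x" from="1" to="3">
	<cfset name = "slow cowbell #x#">
	<cfset threadlist = listAppend(threadlist, name)>
	<cfthread name="#name#">
		<cfset sleep(10000)>
	</cfthread>
</cfloop>

<!--- Wait at most 5 seconds for the listed threads, then carry on with the page --->
<cfthread action="join" name="#threadlist#" timeout="5000" />

<!--- Report where each thread stands once the join returns --->
<cfloop list="#threadlist#" index="name">
	<cfoutput>#name#: #cfthread[name].status#<br></cfoutput>
</cfloop>
```

With a 10 second sleep and a 5 second timeout, I would expect the threads to still be running when the page prints their status. They carry on in the background, just like the fire and forget case.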

So the short answer is simply - it depends. If you need to work with the result in the same request, then use the join. If not, don't worry about it.
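
To tie this back to the RSS aggregator idea, here is a rough sketch of "join, then use the results." It is not my CFC, just an illustration: the feed URLs are made up, and I only count entries instead of building one big query. Each thread gets its URL passed in as an attribute and copies its result into the thread scope so the page can read it after the join.

```cfml
<!--- Hypothetical feed URLs, for illustration only --->
<cfset feeds = ["http://example.com/a.rss", "http://example.com/b.rss"]>
<cfset threadlist = "">

<cfloop index="i" from="1" to="#arrayLen(feeds)#">
	<cfset name = "feed_#i#">
	<cfset threadlist = listAppend(threadlist, name)>
	<!--- feedUrl is a custom attribute, so each thread gets its own copy of the URL --->
	<cfthread name="#name#" feedUrl="#feeds[i]#">
		<cffeed action="read" source="#attributes.feedUrl#" query="entries">
		<!--- Values placed in the thread scope are readable by the page after the join --->
		<cfset thread.entries = entries>
	</cfthread>
</cfloop>

<!--- Block until every feed thread has ended --->
<cfthread action="join" name="#threadlist#" />

<cfset total = 0>
<cfloop list="#threadlist#" index="name">
	<cfif cfthread[name].status eq "COMPLETED">
		<cfset total = total + cfthread[name].entries.recordCount>
	<cfelse>
		<!--- A thread that hit an error will not have a usable result --->
		<cflog file="tdemo" text="#name# ended with status #cfthread[name].status#">
	</cfif>
</cfloop>

<cfoutput>Fetched #total# entries.</cfoutput>
```

Checking the status before touching the result matters because a thread that threw an error never gets as far as setting thread.entries.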