For a while now I've made use of a service called Twilert. The site has one simple purpose: it lets you create Twitter search profiles and emails you a report daily (or weekly, etc.). I thought it might be interesting to look at how difficult this would be to build in ColdFusion. Luckily, Twitter provides an API that is both simple to use and very powerful. Here's what I came up with - and hopefully this can be useful to others.
First - let me define what I want to build. Like the Twilert service, I'll start with a set of search terms. I'll perform my search daily via a scheduled task that runs right past midnight and then delivers the report to me via email. The Twitter API is very nicely documented. In particular, the Search API is the one we care about. Also of note are the rate limits Twitter applies. While my code won't hit that limit, it is something to keep in mind. I'd suggest spending a few minutes scanning all of the previous links to get a feel for the Twitter API and what it supports. Now that you've done that (ok, be honest, if you are like me, you probably decided to skip it and read it later), let's start to build out our report generator.
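As an aside on the scheduling piece: the task can be set up in the ColdFusion Administrator, or in code with cfschedule. Here is a minimal sketch, assuming the report template lives at a made-up URL; the task name is hypothetical too.

```cfml
<!--- Sketch only: the task name and URL below are hypothetical placeholders --->
<cfschedule action="update"
    task="twitterDailyReport"
    operation="HTTPRequest"
    url="http://www.example.com/twitterreport.cfm"
    startDate="#dateFormat(now(), 'mm/dd/yyyy')#"
    startTime="12:05 AM"
    interval="daily">
```

Running it a few minutes past midnight matches the "right past midnight" plan above.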
First, the search term. This could be dynamic, perhaps based on the URL, which would then make it easy to set up a few scheduled tasks, each with different values. For now though I just hard coded it:
<!--- Search terms, max 140, minus date portion --->
<cfset search = "coldfusion">
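If you did want the term to come from the URL instead, it might look something like this. This is just a sketch, and the parameter name q is my own choice, not anything the report requires.

```cfml
<!--- Sketch: "q" is a hypothetical URL parameter name; fall back to the hard coded term --->
<cfparam name="url.q" default="coldfusion">
<cfset search = trim(url.q)>
```

Each scheduled task could then call the same template with a different q value.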
Twitter supports basic AND/OR style searches as well, but I'll keep it simple and use just one word. Now, I mentioned the rate limits before. Another thing to note is that a search returns at most 100 results at a time. Twitter supports a page attribute, but limits you to 15 pages. That's 1500 results, which seems a bit much, especially for an email. I created a variable to represent the total number of network requests, or pages, of data to get:
<!--- Max number of HTTP requests --->
<cfset maxRequests = 10>
For the most part, this is pretty arbitrary. If I got an email with 1000 results in it I doubt I'd read past the first twenty or so. Obviously this is something you can change to your liking, within the limits of Twitter's API.
<!--- current page --->
<cfset page = 1>
<!--- max results per page is 100 --->
<cfset max = 100>
The page variable just tracks the current page, and max will be sent to Twitter to request the maximum number of results per page.
<!--- Loop until we run out of results or hit maxRequests. Use a simple boolean to check both --->
<cfset done = false>
<!--- A flag to see if something went wrong. --->
<cfset errorFlag = false>
<!--- A flag to determine if we maxed out our search --->
<cfset maxFlag = false>
These three variables are just flags. I'll be using the done variable in a loop coming up. The errorFlag will notice if something goes wrong with one of the HTTP calls. The maxFlag will be used if we hit the maximum number of requests.
<!--- append yesterday's date to the search url --->
<cfset yesterday = dateAdd("d", -1, now())>
<cfset searchURL = search & " since:#dateFormat(yesterday,'yyyy-mm-dd')#">
<cfset searchURL = urlEncodedFormat(searchURL)>
Next up we add the date filter to our search terms. Remember I'm running this every day so I want to limit the results to entries from yesterday. This is done with the since operator. Twitter also supports an until operator, but as I plan on running this report right past midnight, it won't matter. (You can see a good report of all the operators here.)
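If the report were run at some other time of day, the until operator could be used to close off the date range. Here is a rough sketch; the variable names boundedSearch and boundedSearchURL are mine, and I'd check the operator documentation for exactly how Twitter treats the until date boundary before relying on it.

```cfml
<!--- Sketch: add an until: operator alongside since: (names here are hypothetical) --->
<cfset today = now()>
<cfset boundedSearch = search & " since:#dateFormat(yesterday,'yyyy-mm-dd')# until:#dateFormat(today,'yyyy-mm-dd')#">
<!--- Encode the whole query, operators included, before putting it in the URL --->
<cfset boundedSearchURL = urlEncodedFormat(boundedSearch)>
```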
<cfset results = []>
The last bit of code before we actually begin to search is to create the array that will store our results. Ok - so everything so far was setup - now let's look at the actual search:
<cfloop condition="not done">
    <cfhttp url="http://search.twitter.com/search.json?page=#page#&rpp=#max#&q=#searchURL#" result="result">
    <cfif result.responseheader.status_code is "200">
        <cfset content = result.fileContent.toString()>
        <cfset data = deserializeJSON(content)>
        <cfloop index="item" array="#data.results#">
            <cfset arrayAppend(results, item)>
        </cfloop>
        <cfif structKeyExists(data, "next_page")>
            <cfset page++>
            <cfif page gt maxRequests>
                <cfset maxFlag = true>
                <cfset done = true>
            </cfif>
        <cfelse>
            <cfset done = true>
        </cfif>
    <cfelse>
        <cfset errorFlag = true>
        <cfset done = true>
    </cfif>
</cfloop>
Ok, let me describe this line by line. The loop will continue until the done variable is true. In each iteration I use cfhttp to hit Twitter. Notice that I ask for JSON back, pass in both page and max (as the page and rpp URL parameters), and pass in my search query.
If the result status is 200, it should be good. I get the content and deserialize the JSON. I loop through each result and simply append it to the global results array. If the result JSON contains a next_page value, then more data exists. I do a check first though to see that I've not made too many requests. Lastly, I've got an ELSE block for times when the status wasn't 200. I could add additional logging here, but for now I just use the simple error flag.
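As for that additional logging, a minimal sketch would be a cflog call placed inside the ELSE block next to the error flag. The log file name here is made up; statusCode is the status string cfhttp returns in its result struct.

```cfml
<!--- Sketch: belongs inside the CFELSE branch; "twitterreport" is a hypothetical log file name --->
<cflog file="twitterreport" type="error"
    text="Twitter search failed on page #page#: #result.statusCode#">
```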
Now that we have results, let's begin the display portion:
<!--- prepare result --->
<cfsavecontent variable="report">
<cfoutput>
<style>
h2, p, .twit_date { font-family: Verdana, Geneva, Arial, Helvetica, sans-serif; }
.twit_date { font-size: 10px; }
.twit_odd {
    padding: 10px;
}
.twit_even {
    padding: 10px;
    background-color: ##f0f0f0;
}
</style>
I've begun my display with a cfsavecontent. The reason for this is that I considered generating a PDF report as well. I didn't end up doing it, but since I'll have my report in a nice variable, I'll be able to do just about anything with it. I then put on my designer hat (it has stars on it) and whipped up some simple CSS I'll use later. Please feel free to send suggestions on nicer CSS.
<h2>Twitter Search Results</h2>
<p>
The following report was generated for the search term(s): #search#.<br/>
It contains matches found from <b>#dateFormat(yesterday,"mmmm dd, yyyy")#</b> to now.<br/>
A total of <b>#arrayLen(results)#</b> result(s) were found.<br/>
<cfif maxFlag><b>Note: The maximum number of results was reached. More may be available.</b><br/></cfif>
<cfif errorFlag><b>Note: An error occurred during the report.</b><br/></cfif>
</p>
Next up is a simple header. I report on the search term, the date, number of results, and on my flags.
<cfloop index="x" from="1" to="#arrayLen(results)#">
    <cfset twit = results[x]>
    <cfif x mod 2 is 0>
        <cfset class = "twit_even">
    <cfelse>
        <cfset class = "twit_odd">
    </cfif>
    <!--- massage date a bit to remove +XXXX --->
    <cfset twitdate = twit.created_at>
    <cfset twitdate = listDeleteAt(twitdate, listLen(twitdate, " "), " ")>
    <p class="#class#">
    <img src="#twit.profile_image_url#" align="left">
    <a href="http://twitter.com/#twit.from_user#">#twit.from_user#</a> #twit.text#<br/>
    <span class="twit_date">#twitdate#</span>
    <br clear="left">
    </p>
</cfloop>
Now I loop over each Twit. Twitter reports a variety of fields for each result. I decided to only care about the time, the user (and his or her profile image), and the text. Please keep in mind though that there is even more information in the results. This is what I decided was important. The display is rather simple. Profile picture to the left, name and text on top, and the formatted date below it. (FYI: Notice the x mod 2 if clause there? I actually had the ColdFusion 9 ternary clause first and it was a lot slimmer. I know I could switch it to IIF but I hate that function.)
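For the curious, the ternary version would be something along these lines. It requires ColdFusion 9 or later, which is why the longer cfif form is used above.

```cfml
<!--- ColdFusion 9+ only: one-line replacement for the cfif/cfelse odd/even check --->
<cfset class = (x mod 2 is 0) ? "twit_even" : "twit_odd">
```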
</cfoutput>
</cfsavecontent>
<cfoutput>#report#</cfoutput>
The final bits simply close up our tags and then output to screen. So I did lie a bit - I don't actually email the report, but as you can imagine, that would take about two seconds. I'd just wrap the report result in cfmail tags. I've got a few ideas on how to make this report even slicker. That will be in the next entry. So - is this useful? I could imagine this being a great way for a business to automate monitoring of their name and products.
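For completeness, the email step could be sketched like so. The addresses are placeholders, and type="html" is there so the CSS and markup in the report are rendered rather than shown as text.

```cfml
<!--- Sketch: the to/from addresses are hypothetical placeholders --->
<cfmail to="me@example.com" from="reports@example.com"
    subject="Twitter search report: #search#" type="html">
#report#
</cfmail>
```

This would replace (or follow) the final cfoutput of the report variable.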
You can download the full bits below.