Way back in January of this year I blogged about a little experiment I did parsing local traffic-related incidents in my home town. A local police department had posted their data in HTML and I used a combination of YQL and ColdFusion to parse it. This was done via a simple scheduled task. A second task turned street addresses into longitude and latitude pairs. Finally I made use of cfmap to display the results. All in all, I think it was pretty cool. The results matched with what I would have assumed were the busiest streets. But yesterday I discovered something cool. Apparently I left the process on and it ran for six months.
For all that time ColdFusion was hitting and scraping the data, cleaning it and importing it into my database. I did a quick SQL to count the number of rows and found I had a bit over 8.5K. I whipped up a quick report template and I thought I'd share the results. Again - this is traffic incident data for Lafayette, LA. A larger city would probably have more. Anyway, first I began with a simple table of general stats:
Next I displayed how many incidents happened per day. No big surprises here - Sunday is the safest day to drive.
The per hour chart is also as you expect - a sharp rise as soon as the afternoon rush hour appears.
Now for the big pie chart. This one displays the amount of incidents per type. My favorite is "Hazardous Situation":
Next I reported on unique street addresses. I didn't imagine I'd see a lot on any particular address (street, yes, actual address, no), but I was pretty surprised:
What's awesome is if you map that top address you see it is right between a school and an insurance company:
Awesome. I then wrote a quick SQL statement that tried to get just the street. It isn't perfect, but this is what I used:
select mid(address, locate(' ', address)+1, length(address)) as thestreet, count( mid(address, locate(' ', address)+1, length(address))) as total
from data
group by mid(address, locate(' ', address)+1, length(address))
order by count(mid(address, locate(' ', address)+1, length(address))) desc
limit 0,10
And here is the result:
Finally, for the heck of it, I used SQL to find the average longitude and latitude of all 8600+ incidents to map out the most dangerous spot in Lafayette. Before anyone says it - yeah - I know that's not really accurate, but it was fun as heck.
All of this was fun - but really unnecessary. Back in January when I first built this, I contacted the police department via their web site to ask about an XML export, but I never heard back.
Edit Thanks to Jason Fisher for the idea and Andrew Powell for the help. I uploaded my data to SpatialKey and created a few heat maps. Their service is amazing. The following thumbnails link to larger images (very large, so click with caution on a mobile device). I just spent a few minutes on this so I'm sure the tool could be used better than what I did, but I'm incredibly impressed by the product.