So, I was looking into a possible CFINDEX bug reported on the cf-talk list. The user reported something odd when indexing a query that had duplicate keys in it. In case you don't know - when passing data to Verity to index, each record must have a unique key. This could be the primary key field for your database table. Whatever it is - it needs to be unique. I suggested the user try the status attribute. This is a neat new tool in CFMX7 that allows you to see exactly what happens after an index operation. It can tell you, for example, that even though you passed in 9 records, only 7 were indexed. Further investigation of the data would turn up the duplicate keys.
So, I wanted to look into the user's bug report. I created a query that contained duplicate keys. I'll show the code before I tell you what happened.
<cfset q = queryNew("products,score,key")>
<cfloop index="x" from="1" to="5">
<cfset queryAddRow(q)>
<cfset querySetCell(q, "products", "Product #x#")>
<cfset querySetCell(q, "score", randRange(1,100))>
<cfset querySetCell(q,"key",x)>
</cfloop>
<!--- the above is "good" data --->
<!--- now add "bad" data --->
<cfset queryAddRow(q)>
<cfset querySetCell(q, "products", "Product 5 DUPE1")>
<cfset querySetCell(q, "score", randRange(1,100))>
<cfset querySetCell(q,"key",5)>
<cfdump var="#q#">
<cfindex query="q"
collection="test1"
action="update"
type="custom"
key="key"
title="products"
body="products,score"
status="status"
>
<cfdump var="#status#">
<cfsearch collection="test1" status="search_status" criteria="*" name="results">
<cfdump var="#results#">
<cfdump var="#search_status#">
Don't run this code. Just look at. Looking at the title of this post, try to guess what happened. Take your time. I'll be waiting here.
Back already? If you do actually run this code, you will see that six records get inserted, not five as you may think. Why? Notice the dump of the search, specifically notice the key value:
We have a value for both "5" and "5.0". For some reason, the insert done in the loop was a numeric value and the value done manually was a string. This is hard to debug since a dump of the query shows "5" for both.
However - this is an excellent time to remind folks that when they create queries as I did, with queryNew, you can now provide metadata about the query. If I change my queryNew to this:
queryNew("products,score,key","varchar,integer,integer")
The code works as expected. In general, I try to be as anal as possible when it comes to my code. It's times like this when it actually helps!