After my presentation last week I had a few ColdFusion/Solr questions to follow up on. Here are two of them.
- Can you use Solr with content indexed on Amazon S3?
Yes and no. The main answer is no. The code below is what I used to test:
<cfset s3dir = "s3://myaccess:mysecret@s3.coldfusionjedi.com">
<cfdirectory directory="#s3dir#" name="files">
<cfdump var="#files#">
<cfoutput>Indexing #s3dir#<p></cfoutput>
<cfindex action="update" collection="indextest1" type="path" key="#s3dir#" recurse="true" status="result" extensions=".txt,.pdf">
<cfdump var="#result#" label="Result of update operation">
When run, you get: The key specified is not a directory: s3://myaccess:mysecret@s3.coldfusionjedi.com. The path in the key attribute must be a directory when type="path".
Obviously "myaccess" and "mysecret" were real values when I ran this, but nonetheless, it isn't supported. I'm not terribly surprised by this. ColdFusion speaks to Solr and asks it to index a folder, but in this case the folder is only 'reachable' via ColdFusion. However, you can still make use of both S3 and Solr indexing. Whenever you move a file to S3, simply run the index operation first: let Solr index the local file, and then push it off to S3.
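As a rough sketch of that index-first approach, something like the following could work. The local path and file name here are made up for illustration, and I'm assuming cffile can copy to an s3:// destination the same way cfdirectory can read from one.

```cfml
<!--- Hypothetical local file and S3 destination; swap in your own values --->
<cfset localFile = expandPath("./docs/report.pdf")>
<cfset s3Target = "s3://myaccess:mysecret@s3.coldfusionjedi.com/report.pdf">

<!--- Index the file first, while it is still on a disk that Solr can read --->
<cfindex action="update" collection="indextest1" type="file" key="#localFile#" status="result">
<cfdump var="#result#" label="Result of update operation">

<!--- Then push the file off to S3 --->
<cffile action="copy" source="#localFile#" destination="#s3Target#">
```

Keep in mind that the key stored in the index will be the local path, not the S3 location, so your search results page would need to map one to the other.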
- Can you index a file and a db record together in the same search "row"? I know Solr can handle it if you roll the code manually, but can this be done with the CF tags?
Again, yes and no. The tag that indexes file-based data and query-based data (cfindex) can only handle one type at a time, so you can't do this with a single tag pointed at both sources. However, if you read and parse the file yourself (for example, using cfpdf to read in the text of a PDF), you can then merge that text with any other database data when you add it to the index. I'm not sure how useful this would be. I could see merging file data with database information stored in the custom fields, though.
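Here is a minimal sketch of that idea, not something I've run against a real collection. The datasource (mydsn), the documents table and its columns, and the file path are all hypothetical. It assumes cfpdf's extracttext action with type="string" to get plain text, and a custom-type cfindex call to add the merged record.

```cfml
<!--- Hypothetical file path, datasource, and table; adjust to your setup --->
<cfset pdfFile = expandPath("./docs/report.pdf")>

<cfquery name="doc" datasource="mydsn">
    select id, title, summary, author
    from documents
    where id = <cfqueryparam cfsqltype="cf_sql_integer" value="1">
</cfquery>

<!--- Read the PDF's text into a variable (plain text rather than the default XML) --->
<cfpdf action="extracttext" source="#pdfFile#" type="string" name="pdfText">

<!--- Merge the db text and file text into one body; store extra db values in custom fields --->
<cfindex action="update" collection="indextest1" type="custom"
    key="#doc.id#" title="#doc.title#"
    body="#doc.summary# #pdfText#"
    custom1="#doc.author#" custom2="#pdfFile#">
```

The result is one search "row" per record, with the file contents and the database text both searchable in the body and the custom fields available for filtering or display.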