Forgive the overly dramatic blog title - just having a bit of fun. A coworker asked me yesterday how one could take a flat XML file and serve it up as JSON. I told him I'd whip up a "quick example" for him and blog it to share with others. Smirking a bit to myself, I imagined I'd be done in 5 minutes and it would be a bit of a fluff piece. My ego ran head first into a brick wall rather quickly - which - I will admit - was kind of fun. Here's what I discovered.
For the hell of it, I thought, why not simply read in the XML, parse it, and serialize it?
<cfset xmlFile = expandPath("./Applications.xml")>
<cfset xmlData = xmlParse(xmlFile)>
<cfset jsonData = serializeJSON(xmlData)>
<cfoutput>#jsonData#</cfoutput>
Looks simple enough, right? However, it appears that serializeJSON takes the 'toString' version of xmlData. Even though it's a ColdFusion XML variable, it's first converted to a string and then passed to serializeJSON. So basically you get a large JSON encoded string. Or as I call it - a Chrome killer. (To be fair, Chrome was nice about shutting down a tab, and it was more the fact that I have a plugin that recognizes and auto formats JSON strings.) So that's not good. What about converting the XML into native ColdFusion structure?
Turns out there is a UDF for that - xmlToJson. It makes use of a XSLT transformation to create JSON. Perfect! And appropriately geeky too, right? I mean, how often do you get a chance to tell folks you performed an XSLT transformation today? Try it at the bar next time. It's a sure win. Here's the template I created to test this (note, the UDF is rather large, so I cut out the innards):
<cfset xmlFile = expandPath("./Applications.xml")>
<cfset xmlData = xmlParse(xmlFile)>
<cffunction name="xmlToJson" output="false" returntype="any" hint="convert xml to JSON">
</cffunction>
<cfset json = xmlToJson(xmlData)>
<cfdump var="#deserializeJSON(json)#">
This worked rather fast - shockingly fast actually - but it throws an error when deserializing. The transformation encountered this - Value=".0". This created a JSON string something like this - "Value":.0. That's not valid JSON. In theory I could have done some string parsing, but that felt dirty, so I went to approach 3: manually creating a structure. I designed this UDF:
function xmlToStruct(xml x) {
var s = {};
if(xmlGetNodeType(x) == "DOCUMENT_NODE") {
s[structKeyList(x)] = xmlToStruct(x[structKeyList(x)]);
}
if(structKeyExists(x, "xmlAttributes") && !structIsEmpty(x.xmlAttributes)) {
s.attributes = {};
for(var item in x.xmlAttributes) {
s.attributes[item] = x.xmlAttributes[item];
}
}
if(structKeyExists(x, "xmlChildren")) {
for(var i=1; i<=arrayLen(x.xmlChildren); i++) {
if(structKeyExists(s, x.xmlchildren[i].xmlname)) {
if(!isArray(s[x.xmlChildren[i].xmlname])) {
var temp = s[x.xmlchildren[i].xmlname];
s[x.xmlchildren[i].xmlname] = [temp];
}
arrayAppend(s[x.xmlchildren[i].xmlname], xmlToStruct(x.xmlChildren[i]));
} else {
s[x.xmlChildren[i].xmlName] = xmlToStruct(x.xmlChildren[i]);
}
}
}
return s;
}
It handles creating a substructure for xml attributes and handling a case where you have 2-N xml children of the same name. (That was the toughest part.) Here's the complete template:
<cfscript>
function xmlToStruct(xml x) {
var s = {}; if(xmlGetNodeType(x) == "DOCUMENT_NODE") {
s[structKeyList(x)] = xmlToStruct(x[structKeyList(x)]);
} if(structKeyExists(x, "xmlAttributes") && !structIsEmpty(x.xmlAttributes)) {
s.attributes = {};
for(var item in x.xmlAttributes) {
s.attributes[item] = x.xmlAttributes[item];
}
} if(structKeyExists(x, "xmlChildren")) {
for(var i=1; i<=arrayLen(x.xmlChildren); i++) {
if(structKeyExists(s, x.xmlchildren[i].xmlname)) {
if(!isArray(s[x.xmlChildren[i].xmlname])) {
var temp = s[x.xmlchildren[i].xmlname];
s[x.xmlchildren[i].xmlname] = [temp];
}
arrayAppend(s[x.xmlchildren[i].xmlname], xmlToStruct(x.xmlChildren[i]));
} else {
s[x.xmlChildren[i].xmlName] = xmlToStruct(x.xmlChildren[i]);
}
}
} return s;
} s = xmlToStruct(xmlData); </cfscript> <cfcontent reset="true" type="application/json"><cfoutput>#serializeJSON(s)#</cfoutput>
<cfset xmlFile = expandPath("./Applications.xml")>
<cfset xmlData = xmlParse(xmlFile)>
This worked rather fast (2-3 seconds for a 2 meg XML file), but we could make it even faster by cutting out the file read and parsing after we've done it one time:
<cfset xmlFile = expandPath("./Applications.xml")>
<cfset xmlData = xmlParse(xmlFile)> <cfscript>
function xmlToStruct(xml x) {
var s = {}; if(xmlGetNodeType(x) == "DOCUMENT_NODE") {
s[structKeyList(x)] = xmlToStruct(x[structKeyList(x)]);
} if(structKeyExists(x, "xmlAttributes") && !structIsEmpty(x.xmlAttributes)) {
s.attributes = {};
for(var item in x.xmlAttributes) {
s.attributes[item] = x.xmlAttributes[item];
}
} if(structKeyExists(x, "xmlChildren")) {
for(var i=1; i<=arrayLen(x.xmlChildren); i++) {
if(structKeyExists(s, x.xmlchildren[i].xmlname)) {
if(!isArray(s[x.xmlChildren[i].xmlname])) {
var temp = s[x.xmlchildren[i].xmlname];
s[x.xmlchildren[i].xmlname] = [temp];
}
arrayAppend(s[x.xmlchildren[i].xmlname], xmlToStruct(x.xmlChildren[i]));
} else {
s[x.xmlChildren[i].xmlName] = xmlToStruct(x.xmlChildren[i]);
}
}
} return s;
} cachedJSON = serializeJSON(xmlToStruct(xmlData)); </cfscript> <cfset cachePut("jsonstr", cachedJSON)> </cfif> <cfcontent reset="true" type="application/json"><cfoutput>#cachedJSON#</cfoutput>
<cfset cachedJSON = cacheGet("jsonstr")>
<cfif isNull(cachedJSON)>
This version makes use of ColdFusion 9's built in caching to store the JSON string. On the first hit it takes about 3 seconds to render in the browser. After that it's almost immediate. The cache has no expiration, but it could be updated to timeout after a certain amount of time, or, you could note the date stamp on the file and store both that and the filename as a key to your cache.