Before I begin, a quick disclaimer. I've been at IBM for a grand total of five days. Considering three were taken up by travel and orientation, I'm very much the new kid on the block here. I've only begun to look into MobileFirst and Bluemix so you should take what I show here with the same confidence you would give anyone using a new technology for two days. In other words - proceed with caution! ;)
Bluemix is a cloud platform that offers both Platform as a Service (PaaS) and Mobile Backend as a Service (MBaaS). The PaaS offerings let you deploy web applications using a variety of engines, including Node. I currently use AppFog as my PaaS for a few sites, so I'm already a bit familiar with the concept. The MBaaS side consists of services that integrate with your web or mobile apps. These include things like NoSQL data storage and access to IBM Watson.
Yesterday we announced five new services that make use of Watson: Speech to Text, Text to Speech, Visual Recognition, Concept Insights, and Tradeoff Analytics.
As soon as I saw the Visual Recognition API (demo link) I thought - this would be cool in Cordova!
One thing I wasn't sure of, though, was how to use the service by itself rather than with a particular application. I checked the docs and came across this, from Enabling external applications and third-party tools to use Bluemix services:
You might have applications that were created and run outside of Bluemix, or you might use third-party tools. If Bluemix services provide endpoints that are accessible from the internet, you can use those services with your local apps or third-party tools. To enable an external application or third-party tool to use a Bluemix service, complete the following steps:
- Request an instance of the service for an existing Bluemix application. The credentials and connection parameters for the service are created. For more information about requesting a service instance, see Requesting a new service instance.
- Retrieve the credentials and the connection parameters of the service instance from the VCAP_SERVICES environment variable of the Bluemix application.
- Specify the credentials and the connection parameters in your external application or third-party tool.
My understanding of that was that I needed to make a new Node application and simply create a view that would dump out VCAP_SERVICES. I wished it were easier, but I figured if I did this once, I could reuse the same Node code in the future when I needed to test out another service. Turns out I didn't need to do all those steps. This next section will focus on what you have to do on the Bluemix side to prepare the service. I'll then switch to the Cordova side.
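For what it's worth, here's roughly what that throwaway Node app could have looked like - a sketch only, and as I said, I ended up not needing it:

//A quick sketch of the "dump VCAP_SERVICES" idea described above.
//I never actually deployed this, so treat it as illustrative only.
var http = require('http');

http.createServer(function(req, res) {
    res.writeHead(200, {'Content-Type': 'application/json'});
    //VCAP_SERVICES holds the credentials for every service bound to the app
    res.end(process.env.VCAP_SERVICES || '{}');
}).listen(process.env.VCAP_APP_PORT || 3000); //Bluemix supplies the port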
After signing up for Bluemix (you can get a free, 30-day trial), the first thing you want to do is create an application. You won't actually be using the application, but it's necessary in order to get the proper credentials to use the service.
On the next screen I selected Web. This may be something I change next time I do this.
And then I selected the Node option. Since I really wasn't using the app, I probably could have selected "I Have Code Already."
For the application name, pick anything. If you pick "RayIsTheBestNewIBMer" you get double the time for your free trial. (Note - the preceding statement may not be exactly true.)
On the next screen, select Add a Service:
And then - obviously - select the Visual Recognition service. Note the beta label. Results may vary. Yada, yada, yada.
Since you added this service from an app, it will be automatically selected in the next screen. (You may not have a "Space" though. I made a few while testing and I don't remember if it is required or not.)
You'll get a warning about needing to restage the application, but since you don't have anything there anyway you can just go ahead and let it restart the app. Woot - almost done. Now we need the authentication and API info. Back on the app dashboard, note there is a link to show credentials for the service:
Clicking that will expose properties for the service including the ones we care about: url, username, and password:
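For reference, the credentials blob looks something like this (values faked, obviously):

{
    "url": "https://gateway.watsonplatform.net/visual-recognition-beta/api",
    "username": "supersecret",
    "password": "anothersecret"
}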
Ok, we've got what we came for, let's talk about the Cordova side. I began by creating a simple application that would make use of the Camera API. I created a web page with two buttons - one to source from the device camera and one from the photo gallery. Once I had the image file, I would then make use of the File Transfer plugin to post to the API.
Documentation for the Visual Recognition API may be found here: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/apis/#!/visual-recognition/. I focused my attention on the POST call to /v1/tag/recognize. I saw there that I needed to send the image under the field name img_file. Let's take a look at the code (this portion and the rest may be found at the Github URL I'll share towards the end):
var API_URL = "https://gateway.watsonplatform.net/visual-recognition-beta/api";
var API_USER = "supersecret";
var API_PASSWORD = "anothersecret";
$(document).on("deviceready", function() {

    function uploadWin(res) {
        var data = JSON.parse(res.response);
        var labels = data.images[0].labels;
        var result = "<p>Detected the following possible items:<br/>";
        for(var i=0, len=labels.length; i<len; i++) {
            result += "<b>" + labels[i].label_name + "</b><br/>";
        }
        $("#status").html(result);
    }

    function uploadFail() {
        console.log('uploadFail');
        console.dir(arguments);
    }

    //Credit: http://stackoverflow.com/a/14313052/52160
    function authHeaderValue(username, password) {
        var tok = username + ':' + password;
        var hash = btoa(tok);
        return "Basic " + hash;
    }

    function onCamSuccess(imageData) {
        //Show the user the picture, then upload it for analysis
        $("#imgDisplay").attr("src", imageData);
        $("#status").html("<i>Uploading picture for BlueMix analysis...</i>");

        var options = new FileUploadOptions();
        //The API expects the image under the field name img_file
        options.fileKey = "img_file";
        options.fileName = imageData.substr(imageData.lastIndexOf('/') + 1);
        options.headers = {'Authorization': authHeaderValue(API_USER, API_PASSWORD)};

        var ft = new FileTransfer();
        ft.upload(imageData, encodeURI(API_URL + "/v1/tag/recognize"), uploadWin, uploadFail, options);
    }

    function onCamFail(message) {
        alert('Failed because: ' + message);
    }

    //Touch handlers for the two buttons, one uses lib, one uses cam
    $("#cameraButton, #galleryButton").on("touchend", function() {
        var source = ($(this).prop("id") === "cameraButton") ? Camera.PictureSourceType.CAMERA : Camera.PictureSourceType.PHOTOLIBRARY;
        navigator.camera.getPicture(onCamSuccess, onCamFail, {
            quality: 50,
            sourceType: source,
            destinationType: Camera.DestinationType.FILE_URI
        });
    });

});
I assume the camera usage isn't new for most of my readers. The only "fancy" thing I do here is switch my source based on the button clicked. As soon as I have the image, I render it in the app so the user can see it. I then begin my file transfer, using the URL I got from the service and making sure to include my authorization info. I'm using fake values in the code above, of course. (More on that in a minute.)
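To make the parsing in uploadWin clearer, the response body looks roughly like this. I've reconstructed it from what the handler reads and faked the values; the real response includes more fields:

{
    "images": [{
        "labels": [
            { "label_name": "Person", "label_score": "0.98" },
            { "label_name": "Indoors", "label_score": "0.91" }
        ]
    }]
}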
Once done, I take the result and simply output the labels to the app. The service returns labels with an underscore between words; that could be cleaned up, but I didn't bother. Here are a few samples. The results aren't perfect, of course, but they're close and can be improved. First, a scary picture of myself.
I'm not quite sure where Combat Sport came from, but I'm all about the combat sport. Meat Eater is dead on too. On a serious note, human, indoors, and person view were perfect.
Each label returns a score, and you could use that to filter out items that score too low. I'm not doing that in this demo, but it would help improve the results shown to the user.
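If you wanted to do that filtering, it could be as simple as the snippet below. Note that I'm assuming the score field is named label_score and arrives as a string - double-check against the actual response.

//Hypothetical filter - keep only labels scored above some threshold.
//The 0.9 cutoff is made up; you'd tune it by testing.
var confident = labels.filter(function(label) {
    return parseFloat(label.label_score) > 0.9;
});

Ok, another test.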
Personally I think toothed whale is pretty damn funny. It didn't recognize shark, but it got close, which I think is pretty good.
Another option the API supports is including labels to narrow down the search. One could use that to help direct the results as well.
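I didn't try that myself, but with the File Transfer plugin it would presumably just be another entry in the upload options - something like the line below. I'm guessing at both the parameter name and the value format here, so check the API docs linked above for the real ones.

//Hypothetical - both the field name and the format are guesses on my part
options.params = { labels_to_check: JSON.stringify(["shark", "fish"]) };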
So - the obvious issue we have with this demo is that the username and password are available in the source code. One way around that would be to actually use the Node application and have it work as a proxy. That would let me log, monitor, restrict to authorized users, etc.
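Here's a rough sketch of what that proxy could look like, using Express and the request module. It's purely illustrative - no logging, user checks, or error handling - but it shows the idea of keeping the credentials server-side:

//A minimal proxy sketch. The Cordova app would post to /recognize on
//this server instead of hitting Watson directly, so the username and
//password never ship inside the app. Assumes Express and request.
var express = require('express');
var request = require('request');
var app = express();

var API_URL = "https://gateway.watsonplatform.net/visual-recognition-beta/api";
var API_USER = process.env.WATSON_USER;
var API_PASSWORD = process.env.WATSON_PASSWORD;

app.post('/recognize', function(req, res) {
    //Pipe the incoming multipart upload straight through to Watson,
    //attaching the credentials on the server side
    req.pipe(request.post(API_URL + '/v1/tag/recognize', {
        auth: { user: API_USER, pass: API_PASSWORD }
    })).pipe(res);
});

app.listen(process.env.PORT || 3000);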
I've put the source code for this up on my Github repo of Cordova demos. You can find it here: https://github.com/cfjedimaster/Cordova-Examples/tree/master/imagerecognitionbluemix