A few months ago I wrote up a quick little demo to help me test multiple image recognition services at once (Testing Multiple Image Recognition Services at Once). This morning I was bored so I thought I'd quickly add a new test to the suite - Amazon Rekognition.
As before, I'm not trying to test out every aspect of the service, but rather hit the high points so it's easy to see - in comparison - how it handles the same input image as other services. Before I get into the Rekognition part, I did some other small tweaks to my code as well.
Services can now be disabled a bit easier by modifying this part of index.js:
services.push(google.doProcess(theFile, creds.google));
services.push(ibm.doProcess(theFile, creds.ibm));
services.push(microsoft.doProcess(theFile, creds.microsoft));
services.push(amazon.doProcess(theFile, creds.amazon));
It isn't necessarily rocket science, but if you comment out one of the above lines, the test for that service won't be run. The front end will correctly recognize this and simply not render the result. I had to do this as my Microsoft key timed out and I was too lazy to renew it. (Sorry Microsoft, I still love ya!)
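If commenting lines in and out gets tedious, the same idea could be driven by a small map of flags. This is only a sketch: the `enabled` object and the `buildServices` helper are hypothetical names that are not in the repo, and it assumes each service module exposes `doProcess(file, creds)` as in the lines above. Stand-in modules are used so it runs without any API keys.

```javascript
// Hypothetical helper: not part of the original index.js.
// `modules` maps a service name to its tester module.
// `enabled` maps the same names to true/false.
function buildServices(modules, theFile, creds, enabled) {
    return Object.keys(modules)
        .filter(name => enabled[name])
        .map(name => modules[name].doProcess(theFile, creds[name]));
}

// Stand-in modules that mimic the doProcess(file, creds) shape.
const fakeModule = name => ({
    doProcess: (file, auth) => Promise.resolve({ [name]: { file: file, auth: auth } })
});

const modules = {
    google: fakeModule('google'),
    microsoft: fakeModule('microsoft'),
    amazon: fakeModule('amazon')
};
const enabled = { google: true, microsoft: false, amazon: true };

const services = buildServices(modules, 'test.jpg', { google: 'g', amazon: 'a' }, enabled);
console.log(services.length); // 2, since microsoft is switched off
```

The resulting array can be handed to `Promise.all` exactly like the hand-built `services` array.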
So let's talk Rekognition. In general, the most difficult part of this API was setting up the right credentials. That's a common theme with me and AWS, but it feels like it's getting a bit easier now, so I expect it to be simpler in the future. The other issue I ran into was documentation. The links under the Rekognition Developers page lead you to the JavaScript SDK docs, which are generic for all of AWS. I had trouble finding SDK docs specific to Rekognition until I came across the reference link: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Rekognition.html. This is the link you want to bookmark.
Another thing to note is that the Rekognition service wants either an S3 path (no surprise there) or the actual bits of the image, up to 5 megs. Here's my complete Amazon tester module.
var fs = require('fs');
var AWS = require('aws-sdk');

// http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Rekognition.html
/*
Rekognition calls used below:
detectFaces
detectLabels
detectModerationLabels
recognizeCelebrities
*/

function doProcess(path, auth) {

    var creds = new AWS.Credentials(auth.accessKeyId, auth.secretAccessKey);
    var myConfig = new AWS.Config({
        credentials: creds, region: auth.region
    });

    let recog = new AWS.Rekognition(myConfig);

    // Send the raw image bytes rather than an S3 reference
    let params = {};
    let content = fs.readFileSync(path);
    params.Image = {Bytes: content};

    // Each call is wrapped in a promise; return on error so resolve() is not also called
    let faces = new Promise((resolve, reject) => {
        recog.detectFaces(params, function(err, data) {
            if(err) return reject(err);
            resolve(data);
        });
    });

    let labels = new Promise((resolve, reject) => {
        recog.detectLabels(params, function(err, data) {
            if(err) return reject(err);
            resolve(data);
        });
    });

    let modlabels = new Promise((resolve, reject) => {
        recog.detectModerationLabels(params, function(err, data) {
            if(err) return reject(err);
            resolve(data);
        });
    });

    let celebs = new Promise((resolve, reject) => {
        recog.recognizeCelebrities(params, function(err, data) {
            if(err) return reject(err);
            resolve(data);
        });
    });

    console.log('Attempting Amazon recog');
    // Returning the chain directly lets a failed call reject instead of hanging forever
    return Promise.all([faces, labels, modlabels, celebs]).then(values => {
        let result = {
            faces:values[0],
            labels:values[1],
            modlabels:values[2],
            celebs:values[3]
        };
        return {"amazon":result};
    });

}

module.exports = { doProcess }
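The module above always sends raw bytes. As a rough sketch of handling both input styles mentioned earlier, the helper below builds the `Image` parameter from either a local file or an S3 location. `buildImageParam`, the shape of its `source` argument, and the reading of "5 megs" as 5 * 1024 * 1024 bytes are assumptions for illustration; `Bytes` and `S3Object` (with `Bucket` and `Name`) are the fields of the Rekognition `Image` type.

```javascript
const fs = require('fs');

// Assumption: "up to 5 megs" is treated as 5 * 1024 * 1024 bytes.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

// Hypothetical helper: `source` is either { bucket, key } or { path }.
function buildImageParam(source) {
    if (source.bucket && source.key) {
        // Point Rekognition at an object that already lives in S3
        return { S3Object: { Bucket: source.bucket, Name: source.key } };
    }
    const content = fs.readFileSync(source.path);
    if (content.length > MAX_IMAGE_BYTES) {
        throw new Error('Image is too large to send as bytes: ' + content.length);
    }
    return { Bytes: content };
}

console.log(buildImageParam({ bucket: 'my-bucket', key: 'captain.jpg' }));
```

The returned object would be assigned to `params.Image` before making any of the four calls.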
As I said above, I am not attempting to hit all aspects of the API. Instead I focused on 4 main parts:
- finding general labels
- finding faces
- finding things that you may want to moderate (naughty bits)
- finding celebrities
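The four calls above all follow the same callback-to-promise pattern. As a hedged alternative, version 2 of the AWS SDK for JavaScript lets you call `.promise()` on the request object each method returns, which removes the hand-written wrappers. The `analyze` function below is a hypothetical name, not code from the repo, and it takes the client as an argument so it can be tried with a stub instead of real credentials.

```javascript
// Sketch only: `recog` is expected to be an AWS.Rekognition instance (aws-sdk v2),
// whose request objects expose .promise().
function analyze(recog, params) {
    const calls = {
        faces: recog.detectFaces(params).promise(),
        labels: recog.detectLabels(params).promise(),
        modlabels: recog.detectModerationLabels(params).promise(),
        celebs: recog.recognizeCelebrities(params).promise()
    };
    const names = Object.keys(calls);
    // Any single failure rejects the combined promise
    return Promise.all(names.map(name => calls[name])).then(values => {
        const result = {};
        names.forEach((name, i) => { result[name] = values[i]; });
        return { amazon: result };
    });
}

// Stub client so the sketch runs without AWS access.
const stub = {};
['detectFaces', 'detectLabels', 'detectModerationLabels', 'recognizeCelebrities'].forEach(method => {
    stub[method] = () => ({ promise: () => Promise.resolve({ from: method }) });
});

const pending = analyze(stub, {});
pending.then(result => console.log(Object.keys(result.amazon)));
```

With a real client you would pass `new AWS.Rekognition(myConfig)` and the same `params` object the module builds.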
I'm not going to share the 'render' code as it follows the same format as the others. (I do have some thoughts on the front end though - I'll share that at the end.) But let's consider some examples.
First, consider the Captain:
And here are the results:
Faces
Note, this report is not showing: BoundingBox, Landmarks (location), or Pose
Brightness: 57.73588943481445
Sharpness: 99.98487854003906
Landmarks found: eyeLeft eyeRight nose mouthLeft mouthRight

Labels
People (confidence: 99.27647399902344)
Person (confidence: 99.2764892578125)
Human (confidence: 99.27130126953125)
Art (confidence: 54.227542877197266)
Chair (confidence: 50.68247985839844)
Furniture (confidence: 50.68247985839844)
Face (confidence: 50.529090881347656)
Selfie (confidence: 50.529090881347656)

Moderation Labels
No moderation labels.

Celebrities
Note, this report includes unrecognized faces, but I believe it is the same as the Face report so they will not be displayed below. Also, I'm hiding the same information (BoundingBox, etc) for celebs.
Name: Patrick Stewart
www.imdb.com/name/nm0001772
Brightness:
Sharpness:
Landmarks found:
Pretty good results if you ask me. I also tried this picture of him:
And it still recognized him as Patrick Stewart. Be sure to note (as I say in the results) that I'm not displaying the specific face location data. That's definitely returned and would help you narrow in on the actual face image (as well as the 'landmarks').
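As a rough illustration of using that location data: Rekognition reports each `BoundingBox` as `Left`, `Top`, `Width`, and `Height` ratios of the overall image size, so the values need scaling before you can crop or draw anything. The `boxToPixels` helper, the sample box, and the 800x600 image size below are made up for the example.

```javascript
// Hypothetical helper: converts a ratio-based BoundingBox into pixel values.
function boxToPixels(box, imageWidth, imageHeight) {
    return {
        left: Math.round(box.Left * imageWidth),
        top: Math.round(box.Top * imageHeight),
        width: Math.round(box.Width * imageWidth),
        height: Math.round(box.Height * imageHeight)
    };
}

// Made-up box for an 800x600 image; real values come from each entry
// in the FaceDetails array of a detectFaces response.
const sampleBox = { Left: 0.25, Top: 0.1, Width: 0.5, Height: 0.5 };
const pixels = boxToPixels(sampleBox, 800, 600);
console.log(pixels); // { left: 200, top: 60, width: 400, height: 300 }
```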
When using a picture of me (found here), it noticed I had a beard, but thought there was a 50% chance I had a cap or hat. It also didn't recognize me as a celebrity, which is technically correct unfortunately.
Given Sinistar...
It had some interesting results:
Labels
Flyer (confidence: 95.81148529052734)
Poster (confidence: 95.81148529052734)
Logo (confidence: 74.65271759033203)
Trademark (confidence: 74.65271759033203)
American Flag (confidence: 73.34989166259766)
Emblem (confidence: 73.34989166259766)
Flag (confidence: 73.34989166259766)
Brochure (confidence: 70.04080963134766)
Badge (confidence: 69.28379821777344)
Greeting Card (confidence: 63.669036865234375)
Mail (confidence: 63.669036865234375)
Art (confidence: 61.1774787902832)
Modern Art (confidence: 61.1774787902832)
Text (confidence: 51.11092758178711)
Label (confidence: 50.73374938964844)
I find the flag label interesting. It definitely has a flag-like aspect.
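With this many low-confidence guesses, it can help to trim the list before rendering it. The sketch below filters a `detectLabels`-style response on the client; `topLabels` and the threshold of 70 are arbitrary choices for illustration, and the sample data is a handful of the Sinistar labels above. The `detectLabels` call also accepts a `MinConfidence` parameter if you would rather have the service do the filtering.

```javascript
// Hypothetical helper: keep labels at or above a threshold, highest first.
function topLabels(data, minConfidence) {
    return data.Labels
        .filter(label => label.Confidence >= minConfidence)
        .sort((a, b) => b.Confidence - a.Confidence)
        .map(label => label.Name + ' (' + label.Confidence.toFixed(1) + '%)');
}

// A few of the Sinistar results, in the Name/Confidence shape detectLabels returns.
const sample = {
    Labels: [
        { Name: 'Flyer', Confidence: 95.81148529052734 },
        { Name: 'Logo', Confidence: 74.65271759033203 },
        { Name: 'Flag', Confidence: 73.34989166259766 },
        { Name: 'Text', Confidence: 51.11092758178711 }
    ]
};

const kept = topLabels(sample, 70);
console.log(kept); // [ 'Flyer (95.8%)', 'Logo (74.7%)', 'Flag (73.3%)' ]
```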
Anyway - as before, you can find the source code up on GitHub: https://github.com/cfjedimaster/recogtester.
Of course, the best news is that I think I can rewrite the front end in Vue - because that would be even more fun!