April 11 2018 - Checking Out Microsoft Cognitive Services

Checking Out Microsoft Cognitive Services with PHP

AI? Huh?? Oh.... yeah.. I've heard of that! That's that movie where the Skynet becomes sentient and attempts to destroy mankind, right? Scary stuff.

What's that you say? Artificial intelligence won't kill me?! Huh.. OK - well, maybe I'll stop smoking a pack of cigarettes a day then. I kid.. I don't smoke - sorry if you do (if you are still alive and able to read this).

I was listening to the ShopTalk Show podcast the other day and heard about this crazy new(?) thing that Microsoft and some of the other big data players are cooking up.

Cognitive Services is a form of artificial intelligence: the models behind it are first trained on a controlled dataset and then get smarter and more accurate as they learn from more data.

Naturally, I was just curious enough to see if I could get some semi-decent data out of the thing. Here's what I did.

Really Really Really Simple Example in PHP

First off, choose which service you'd like to check out. There are lots. I was interested in the Vision and Face Verification APIs.

I don't want to share my API keys with you, so you'll have to go and get your own (one for each cognitive API that you'd like to check out). For the Microsoft Vision API, poke your nose around here: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/howtosubscribe

Once you have your API keys set up, you can view all of them at this URL: https://azure.microsoft.com/en-us/try/cognitive-services/my-apis/

I was a little confused about how to get set up with the keys but, with enough poking around, I was able to acquire some 😄. If there is an easier way that I just don't know about, feel free to comment below.

Note that, as of April 2018, your trial API keys will expire after 30 days. After that, you'll have to sign up, but the costs are pretty low.
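Since the keys are secret, one option is to keep them out of your scripts and read them from an environment variable instead. Here's a minimal sketch of that idea. The variable name AZURE_VISION_KEY and the getApiKey() helper are made up for this example, so call them whatever you like.

```php
<?php

// Hypothetical helper: read the subscription key from an environment
// variable instead of hard-coding it in the script.
// The default variable name is an assumption, not an Azure convention.
function getApiKey(string $name = 'AZURE_VISION_KEY'): string
{
    $key = getenv($name);

    // getenv() returns false when the variable is not set
    if ($key === false || $key === '') {
        throw new RuntimeException("Environment variable $name is not set");
    }

    return $key;
}
```

You would then set the variable in your shell before running a script that calls getApiKey(), for example: AZURE_VISION_KEY=your-key php example.php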

Vision API Example in PHP

Let's start with an image on the Internet. Just pick one. Any one... I'll wait... Actually.. um, how about this one:

[Image: a man and a woman posing for a photo]

Cool, ok.. Let's create a script that will feed that image to the MS Cognitive Services!

<?php

$body = '{"url": "https://static1.squarespace.com/static/5a149cf7e9bfdfa73ac97b97/5a3efb13652dea3131f3f32e/5a3efb43f9619a3098b96602/1514076997349/MenageATrois_MG_6660.jpg?format=500w"}';

// Build the stream context options for the POST request
$opts = [
  "http" => [
    "method" => "POST",
    // Every header line must end with \r\n, including Content-Length
    "header" => "Ocp-Apim-Subscription-Key: YOUR_API_KEY_GOES_HERE\r\n" .
      "Content-Type: application/json\r\n" .
      sprintf("Content-Length: %d\r\n", strlen($body)),
    "content" => $body
  ]
];

$context = stream_context_create($opts);

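// Send the request with the context above and read the JSON response
$json = file_get_contents('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories,Tags,Description,Faces,Adult&language=en', false, $context);

// file_get_contents() returns false if the request fails
if ($json === false) {
  exit("Request to the Vision API failed" . PHP_EOL);
}

echo $json.PHP_EOL;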

If we save that script with a name like example.php, we can then call it from the command line.

php example.php

Which returns JSON from Cognitive Services with pretty accurate information about our photo!

{
  "categories": [
    { "name": "others_", "score": 0.00390625 },
    { "name": "people_", "score": 0.69140625, "detail": { "celebrities": [] } }
  ],
  "tags": [
    { "name": "person", "confidence": 0.99869954586029053 },
    {
      "name": "standing",
      "confidence": 0.92608910799026489
    },
    { "name": "posing", "confidence": 0.86872708797454834 }
  ],
  "description": {
    "tags": [
      "person",
      "standing",
      "posing",
      "photo",
      "woman",
      "man",
      "white",
      "black",
      "couple",
      "young",
      "dress",
      "wedding",
      "holding",
      "wearing",
      "cake",
      "people",
      "group",
      "room"
    ],
    "captions": [
      {
        "text": "a man and a woman posing for a photo",
        "confidence": 0.88769173848231653
      }
    ]
  },
  "faces": [
    {
      "age": 21,
      "gender": "Male",
      "faceRectangle": { "top": 94, "left": 240, "width": 82, "height": 82 }
    },
    {
      "age": 27,
      "gender": "Female",
      "faceRectangle": { "top": 114, "left": 317, "width": 80, "height": 80 }
    }
  ],
  "adult": {
    "isAdultContent": false,
    "adultScore": 0.19785866141319275,
    "isRacyContent": false,
    "racyScore": 0.20760928094387054
  },
  "requestId": "edadd256-4a47-4717-9e08-fbb3d174658c",
  "metadata": { "height": 667, "width": 500, "format": "Jpeg" }
}

Yum yum yum!!! Look at all that tasty data! For example, the "tags" section contains person, standing, and posing, and the confidence is very high for each of those tags (0.87 or better) 😄
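If you only want the tags the service is really sure about, you can decode the JSON and filter on confidence. This is just a sketch: confidentTags() is a helper invented for this example, the 0.9 threshold is arbitrary, and the sample JSON is a trimmed copy of the "tags" section from the response above.

```php
<?php

// A trimmed copy of the "tags" section from the response above.
$json = '{"tags": [
  {"name": "person", "confidence": 0.99869954586029053},
  {"name": "standing", "confidence": 0.92608910799026489},
  {"name": "posing", "confidence": 0.86872708797454834}
]}';

// Hypothetical helper: names of all tags at or above a confidence threshold.
// Returns an empty array if the JSON is invalid or has no "tags" section.
function confidentTags(string $json, float $minConfidence): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['tags'])) {
        return [];
    }

    $names = [];
    foreach ($data['tags'] as $tag) {
        if ($tag['confidence'] >= $minConfidence) {
            $names[] = $tag['name'];
        }
    }

    return $names;
}

// prints "person, standing"
echo implode(', ', confidentTags($json, 0.9)) . PHP_EOL;
```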

If we look at description.tags I would say that those are all pretty good suggestions also.

The description.captions[0].text value could be used to create automatic alt attributes for the images in your HTML. Just a thought for a practical use case of this tech.
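Here's roughly what that alt attribute idea could look like. Treat it as a sketch: imgTagWithCaption() is a helper made up for this example, photo.jpg is a placeholder file name, and the sample JSON is a trimmed copy of the "description" section from the response above.

```php
<?php

// A trimmed copy of the "description" section from the response above.
$json = '{"description": {"captions": [
  {"text": "a man and a woman posing for a photo", "confidence": 0.88769173848231653}
]}}';

// Hypothetical helper: build an <img> tag whose alt text is the first caption.
// Falls back to an empty alt attribute when no caption is present.
function imgTagWithCaption(string $src, string $json): string
{
    $data = json_decode($json, true);
    $alt = $data['description']['captions'][0]['text'] ?? '';

    // Escape both values so they are safe inside HTML attributes
    return sprintf(
        '<img src="%s" alt="%s">',
        htmlspecialchars($src, ENT_QUOTES),
        htmlspecialchars($alt, ENT_QUOTES)
    );
}

// prints: <img src="photo.jpg" alt="a man and a woman posing for a photo">
echo imgTagWithCaption('photo.jpg', $json) . PHP_EOL;
```

The caption also comes with its own confidence value, so you may want to skip low-confidence captions rather than publish a wrong description.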

One potential flaw we can see here is that the faces section only contains 2 records. This is strange because we have 3 faces in our example photograph. Also, the woman's age has been guessed a little high at 27.
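To spot a miss like that in code, you can count the entries in the faces section and print what was found for each one. A small sketch, using the "faces" section from the response above; describeFaces() is a helper invented for this example.

```php
<?php

// The "faces" section from the response above.
$json = '{"faces": [
  {"age": 21, "gender": "Male",
   "faceRectangle": {"top": 94, "left": 240, "width": 82, "height": 82}},
  {"age": 27, "gender": "Female",
   "faceRectangle": {"top": 114, "left": 317, "width": 80, "height": 80}}
]}';

// Hypothetical helper: one summary line per detected face.
function describeFaces(string $json): array
{
    $data = json_decode($json, true);

    $lines = [];
    foreach ($data['faces'] ?? [] as $face) {
        $box = $face['faceRectangle'];
        $lines[] = sprintf(
            '%s, age %d, at (%d, %d), %dx%d px',
            $face['gender'],
            $face['age'],
            $box['left'],
            $box['top'],
            $box['width'],
            $box['height']
        );
    }

    return $lines;
}

$faces = describeFaces($json);
echo count($faces) . ' face(s) detected' . PHP_EOL;
echo implode(PHP_EOL, $faces) . PHP_EOL;
```

If you already know how many people should be in the photo, comparing that number to count($faces) is a cheap way to flag images where a face was missed.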

Now just close your eyes and picture that scene in The Terminator where Arnold is looking for Sarah Connor: he scans people's faces and other pieces of the real world, and the data values come up on his internal HUD like:

Gender: Female
Age: 27
Height: 170cm
Target match: false

In the future the Skynet could use this technology against us but for the time being, I think it's pretty cool.
