Using Microsoft Computer Vision API with curl

Microsoft provides a powerful online Computer Vision API that analyzes images and returns JSON data describing, for example:
  • Objects on the image with bounding boxes and confidence:
    "rectangle":{"x":25,"y":132,"w":954,"h":776},"object":"fountain","confidence":0.561
  • Tags describing the image:
    "name":"nature","confidence":0.999956488609314
  • Categories with general description:
    "name":"outdoor_water","score":0.9921875
  • Captions with text description of the image:
    "text":"a large waterfall over a rocky cliff","confidence":0.91645835840611234
The full list of features is the following:
  • Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
  • Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.
  • Categories - categorizes image content according to a taxonomy defined in documentation.
  • Color - determines the accent color, dominant color, and whether an image is black and white.
  • Description - describes the image content with a complete sentence in supported languages.
  • Faces - detects if faces are present. If present, generates coordinates, gender and age.
  • ImageType - detects if an image is clip art or a line drawing.
  • Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English.
  • Tags - tags the image with a detailed list of words related to the image content. 
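
The names in this list are what a request passes, comma-separated, in the visualFeatures query parameter (as in the curl command further down). Here is a minimal Python sketch of building that value. The helper name visual_features_param is made up for illustration and is not part of any Microsoft library, and the check against the list is only a local convenience.

```python
# Feature names taken from the list above.
SUPPORTED_FEATURES = (
    "Adult", "Brands", "Categories", "Color", "Description",
    "Faces", "ImageType", "Objects", "Tags",
)


def visual_features_param(features):
    """Return the comma-separated value for the visualFeatures query parameter."""
    unknown = [name for name in features if name not in SUPPORTED_FEATURES]
    if unknown:
        # Fail early instead of sending a request the server would reject.
        raise ValueError("unsupported features: " + ", ".join(unknown))
    return ",".join(features)


print(visual_features_param(["Categories", "Description", "Objects", "Tags"]))
# prints: Categories,Description,Objects,Tags
```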
 
To process a single image, you can run it from the browser:
on the demo page you can upload an image, analyze it, and copy the description manually.


If you need to process many images, you can do it with remote calls to the server.

The simplest way is to use the REST API with curl.

(curl is a command-line tool for transferring data to and from servers; here it sends an HTTP POST request.)

Here we describe using it on Windows.

 
First, you need an API key. The trial key is free for 7 days and allows processing 5000 images at a rate of up to 20 images per minute.

(To check your key, you may use the online API console.)



Now, create a BAT file with the command:
 
curl -k -H "Ocp-Apim-Subscription-Key: <your_key_here>" -H "Content-Type: application/json" "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description,Objects,Tags&details=Landmarks&language=en" -d "{\"url\":\"http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg\"}" > 0001.txt

  • -k - disables certificate verification, which suppresses SSL certificate errors for the Microsoft server
  • <your_key_here> - replace this with your key
  • Categories,Description,Objects,Tags - features to analyze (see the full list of supported features above)
  • http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg - image to process
  • > 0001.txt - writes the result to the file 0001.txt
Running this, you obtain the file 0001.txt:

{"categories":[{"name":"outdoor_water","score":0.9921875,"detail":{"landmarks":[]}}],
"tags":[{"name":"nature","confidence":0.999956488609314},{"name":"water","confidence":0.99656963348388672},{"name":"waterfall","confidence":0.99627882242202759},{"name":"outdoor","confidence":0.99618977308273315},{"name":"rock","confidence":0.97543877363204956},{"name":"mountain","confidence":0.9326665997505188},{"name":"rocky","confidence":0.82317483425140381},{"name":"cascade","confidence":0.60189425945281982},{"name":"hillside","confidence":0.314908504486084}],"description":{"tags":["nature","water","waterfall","outdoor","rock","mountain","rocky","grass","hill", "covered","hillside","standing","side","group","walking","white","man", "large","snow","grazing","forest","slope","herd","river","giraffe","field"], 
"captions":[{"text":"a large waterfall over a rocky cliff", "confidence":0.91645835840611234}]}, 
"objects":[{"rectangle":{"x":25,"y":132,"w":954,"h":776},"object":"fountain","confidence":0.561}],"requestId":"2ed01477-5307-4f27-91cd-e3d948ed3137","metadata":{"width":1280,"height":959,"format":"Jpeg"}}

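A result like the one above can be read back with any JSON parser. Here is a minimal Python sketch that picks out the best caption, the confident tags, and the detected objects. The function name summarize and the 0.9 confidence threshold are arbitrary choices for illustration, and the sample is an abridged copy of the response shown above.

```python
import json


def summarize(result, min_confidence=0.9):
    """Extract the best caption, confident tags, and objects from a parsed response."""
    captions = result.get("description", {}).get("captions", [])
    best = max(captions, key=lambda c: c["confidence"], default=None)
    return {
        "caption": best["text"] if best else None,
        "tags": [t["name"] for t in result.get("tags", [])
                 if t["confidence"] >= min_confidence],
        "objects": [(o["object"], o["rectangle"]) for o in result.get("objects", [])],
    }


# Abridged copy of the response shown above.
sample = json.loads("""
{"categories":[{"name":"outdoor_water","score":0.9921875}],
 "tags":[{"name":"nature","confidence":0.999956488609314},
         {"name":"water","confidence":0.99656963348388672},
         {"name":"rocky","confidence":0.82317483425140381},
         {"name":"hillside","confidence":0.314908504486084}],
 "description":{"captions":[{"text":"a large waterfall over a rocky cliff",
                             "confidence":0.91645835840611234}]},
 "objects":[{"rectangle":{"x":25,"y":132,"w":954,"h":776},
             "object":"fountain","confidence":0.561}]}
""")

summary = summarize(sample)
print(summary["caption"])  # a large waterfall over a rocky cliff
print(summary["tags"])     # ['nature', 'water']
```

To apply this to a saved file, load it with json.load(open("0001.txt", encoding="utf-8")) instead of the embedded sample.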

Now you can change the BAT file to process image sequences (with a pause of 3 seconds between requests, because of the limit of 20 images per minute).
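
As an alternative to editing the BAT file by hand, the same loop can be sketched in Python with only the standard library. The endpoint, headers, and request body are copied from the curl command in this post. The function names analyze_url and process_sequence and the 0001.txt, 0002.txt, ... naming are assumptions for illustration. Unlike curl -k, urllib verifies the server certificate.

```python
import json
import time
import urllib.request

# Endpoint and query string copied from the curl command in this post.
ANALYZE_URL = (
    "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze"
    "?visualFeatures=Categories,Description,Objects,Tags&details=Landmarks&language=en"
)


def analyze_url(image_url, key):
    """Send one image URL to the API and return the raw JSON text of the reply."""
    request = urllib.request.Request(
        ANALYZE_URL,
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def process_sequence(image_urls, analyze, pause_seconds=3.0, sleep=time.sleep):
    """Analyze each URL in turn, pausing between consecutive calls.

    Returns a dict mapping output names (0001.txt, 0002.txt, ...) to results.
    """
    results = {}
    for index, url in enumerate(image_urls, start=1):
        if index > 1:
            sleep(pause_seconds)  # throttle between consecutive requests
        results[f"{index:04d}.txt"] = analyze(url)
    return results


# Demonstration with a stand-in for the network call, so nothing is sent.
def fake_analyze(url):
    return json.dumps({"requested": url})


demo = process_sequence(
    ["http://example.com/a.jpg", "http://example.com/b.jpg"],
    fake_analyze,
    pause_seconds=0,
)
print(sorted(demo))  # ['0001.txt', '0002.txt']
```

For real requests, pass lambda u: analyze_url(u, "<your_key_here>") as the analyze argument and write each returned text to the file named by its key.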

Note: to apply this to your own images, you need to upload them to some web hosting, because the command passes the image by URL.
I use jino.ru: I upload the files there by FTP (Total Commander, Ctrl+F) and then use their URLs.
