IBM Watson AI service details

IBM Watson is a powerful online AI service that performs image recognition, text translation and understanding, speech-to-text, and text-to-speech conversion.

IBM Watson also lets you create and use customizable models for its algorithms.
The information below was copied from the IBM Watson site and is given here for quick reference purposes.

Topics:
  1. Visual Recognition
    - Classify an image with the built-in or custom classifiers
    - Detect faces in an image  
  2. Language Translator
    - Translate  
    - Identify language
    - Translate document
  3. Natural Language Classifier
    - Classify a phrase
  4. Natural Language Understanding
    - Categories
    - Concepts
    - Emotion
    - Entities
    - Keywords
    - Metadata
    - Relations
    - Semantic roles
    - Sentiment
  5. Personality Insights
    - Get profile
  6. Speech to Text
    - Recognize audio
  7. Text to Speech
    - Synthesize audio
    - Get pronunciation
  8. Tone Analyzer
    - Analyze general tone
    - Analyze customer engagement tone

1. Visual Recognition

Quickly and accurately tag, classify and train visual content using machine learning.

Links:

Classify an image with the built-in or custom classifiers

The response includes the classes identified in the image from the built-in General model ("classifier_id": "default") and a confidence score for each class. The score represents a percentage, and higher values represent higher confidences. By default, responses from the Classify calls don't include classes with a score below 0.5 (50%).

classifier_ids:
    • default: Returns classes from thousands of general tags.
    • food: Enhances specificity and accuracy for images of food items.
    • explicit: Evaluates whether the image might be pornographic.
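For reference, here is a minimal sketch of what a Classify call might look like from Python with the requests library. API_KEY and SERVICE_URL are placeholders for the IAM API key and instance URL from your service credentials, the image file name is made up, and the version date is only an example of the form the service expects:

    import requests

    API_KEY = "your-iam-api-key"          # placeholder
    SERVICE_URL = "your-instance-url"     # placeholder, from the service credentials

    with open("fruit.jpg", "rb") as image:  # hypothetical image file
        response = requests.post(
            SERVICE_URL + "/v3/classify",
            params={"version": "2018-03-19",           # example version date
                    "classifier_ids": "default,food",
                    "threshold": "0.5"},
            files={"images_file": image},
            auth=("apikey", API_KEY),
        )
    print(response.json())  # classes with confidence scores per classifier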

Detect faces in an image

Analyze and get data about faces in images. Responses can include estimated age and gender. This feature uses a built-in model, so no training is necessary. The Detect faces method does not support general biometric facial recognition.
Supported image formats include .gif, .jpg, .png, and .tif. The maximum image size is 10 MB. The minimum recommended pixel density is 32 x 32 pixels, but the service tends to perform better with images that are at least 224 x 224 pixels.
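A similar sketch for face detection, with the same placeholder credentials and a made-up image file; the /v3/detect_faces path and the version date are my assumptions about the v3 API:

    import requests

    API_KEY = "your-iam-api-key"          # placeholder
    SERVICE_URL = "your-instance-url"     # placeholder

    with open("group_photo.jpg", "rb") as image:  # hypothetical image file
        response = requests.post(
            SERVICE_URL + "/v3/detect_faces",
            params={"version": "2018-03-19"},  # example version date
            files={"images_file": image},
            auth=("apikey", API_KEY),
        )
    print(response.json())  # estimated age range and gender per detected face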

2. Language Translator

Translate text from one language to another. Take news from across the globe and present it in your language, communicate with your customers in their own language, and more.

Links:

Translate

Translates the input text from the source language to the target language.
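A rough Python sketch of a translate request, assuming the /v3/translate endpoint and a prebuilt English-to-Spanish model; credentials and the version date are placeholders as before:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.post(
        SERVICE_URL + "/v3/translate",
        params={"version": "2018-05-01"},                      # example version date
        json={"text": ["Hello, world"], "model_id": "en-es"},  # assumed model ID
        auth=("apikey", API_KEY),
    )
    print(response.json())  # translated text plus character counts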

Identify language

Identifies the language of the input text.
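Language identification takes the raw text as the request body. A sketch under the same assumptions:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.post(
        SERVICE_URL + "/v3/identify",
        params={"version": "2018-05-01"},            # example version date
        headers={"Content-Type": "text/plain"},
        data="Bonjour tout le monde".encode("utf-8"),
        auth=("apikey", API_KEY),
    )
    print(response.json())  # candidate languages with confidence scores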

Translate document

Submit a document for translation. You can submit the document contents in the file parameter, or you can reference a previously submitted document by document ID.

3. Natural Language Classifier

Text classification made easy. Use machine learning to analyze text, and label and organize data into custom categories.
IBM Watson™ Natural Language Classifier uses machine learning algorithms to return the top matching predefined classes for short text input. You create and train a classifier to connect predefined classes to example texts so that the service can apply those classes to new inputs.

Links:

Classify a phrase

Returns label information for the input. The status must be Available before you can use the classifier to classify text.
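A sketch of a classify request, assuming you substitute the ID of an already trained classifier for CLASSIFIER_ID; the endpoint path and credentials are placeholders as in the earlier examples:

    import requests

    API_KEY = "your-iam-api-key"          # placeholder
    SERVICE_URL = "your-instance-url"     # placeholder
    CLASSIFIER_ID = "your-classifier-id"  # ID of a trained classifier (status: Available)

    response = requests.post(
        SERVICE_URL + "/v1/classifiers/" + CLASSIFIER_ID + "/classify",
        json={"text": "How hot will it be today?"},  # hypothetical input phrase
        auth=("apikey", API_KEY),
    )
    print(response.json())  # top class plus ranked list of classes with confidences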

4. Natural Language Understanding

Natural language processing for advanced text analysis.

Links:

Categories

Returns a five-level taxonomy of the content. The top three categories are returned.

Concepts

Returns high-level concepts in the content. For example, a research paper about deep learning might return the concept "Artificial Intelligence" even though the term is not mentioned.

Emotion 

Detects anger, disgust, fear, joy, or sadness that is conveyed in the content or by the context around target phrases specified in the targets parameter. You can analyze emotion for detected entities with entities.emotion and for keywords with keywords.emotion.

Entities

Identifies people, cities, organizations, and other entities in the content. See Entity types and subtypes.

Keywords 

Returns important keywords in the content.

Metadata 

Returns information from the document, including author name, title, RSS/ATOM feeds, prominent page image, and publication date. Supports URL and HTML input types only.

Relations 

Recognizes when two entities are related and identifies the type of relation. For example, an awardedTo relation might connect the entities "Nobel Prize" and "Albert Einstein".

Semantic roles 

Parses sentences into subject, action, and object form.

Sentiment

Analyzes the general sentiment of your content or the sentiment toward specific target phrases. You can analyze sentiment for detected entities with entities.sentiment and for keywords with keywords.sentiment.
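All of the features above are requested through a single analyze call by listing them in a features object. A minimal Python sketch, with the same placeholder credentials, an example version date, and made-up input text:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.post(
        SERVICE_URL + "/v1/analyze",
        params={"version": "2019-07-12"},  # example version date
        json={
            "text": "IBM is headquartered in Armonk, New York.",  # hypothetical input
            "features": {
                "entities": {"sentiment": True},   # entities.sentiment
                "keywords": {"emotion": True},     # keywords.emotion
                "categories": {},
                "sentiment": {},
            },
        },
        auth=("apikey", API_KEY),
    )
    print(response.json())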

5. Personality Insights

Predict personality characteristics, needs and values through written text. Understand your customers' habits and preferences on an individual level, and at scale.
The IBM Watson™ Personality Insights service enables applications to derive insights from social media, enterprise data, or other digital communications. The service uses linguistic analytics to infer individuals' intrinsic personality characteristics, including Big Five, Needs, and Values, from digital communications such as email, text messages, tweets, and forum posts.

Links:

Get profile

Generates a personality profile for the author of the input text. The service accepts a maximum of 20 MB of input content, but it requires much less text to produce an accurate profile. The service can analyze text in Arabic, English, Japanese, Korean, or Spanish. It can return its results in a variety of languages.
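A sketch of a profile request that posts plain text and asks for a JSON profile back; the text file name is made up, and the endpoint path and version date are assumptions:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    with open("writing_sample.txt", "rb") as sample:  # hypothetical text sample
        response = requests.post(
            SERVICE_URL + "/v3/profile",
            params={"version": "2017-10-13"},  # example version date
            headers={"Content-Type": "text/plain;charset=utf-8",
                     "Accept": "application/json"},
            data=sample,
            auth=("apikey", API_KEY),
        )
    print(response.json())  # Big Five, Needs, and Values scores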

6. Speech to Text

Easily convert audio and voice into written text for quick understanding of content.
The IBM® Speech to Text service provides APIs that use IBM's speech-recognition capabilities to produce transcripts of spoken audio. The service can transcribe speech from various languages and audio formats. In addition to basic transcription, the service can produce detailed information about many different aspects of the audio. For most languages, the service supports two sampling rates, broadband and narrowband. It returns all JSON response content in the UTF-8 character set.

Links:

Recognize audio

Sends audio and returns transcription results for a recognition request. You can pass a maximum of 100 MB and a minimum of 100 bytes of audio with a request. The service automatically detects the endianness of the incoming audio and, for audio that includes multiple channels, downmixes the audio to one-channel mono during transcoding. The method returns only final results; to enable interim results, use the WebSocket API. (With the curl command, use the --data-binary option to upload the file for the request.)
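A sketch of a basic HTTP recognition request that streams a WAV file; the model name, audio file name, and credentials are placeholders:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    with open("audio.wav", "rb") as audio:  # hypothetical audio file
        response = requests.post(
            SERVICE_URL + "/v1/recognize",
            params={"model": "en-US_BroadbandModel"},  # assumed broadband English model
            headers={"Content-Type": "audio/wav"},
            data=audio,
            auth=("apikey", API_KEY),
        )
    print(response.json())  # final transcript with confidence scores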

7. Text to Speech

Convert written text into natural-sounding audio in a variety of languages and voices.
The IBM® Text to Speech service provides APIs that use IBM's speech-synthesis capabilities to synthesize text into natural-sounding speech in a variety of languages, dialects, and voices. The service supports at least one male or female voice, sometimes both, for each language. The audio is streamed back to the client with minimal delay.

Links:

Synthesize audio

Synthesizes text to audio that is spoken in the specified voice. The service bases its understanding of the language for the input text on the specified voice. Use a voice that matches the language of the input text.
The method accepts a maximum of 8 KB of input, which includes the input text and the URL and headers. The 8 KB limit includes any SSML tags that you specify. The service returns the synthesized audio stream as an array of bytes.
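A sketch of a synthesize request that writes the returned audio stream to a file; the voice name, output file name, and credentials are placeholders:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.post(
        SERVICE_URL + "/v1/synthesize",
        params={"voice": "en-US_AllisonVoice"},  # assumed voice name
        headers={"Accept": "audio/wav"},
        json={"text": "Hello from Watson"},
        auth=("apikey", API_KEY),
    )
    with open("hello.wav", "wb") as out:
        out.write(response.content)  # raw synthesized audio bytes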

Get pronunciation

Gets the phonetic pronunciation for the specified word. You can request the pronunciation for a specific format. You can also request the pronunciation for a specific voice to see the default translation for the language of that voice or for a specific custom voice model to see the translation for that voice model.
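A sketch of a pronunciation request; the query parameters shown (text, voice, format) are my assumption of the GET interface, and the voice name is a placeholder:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.get(
        SERVICE_URL + "/v1/pronunciation",
        params={"text": "IEEE",
                "voice": "en-US_AllisonVoice",  # assumed voice name
                "format": "ipa"},               # assumed phoneme format value
        auth=("apikey", API_KEY),
    )
    print(response.json())  # phonetic pronunciation of the word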

8. Tone Analyzer

Understand emotions and communication style in text.
The IBM Watson™ Tone Analyzer service uses linguistic analysis to detect emotional and language tones in written text. The service can analyze tone at both the document and sentence levels. You can use the service to understand how your written communications are perceived and then to improve the tone of your communications. Businesses can use the service to learn the tone of their customers' communications and to respond to each customer appropriately, or to understand and improve their customer conversations.

Links:

Analyze general tone

Use the general purpose endpoint to analyze the tone of your input content. The service analyzes the content for emotional and language tones. The method always analyzes the tone of the full document; by default, it also analyzes the tone of each individual sentence of the content.
You can submit no more than 128 KB of total input content and no more than 1000 individual sentences in plain text format. The service analyzes the first 1000 sentences for document-level analysis and only the first 100 sentences for sentence-level analysis.
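A sketch of a general-purpose tone request with plain JSON text input; the version date, input text, and credentials are placeholders:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.post(
        SERVICE_URL + "/v3/tone",
        params={"version": "2017-09-21"},  # example version date
        json={"text": "I am very happy with the product, but shipping took far too long."},
        auth=("apikey", API_KEY),
    )
    print(response.json())  # document-level and sentence-level tones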

Analyze customer engagement tone

Use the customer engagement endpoint to analyze the tone of customer service and customer support conversations. For each utterance of a conversation, the method reports the most prevalent subset of the following seven tones: sad, frustrated, satisfied, excited, polite, impolite, and sympathetic.
If you submit more than 50 utterances, the service returns a warning for the overall content and analyzes only the first 50 utterances. If you submit a single utterance that contains more than 500 characters, the service returns an error for that utterance and does not analyze the utterance. The request fails if all utterances have more than 500 characters. Per the JSON specification, the default character encoding for JSON content is effectively always UTF-8.
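For the customer engagement endpoint the input is a list of utterances, each with its text and speaker. A sketch, assuming a /v3/tone_chat path and the same placeholder credentials:

    import requests

    API_KEY = "your-iam-api-key"       # placeholder
    SERVICE_URL = "your-instance-url"  # placeholder

    response = requests.post(
        SERVICE_URL + "/v3/tone_chat",
        params={"version": "2017-09-21"},  # example version date
        json={"utterances": [
            {"text": "My package still has not arrived!", "user": "customer"},
            {"text": "I am sorry to hear that, let me check the status.", "user": "agent"},
        ]},
        auth=("apikey", API_KEY),
    )
    print(response.json())  # prevalent tones per utterance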

 
