Node.js - Setting Up & Using Google Cloud Video Intelligence API

Do you need to annotate the content of a video automatically? Let's say you have a service that allows users to upload videos and you want to know the content of each video. It would take a lot of time and effort to do this manually by watching every video. Fortunately, there are services that can annotate videos and extract metadata. By doing so, it becomes possible to make the videos searchable.

If you need that kind of service, you can consider Google Cloud Video Intelligence. It annotates a video with multiple labels drawn from a library of 20,000 labels. It can extract metadata for indexing your video content, so that you can easily organize and search your videos. Other features include shot detection to distinguish scene changes and integration with Google Cloud Storage.

Preparation

1. Create or select a Google Cloud project

A Google Cloud project is required to use this service. Open the Google Cloud console, then create a new project or select an existing one.

2. Enable billing for the project

Like other cloud platforms, Google requires you to enable billing for your project. If you haven't set up billing yet, open the billing page.

3. Enable Video Intelligence API

To use an API, you must enable it first. Open this page to enable Video Intelligence API.

4. Set up service account for authentication

For authentication, you need a service account. Create a new one on the service account management page and download its credentials, or use a service account you have already created.

In your .env file, add a new variable, because the library we are going to use needs it.

GOOGLE_APPLICATION_CREDENTIALS=/path/to/the/credentials

In addition, add GOOGLE_CLOUD_PROJECT_ID to your .env file as well.
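To catch configuration mistakes early, you can verify that both variables are set before creating the client. Below is a minimal sketch; the helper name requireEnv is our own, not part of any library, and it assumes dotenv has already been loaded via require('dotenv').config().

```javascript
// Reads an environment variable and fails fast with a clear message
// if it is missing or empty, instead of letting the client fail later
// with a less obvious error.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

You could then call requireEnv('GOOGLE_CLOUD_PROJECT_ID') and requireEnv('GOOGLE_APPLICATION_CREDENTIALS') at startup so a typo in .env surfaces immediately.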

Dependencies

This tutorial uses @google-cloud/video-intelligence and also dotenv for loading environment variables. Add the following dependencies to your package.json and run npm install.

  "@google-cloud/video-intelligence": "~1.5.0",
  "dotenv": "~4.0.0"

Code

Below is a code example of how to annotate a video with the Google Cloud Video Intelligence API. The video to be analyzed must be uploaded to Google Cloud Storage first. You can read our tutorial about how to upload a file to Google Cloud Storage using Node.js.

  // Loads environment variables
  require('dotenv').config();
  
  // Imports the Google Cloud Video Intelligence library
  const videoIntelligence = require('@google-cloud/video-intelligence');
  
  // Creates a client
  const client = new videoIntelligence.VideoIntelligenceServiceClient({
    projectId: process.env.GOOGLE_CLOUD_PROJECT_ID,
  });
  
  // URI of the video you want to analyze
  const gcsUri = 'gs://{YOUR_BUCKET_NAME}/{PATH_TO_FILE}';
  
  // Request config
  const request = {
    inputUri: gcsUri,
    features: ['LABEL_DETECTION'],
  };
  
  // Execute request
  client
    .annotateVideo(request)
    .then(results => {
      console.log('Waiting for service to analyze the video. This may take a few minutes.');
  
      return results[0].promise();
    })
    .then(results => {
      console.log(JSON.stringify(results, null, 2));
      // Gets annotations for video
      const annotations = results[0].annotationResults[0];
  
      // Gets labels for video from its annotations
      const labels = annotations.segmentLabelAnnotations;
      labels.forEach(label => {
        console.log(`Label ${label.entity.description} occurs at:`);
        label.segments.forEach(segment => {
          const _segment = segment.segment;
  
          _segment.startTimeOffset.seconds = _segment.startTimeOffset.seconds || 0;
          _segment.startTimeOffset.nanos = _segment.startTimeOffset.nanos || 0;
          _segment.endTimeOffset.seconds = _segment.endTimeOffset.seconds || 0;
          _segment.endTimeOffset.nanos = _segment.endTimeOffset.nanos || 0;
  
          console.log(
            `\tStart: ${_segment.startTimeOffset.seconds}` +
              `.${(_segment.startTimeOffset.nanos / 1e6).toFixed(0)}s`
          );
  
          console.log(
            `\tEnd: ${_segment.endTimeOffset.seconds}.` +
              `${(_segment.endTimeOffset.nanos / 1e6).toFixed(0)}s`
          );
  
          console.log(`\tConfidence level: ${segment.confidence}`);
        });
      });
    })
    .catch(err => {
      console.error(`ERROR: ${err}`);
    });


It may take a few minutes to get the annotation results, depending on video length. annotateVideo resolves quickly to an array whose first element is a long-running operation object. To wait until processing is done, call results[0].promise(), which resolves when the annotation completes. Meanwhile, you can add a console.log to show that the annotation is in progress.
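Because startTimeOffset and endTimeOffset may omit seconds or nanos when their value is zero, formatting them is slightly fiddly. A small pure helper can replace the in-place normalization in the code above; formatOffset is our own hypothetical name, not part of the library.

```javascript
// Converts a { seconds, nanos } time offset into a readable string
// such as "269.720s". Either field may be missing when its value is
// zero, so both default to 0.
function formatOffset(offset = {}) {
  const seconds = Number(offset.seconds || 0);
  const millis = ((offset.nanos || 0) / 1e6).toFixed(0);
  return `${seconds}.${millis}s`;
}
```

With this helper, `console.log(`\tStart: ${formatOffset(_segment.startTimeOffset)}`)` produces the same output without mutating the API response.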

Below is the result format. It's an array of three objects. The first object contains the annotation results; this is what we need to parse in order to understand what the video is about. The second object contains the progress percentage, the execution start time, and the time the progress was last updated. The third object is the final API response, empty here.

  [  
     {  
        "annotationResults":[  
           { }
        ]
     },
     {  
        "annotationProgress":[  
           {  
              "inputUri":"/{YOUR_BUCKET_NAME}/{PATH_TO_FILE}",
              "progressPercent":100,
              "startTime":{  
                 "seconds":"1546439976",
                 "nanos":559663000
              },
              "updateTime":{  
                 "seconds":"1546440001",
                 "nanos":104220000
              }
           }
        ]
     },
     { }
  ]
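The startTime and updateTime timestamps in annotationProgress can be used to work out how long the analysis took. A minimal sketch, assuming the { seconds, nanos } shape shown above; elapsedSeconds is our own helper name:

```javascript
// Computes the elapsed time in seconds between two protobuf-style
// timestamps of the form { seconds, nanos }. The seconds field may
// arrive as a string, as in the JSON output above.
function elapsedSeconds(start, end) {
  const seconds = Number(end.seconds) - Number(start.seconds);
  const nanos = (end.nanos || 0) - (start.nanos || 0);
  return seconds + nanos / 1e9;
}

const progress = {
  startTime: { seconds: '1546439976', nanos: 559663000 },
  updateTime: { seconds: '1546440001', nanos: 104220000 },
};

console.log(
  `Analysis took ${elapsedSeconds(progress.startTime, progress.updateTime).toFixed(1)}s`
);
```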

segmentLabelAnnotations, inside the first object of annotationResults, is an array whose elements look like this

  {
    "entity": {
      "entityId": "/m/01350r",
      "description": "performance art",
      "languageCode": "en-US"
    },
    "categoryEntities": [
      {
        "entityId": "/m/02jjt",
        "description": "entertainment",
        "languageCode": "en-US"
      }
    ],
    "segments": [
      {
        "segment": {
          "startTimeOffset": {},
          "endTimeOffset": {
            "seconds": "269",
            "nanos": 720000000
          }
        },
        "confidence": 0.666665256023407
      }
    ]
  }

Each of these objects represents a label, along with the video segments that justify why the label was given. Each segment also carries a confidence value that shows how confident the service is in applying the label to that segment.
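You can use that confidence value to discard weak labels. The sketch below filters segmentLabelAnnotations, keeping only labels with at least one sufficiently confident segment; the function name confidentLabels and the 0.5 threshold are our own choices, not anything the API prescribes.

```javascript
// Returns the descriptions of labels that have at least one segment
// whose confidence meets the given threshold.
function confidentLabels(labels, threshold = 0.5) {
  return labels
    .filter(label =>
      label.segments.some(segment => segment.confidence >= threshold)
    )
    .map(label => label.entity.description);
}

// Sample data in the shape of segmentLabelAnnotations shown above.
const labels = [
  {
    entity: { description: 'performance art' },
    segments: [{ confidence: 0.666665256023407 }],
  },
  {
    entity: { description: 'crowd' },
    segments: [{ confidence: 0.31 }],
  },
];

console.log(confidentLabels(labels));
```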

That's how to use Google Cloud Video Intelligence in Node.js. If you want to analyze images, you can read the tutorial about how to use Google Cloud Vision in Node.js.