Optimize Video Content With Google Cloud Video Intelligence API

SADA Says | Cloud Computing Blog


Is your company getting the most value out of its video libraries? Or is your video archive sitting idle on a server, incurring fixed costs without generating extra revenue? With the Google Cloud Video Intelligence API, companies can now create smarter user experiences, new revenue opportunities and nuanced value-adds at an affordable cost.

In the fierce competition between public cloud providers, Google has always been a step ahead of the rest in ML and AI. The Cloud Video Intelligence API is no different – trained on a vast arsenal of content (ahem, have you heard of YouTube?), the off-the-shelf API uses ML to accurately label objects in video, detect scene changes, transcribe speech and filter out explicit content in a snap. Previously herculean tasks of manual labor can now be accomplished in seconds with a simple pipeline built from Cloud Video Intelligence, Cloud Storage, Cloud Pub/Sub and Cloud Functions.

The Use Cases of Video Intelligence – Content Provider or Submission Receiver?

For companies dealing with mountains of user-generated video content, the Cloud Video Intelligence API can help a company better understand, query and analyze submitted videos. User testimonials, video comments or reviews can further be transcribed (at a fraction of the cost of manual transcription) and categorized by sentiment using Google's Cloud Natural Language API. In more niche use cases, companies processing contest submissions, casting videos or video resumes can filter out unwanted submissions before passing a shortlist to human reviewers.
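As a rough sketch of that transcription-plus-sentiment flow: the snippet below requests speech transcription from the Video Intelligence API and scores the transcript with the Cloud Natural Language API. The GCS URI, the 0.25 sentiment threshold, and the bucket names for the categories are illustrative assumptions, not values from this article.

```python
def sentiment_bucket(score, threshold=0.25):
    """Map a Natural Language sentiment score (-1.0..1.0) to a coarse category.
    The 0.25 threshold is an assumption; tune it for your content."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

def transcribe_and_score(gcs_uri):
    """Transcribe a video's audio track and bucket the overall sentiment."""
    # Imported inside the function so the pure helper above stays usable
    # without the GCP client libraries installed.
    from google.cloud import videointelligence, language_v1

    video_client = videointelligence.VideoIntelligenceServiceClient()
    operation = video_client.annotate_video(
        request={
            "input_uri": gcs_uri,  # e.g. "gs://my-bucket/testimonial.mp4" (hypothetical)
            "features": [videointelligence.Feature.SPEECH_TRANSCRIPTION],
            "video_context": videointelligence.VideoContext(
                speech_transcription_config=videointelligence.SpeechTranscriptionConfig(
                    language_code="en-US"
                )
            ),
        }
    )
    result = operation.result(timeout=600)
    transcript = " ".join(
        t.alternatives[0].transcript
        for t in result.annotation_results[0].speech_transcriptions
        if t.alternatives
    )

    lang_client = language_v1.LanguageServiceClient()
    doc = language_v1.Document(
        content=transcript, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = lang_client.analyze_sentiment(
        request={"document": doc}
    ).document_sentiment
    return transcript, sentiment_bucket(sentiment.score)
```

Keeping the bucketing logic in a pure function makes it easy to unit-test the triage rules separately from the API calls.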

Alternatively, media companies that present content to viewers can provide better recommendations and more timely ad placements for finely tuned user experiences. By harnessing label detection along with video-level annotations, companies can also better accommodate user searches for specific content beyond scant metadata available on the file. Using shot detection, companies can place ads at more timely and less interruptive intervals within a clip, such as at the end of a long continuous segment. What’s more, marketers can insert ads that are relevant to labels within the video.
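The shot-based ad-placement idea above can be sketched with pure Python over the API's shot annotations (converted from protobuf time offsets to seconds). The minimum-shot-length and minimum-gap thresholds are illustrative assumptions, not API defaults.

```python
def ad_insertion_points(shots, min_shot_seconds=30.0, min_gap_seconds=120.0):
    """Pick ad-break timestamps from shot boundaries.

    shots: list of (start_seconds, end_seconds) tuples, as extracted from
    SHOT_CHANGE_DETECTION results. Favors the end of long continuous shots,
    spaced at least min_gap_seconds apart, so breaks feel less interruptive.
    """
    points = []
    last_break = 0.0
    for start, end in shots:
        long_enough = (end - start) >= min_shot_seconds
        spaced_out = (end - last_break) >= min_gap_seconds
        if long_enough and spaced_out:
            points.append(end)
            last_break = end
    return points

# Hypothetical shot boundaries (seconds) for a clip:
shots = [(0.0, 45.0), (45.0, 50.0), (50.0, 200.0), (200.0, 210.0), (210.0, 400.0)]
print(ad_insertion_points(shots))  # -> [200.0, 400.0]
```

The first 45-second shot is skipped because it falls inside the two-minute spacing window; the two long later shots each yield a break point at their end.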

The MIT Technology Review recently wrote that understanding dynamic actions in video is the next big step for ML software. While the Cloud Video Intelligence API might sound a little like its image counterpart, the Cloud Vision API, this service goes a step further to discern higher-level meaning for a scene and the video as a whole. For example, in a video that shows a car (and yes, the API can tell you the make and model of the car) along with a much later frame of a checkered flag, the API will describe the scene as a “race”. This kind of cross-frame annotation is not possible by running the Vision API frame by frame on its own. It’s also worth noting that the service detects both nouns and verbs, presenting labels to the end user tiered by confidence level. Google continues to expand its label set to create a more nuanced, dynamic understanding of scenes – as evidenced by its updated label detection released in the summer of 2018.
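The confidence-tiered presentation mentioned above can be sketched as a small helper that groups label annotations by score. The tier cut-offs (0.9 and 0.7) and the sample labels are illustrative assumptions; the API itself returns raw confidence values between 0 and 1.

```python
def tier_labels(labels, high=0.9, medium=0.7):
    """Group (description, confidence) pairs into confidence tiers.

    labels: pairs extracted from segment_label_annotations, e.g.
    (annotation.entity.description, annotation.segments[0].confidence).
    Returns a dict with "high", "medium", and "low" lists of descriptions.
    """
    tiers = {"high": [], "medium": [], "low": []}
    for description, confidence in labels:
        if confidence >= high:
            tiers["high"].append(description)
        elif confidence >= medium:
            tiers["medium"].append(description)
        else:
            tiers["low"].append(description)
    return tiers

# Hypothetical labels for the race example above:
labels = [("race", 0.95), ("car", 0.92), ("motorsport", 0.78), ("flag", 0.55)]
print(tier_labels(labels))
# -> {'high': ['race', 'car'], 'medium': ['motorsport'], 'low': ['flag']}
```

A UI can then surface high-confidence labels by default and hold lower tiers behind a "show more" control.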

Processing Video Content is Easy and Serverless with Google Cloud Functions

Like other ML APIs available on Google Cloud, the Cloud Video Intelligence API comes pre-trained, thoroughly documented and available as a REST API. Google’s documentation even allows you to try the API on test videos to get a feel for the type of labels and accuracy the service generates. And if you’re worried about the difficulty of implementation, there is no need to fret. As long as videos are stored on Google Cloud Storage (an essential requirement of the service), a Cloud Pub/Sub message can be sent to a topic every time a new video is uploaded. This can trigger an extremely lightweight Cloud Function that calls the Video Intelligence API and then records the post-processing JSON output to Google BigQuery for storage and analysis. It really is that easy.
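That pipeline can be sketched as a single function: a Cloud Storage-triggered Cloud Function that requests label detection and streams the results into BigQuery. The `video_analysis.labels` dataset/table name and the event payload fields are illustrative assumptions; adapt them to your project.

```python
def label_rows(video_uri, labels):
    """Pure helper: flatten (description, confidence) pairs into BigQuery rows."""
    return [
        {"video_uri": video_uri, "label": d, "confidence": round(c, 4)}
        for d, c in labels
    ]

def process_video(event, context):
    """Entry point for a Cloud Storage-triggered Cloud Function.

    event carries the bucket and object name of the newly uploaded video.
    """
    # Imported inside the function so label_rows stays usable without
    # the GCP client libraries installed.
    from google.cloud import videointelligence, bigquery

    uri = f"gs://{event['bucket']}/{event['name']}"
    client = videointelligence.VideoIntelligenceServiceClient()
    operation = client.annotate_video(
        request={
            "input_uri": uri,
            "features": [videointelligence.Feature.LABEL_DETECTION],
        }
    )
    result = operation.result(timeout=600)

    labels = [
        (ann.entity.description, ann.segments[0].confidence)
        for ann in result.annotation_results[0].segment_label_annotations
    ]
    bq = bigquery.Client()
    # "video_analysis.labels" is a hypothetical dataset.table in your project.
    errors = bq.insert_rows_json("video_analysis.labels", label_rows(uri, labels))
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```

Separating the row-building logic from the I/O keeps the function easy to test and keeps the Cloud Function itself small enough to stay well within serverless limits.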

This data can then be consumed in a user-facing app – be it in Data Studio (to give analysts insight into the type of content being uploaded) or a content moderation interface that allows moderators a final call on flagged explicit content. And while the Video Intelligence API is currently offered only as a pre-trained API, Google’s recent announcement that AutoML has entered alpha signals a not-so-distant future where companies can customize their video ML flows.

Many companies view their video data as a liability – unstructured, unexplored and messy. With over 20,000 (and counting) labels available through the Cloud Video Intelligence API, what extra value could your company derive from existing content?

SADA’s team of experts can not only help you migrate your video library to Google Cloud Storage, but also create a content processing plan that strategizes for maximum value add from your video assets.


Our expert teams of consultants, architects, and solutions engineers are ready to help with your bold ambitions, provide you with more information on our services, and answer your technical questions. Contact us today to get started.
