Tech & Science

Google’s AI can now caption images almost as well as humans

James Walker

Published

September 23, 2016

Google’s image captioning software is part of its wider TensorFlow machine learning kit. Today, the company announced a new release of the algorithm with substantially improved performance. It produces more accurate and more detailed descriptions, enabling it to caption images almost as well as humans can.
In a blog post, Google provided some examples of images captioned by the new algorithm. They include “A person on a beach flying a kite” and “A man riding a wave on top of a surfboard.” The company also highlighted the accuracy improvements made since the last-generation technology. An image that the previous model captioned “A brown bear is swimming in the water” is now described as “Two brown bears sitting on top of rocks,” which matches the contents of the image more closely.
Google made the improvements by switching to the Inception V3 model for the image encoder. This gives the image captioning system an improved ability to recognise individual objects within images. In turn, this directly facilitates more detailed descriptions. The Inception V3 model achieves 93.9 percent accuracy on the ImageNet classification task.
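To make the accuracy figure more concrete, here is a minimal Python sketch of how a classification accuracy score can be computed. ImageNet results are commonly reported as top-5 accuracy, where a prediction counts as correct if the true label is among the model's five highest-scoring classes; the sketch assumes that is the kind of metric behind the number quoted above. The function name, the scores and the labels are all made up for illustration and do not come from Google's code.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for class_scores, true_label in zip(scores, labels):
        # Rank class indices from highest score to lowest and keep the first k.
        ranked = sorted(
            range(len(class_scores)),
            key=lambda i: class_scores[i],
            reverse=True,
        )
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Three toy "images" scored over six hypothetical classes (illustrative values only).
toy_scores = [
    [0.05, 0.10, 0.50, 0.20, 0.10, 0.05],  # true class 2 has the highest score
    [0.30, 0.25, 0.20, 0.15, 0.06, 0.04],  # true class 5 has the lowest score
    [0.10, 0.40, 0.05, 0.30, 0.10, 0.05],  # true class 3 has the second-highest score
]
toy_labels = [2, 5, 3]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top5 = top_k_accuracy(toy_scores, toy_labels, k=5)
print(f"top-1 accuracy: {top1:.2f}, top-5 accuracy: {top5:.2f}")
```

With only six classes the top-5 test is very forgiving; on a benchmark with a thousand categories it is a much stricter measure.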

Google’s AI image captioning software

Google

The image model has also been fine-tuned to emphasise describing images rather than classifying them. Inception V3 is primarily aimed at grouping images into categories, rather than describing their contents. Google has optimised the model so it can turn “a dog, grass and a frisbee are in the image” into a natural language response including the colour of the grass and the position of the dog relative to the frisbee.
The final significant improvement has been made to the training process. Google trains the algorithm by feeding it hundreds of thousands of images that have been manually captioned by humans. The AI analyses each image to find out what’s in it and then associates the contents with the supplied caption. After the process is complete, it can begin to reuse elements of those captions to describe new images that are visually similar to those used in the training process.
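The data preparation behind this kind of training can be sketched in a few lines of Python. The sketch assumes a next-word-prediction setup, which is a common way to train caption generators but is not spelled out in the article: each human caption is split into steps where the model sees the words so far and must predict the next one. The file names, helper functions and marker tokens are hypothetical, and the two captions quoted earlier stand in for human-written ones.

```python
START, END = "<start>", "<end>"  # hypothetical marker tokens


def build_vocabulary(captions):
    """Map every distinct word in the captions (plus the markers) to an integer id."""
    words = sorted({word for caption in captions for word in caption.lower().split()})
    return {word: index for index, word in enumerate([START, END] + words)}


def next_word_pairs(caption):
    """Turn one caption into (words so far, next word) training examples."""
    tokens = [START] + caption.lower().split() + [END]
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Hypothetical image files paired with stand-in captions.
captioned_images = {
    "beach.jpg": "A person on a beach flying a kite",
    "surf.jpg": "A man riding a wave on top of a surfboard",
}

vocabulary = build_vocabulary(captioned_images.values())
pairs = next_word_pairs(captioned_images["beach.jpg"])

for words_so_far, next_word in pairs[:3]:
    print(words_so_far, "->", next_word)
```

In a real system each example would also carry the image features produced by the encoder, so the predicted word depends on what is in the picture as well as on the words already written.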

The upgrade has been a success, according to Google. The new image model reaches almost 94 percent accuracy on the ImageNet classification task, compared with the 89.6 percent obtained by Inception V1 in 2014.
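A quick calculation shows why a gain of a few percentage points matters. The short Python sketch below uses only the two accuracy figures quoted in this article and assumes both were measured on the same benchmark; the variable names are illustrative.

```python
old_accuracy = 89.6  # percent, Inception V1 figure quoted in the article
new_accuracy = 93.9  # percent, Inception V3 figure quoted in the article

# Error rate is simply the share of images that were classified incorrectly.
old_error = 100.0 - old_accuracy
new_error = 100.0 - new_accuracy

absolute_gain = new_accuracy - old_accuracy
relative_error_reduction = (old_error - new_error) / old_error

print(f"Accuracy gain: {absolute_gain:.1f} percentage points")
print(f"Error rate: {old_error:.1f}% -> {new_error:.1f}%")
print(f"Relative reduction in errors: {relative_error_reduction:.0%}")
```

Seen this way, the newer model makes roughly two-fifths fewer classification mistakes than the older one, which is a larger change than the raw accuracy figures suggest.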
“Excitingly, our model does indeed develop the ability to generate accurate new captions when presented with completely new scenes, indicating a deeper understanding of the objects and context in the images,” said Google. “Moreover, it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.”
This kind of image recognition technology will have a range of applications once perfected. It could help to improve the accuracy of image search results, or provide automatic screen reader descriptions of pictures when a website doesn’t provide them. The captions would also be useful in photo apps, including Google’s own Photos.
