Google’s AI can now caption images almost as well as humans

Google’s image captioning software is part of its wider TensorFlow machine learning kit. Today, the company announced a new release of the algorithm with substantially improved performance. It produces more accurate, more detailed descriptions, allowing it to caption images almost as well as a human would.
In a blog post, Google provided some examples of images captioned by the new algorithm. They include “A person on a beach flying a kite” and “A man riding a wave on top of a surfboard.” The company also highlighted the accuracy improvements made since the last-generation technology. An image that the older system captioned “A brown bear is swimming in the water” is now captioned “Two brown bears sitting on top of rocks,” a description that better matches the contents of the image.
Google has made the improvements by switching to the Inception V3 model for the image encoder. This gives the image captioning system an improved ability to recognise individual objects within images, which in turn allows for more detailed descriptions. The Inception V3 model achieves 93.9 percent accuracy on the ImageNet classification task.
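As a rough illustration of how a classification accuracy figure like this can be computed, the sketch below scores a classifier by checking whether the true label appears among its k highest-scoring classes. ImageNet results are often reported as top-5 accuracy, but the article does not say which variant the 93.9 percent refers to, so the choice of k is an assumption. The function name, scores and labels are all made up for the example and are not Google’s code or data.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Hypothetical classifier scores for 3 images over 4 classes (invented numbers).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.3, 0.1, 0.1, 0.5],
]
labels = [1, 2, 0]  # hypothetical true class index for each image

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

Allowing more guesses per image can only raise the score, which is why it matters which variant of the metric a headline figure uses.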

Google’s AI image captioning software
Google


The image model has also been fine-tuned to emphasise describing images rather than classifying them. Inception V3 is primarily aimed at grouping images into categories, rather than describing their contents. Google has optimised the model so it can turn “a dog, grass and a frisbee are in the image” into a natural language response including the colour of the grass and the position of the dog relative to the frisbee.
The final significant improvement has been made to the training process. Google trains the algorithm by feeding it hundreds of thousands of images that have been manually captioned by humans. The AI analyses each image to work out what’s in it and then associates the contents with the supplied caption. Once training is complete, it can reuse elements of those captions to describe images that are visually similar to the ones it was trained on.
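To make the idea of learning from human-captioned images more concrete, here is a minimal sketch of how image and caption pairs might be turned into training examples. It assumes the common framing in which a captioning model learns to predict the next word of a caption given the image and the words so far. This is not Google’s pipeline: the image IDs, helper names and the `<start>` and `<end>` tokens are hypothetical, and the two captions are the examples quoted earlier in the article.

```python
def build_vocab(captions):
    """Map every word seen in the captions to an integer ID."""
    vocab = {"<start>": 0, "<end>": 1}  # hypothetical marker tokens
    words = sorted({w for c in captions for w in c.lower().split()})
    for word in words:
        vocab[word] = len(vocab)
    return vocab


def next_word_examples(image_id, caption, vocab):
    """Turn one captioned image into (image, words so far, next word) triples."""
    tokens = ["<start>"] + caption.lower().split() + ["<end>"]
    ids = [vocab[t] for t in tokens]
    return [(image_id, ids[:i], ids[i]) for i in range(1, len(ids))]


# Hypothetical image IDs paired with human-written captions.
dataset = [
    ("img_001", "a person on a beach flying a kite"),
    ("img_002", "a man riding a wave on top of a surfboard"),
]

vocab = build_vocab([caption for _, caption in dataset])
examples = [
    example
    for image_id, caption in dataset
    for example in next_word_examples(image_id, caption, vocab)
]
print(len(vocab), "vocabulary entries,", len(examples), "training examples")
```

In a real system the image ID would be replaced by features from the image encoder, and a language model would be trained on these triples. The sketch only shows how the pairing of images and captions becomes training data.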

The upgrade has been a success, according to Google. The Inception V3 image model offers almost 94 percent accuracy on ImageNet, compared with the 89.6 percent obtained by Inception V1, which dates from 2014.
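For a sense of scale, the two accuracy figures can be compared as error rates. The short calculation below uses only the 89.6 and 93.9 percent figures quoted in the article, and the variable names are illustrative. It shows a gain of 4.3 percentage points, which corresponds to roughly a 41 percent cut in the error rate.

```python
v1_accuracy = 89.6  # percent, Inception V1 figure quoted in the article
v3_accuracy = 93.9  # percent, Inception V3 figure quoted in the article

v1_error = 100 - v1_accuracy
v3_error = 100 - v3_accuracy

# Absolute gain in percentage points, and the share of errors eliminated.
absolute_gain = v3_accuracy - v1_accuracy
relative_error_reduction = (v1_error - v3_error) / v1_error

print(f"Absolute gain: {absolute_gain:.1f} percentage points")
print(f"Relative error reduction: {relative_error_reduction:.1%}")
```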
“Excitingly, our model does indeed develop the ability to generate accurate new captions when presented with completely new scenes, indicating a deeper understanding of the objects and context in the images,” said Google. “Moreover, it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.”
Once perfected, this kind of image recognition technology could have a range of applications. It could help to improve the accuracy of image search results, or provide automatic screen reader descriptions of pictures when a website doesn’t supply them. The captions would also be useful in photo apps such as Google’s own Photos.
