The system was developed by researchers on the Google Brain team. It is based on a pixel recursive super resolution model that enhances pixelated, low-resolution images: it reduces blur, fills in detail and pieces together a plausible high-resolution version.
Google Brain uses two neural networks to create the output images. Working from an input file just 8×8 pixels in size, the system attempts to match the low-resolution source against existing high-resolution images. Each candidate high-resolution image is downscaled to 8×8 pixels and compared with the input to judge whether the two are likely to show the same thing.
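The downscale-and-compare step described above can be sketched in a few lines. This is a minimal illustration, not Google's implementation: it assumes grayscale images stored as NumPy arrays, shrinks a high-resolution image to 8×8 by block averaging, and uses mean squared error with an arbitrary threshold as the similarity check.

```python
import numpy as np

def downscale(image: np.ndarray, size: int = 8) -> np.ndarray:
    """Shrink a 2D grayscale image to size x size by averaging pixel blocks."""
    h, w = image.shape
    bh, bw = h // size, w // size  # block dimensions
    # Crop to a multiple of the block size, then average each block.
    return image[:size * bh, :size * bw].reshape(size, bh, size, bw).mean(axis=(1, 3))

def matches(low_res: np.ndarray, high_res: np.ndarray, threshold: float = 1.0) -> bool:
    """Judge whether a high-res image is plausibly the source of a low-res input."""
    mse = np.mean((downscale(high_res, low_res.shape[0]) - low_res) ** 2)
    return mse < threshold
```

A real system would compare many candidates and would learn the similarity measure rather than hard-coding a threshold, but the principle is the same: shrink, then compare at the input's resolution.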
The neural network then works back up toward the high-resolution image being matched. It adds detail based on pixel patterns previously "learned" by scanning large libraries of content. Traditional machine learning training methods can be used in this stage. The context of the high-resolution image serves as an indicator of whether the result is accurate.
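The generation stage can be illustrated with a toy sketch. This is not Google's model: where the real system uses a trained conditioning network and a PixelCNN-style prior, the code below stands in a nearest low-resolution pixel for the conditioning signal and the mean of already-generated neighbours for the learned prior, then samples each output pixel in raster order.

```python
import numpy as np

def generate(low_res: np.ndarray, size: int = 32, seed: int = 0) -> np.ndarray:
    """Toy pixel-by-pixel upscaler: each output pixel is sampled from a
    distribution mixing a conditioning signal (the matching low-res pixel)
    with a prior built from pixels generated so far. Illustrative only."""
    rng = np.random.default_rng(seed)
    out = np.zeros((size, size))
    scale = size // low_res.shape[0]
    for y in range(size):
        for x in range(size):
            cond = low_res[y // scale, x // scale]          # conditioning stand-in
            ctx = out[max(0, y - 1):y + 1, max(0, x - 1):x + 1]  # generated context
            prior = ctx.sum() / max(ctx.size - 1, 1)        # exclude current (zero) pixel
            out[y, x] = 0.5 * cond + 0.5 * prior + rng.normal(0, 1)
    return out
```

The key idea the sketch preserves is autoregression: each pixel is predicted from both the low-resolution input and everything generated before it, which is what lets the model invent plausible detail rather than merely interpolating.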
The system has seen some success when tested on human users. Google showed crowd-sourced workers two images, one upscaled by Brain and one taken directly from a camera, and asked them to identify which came from the camera. In the celebrity faces test, 10 percent of participants picked Brain's image as the real photo. This rose to 28 percent when the photo showed a bedroom.
Although Brain can produce realistic images, it can’t create a true photo. Without knowing what details were present in the scene, Brain has to guess what to add by looking at what the environment usually contains. In sample shots shown by Google, Brain’s rendition of a bedroom looks convincing but does not contain the same items as the real image.
“A super resolution model must account for the complex variations of objects, viewpoints, illumination, and occlusions, especially as the zoom factor increases,” Google said in its paper. “When some details do not exist in the source image, the challenge lies not only in ‘deblurring’ an image, but also in generating new image details that appear plausible to a human observer.”
The technology presented in crime dramas still isn't with us yet. In its current state, Brain is a long way from being useful in forensics. It guesses what to insert into the scene, preventing it from producing a true record of what was there. Although the guesses are educated, they can introduce significant inaccuracies into the final result.
Neural networks are advancing rapidly though, potentially enabling true "zoom and enhance" to be developed within the next few years. With some more work, Brain could be used to zoom in on writing in a photo's background. In the future, it may be able to build a more detailed version of a suspect's face captured on a blurry CCTV camera.