Google Brain fills in the details of very lo-res images

by Mark Tyson on 8 February 2017, 14:31

Tags: Google (NASDAQ:GOOG)

How often have you sighed when watching Hollywood crime fighters zoom in on a blurred, grainy or pixelated image to reveal sharper and clearer details? Some suspension of disbelief is necessary to enjoy sci-fi, but present-day crime dramas should stay clear…

Now the Google Brain team might be onto something similar to the above fantasy technology with its 'Pixel Recursive Super Resolution'. For a quick understanding of the model's capabilities, have a look at the image matrix and abstract below.

Somehow, almost incredibly, the pixel recursive super resolution model takes the 8x8 pixel inputs shown above (left column) and produces the 32x32 pixel samples shown alongside (middle column). The accuracy is rather good when compared against the 'ground truth' unpixelated original image in the final column.

The Google Brain team behind the model explains that previous models averaged the various plausible details, producing blurry, indistinct images that are little more useful than the pixelated input. In contrast, the pixel recursive super resolution model uses two neural net processes in concert to create a sharp and recognisable image. To create these images, the model must make hard decisions about the type of textures, shapes, and patterns present in different parts of an image.
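To see why averaging leads to blur, consider a single pixel that could plausibly be either dark or bright in the full resolution image. The toy sketch below is not the Google Brain code; the candidate values and the function names are made up purely for illustration. It contrasts averaging the candidates with committing to one of them.

```python
import random

# Hypothetical toy data: one edge pixel that is either dark (0) or bright (255)
# across the plausible high-res images matching a single low-res input.
plausible_values = [0, 255, 0, 255, 255, 0]


def mean_prediction(values):
    """Average all candidates, which is what minimising squared error favours."""
    return sum(values) / len(values)


def sampled_prediction(values, rng):
    """Commit to a single candidate, i.e. make a 'hard decision'."""
    return rng.choice(values)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

blurry = mean_prediction(plausible_values)          # 127.5, a grey no candidate has
sharp = sampled_prediction(plausible_values, rng)   # always one of 0 or 255

print("averaged pixel:", blurry)
print("sampled pixel:", sharp)
```

The averaged value is a mid-grey that appears in none of the candidates, which is the washed-out look described above, while the sampled value is always one of the sharp alternatives, even if it may not be the right one.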

First of all, the model maps the pixels in the lo-res sample to a similar high resolution one using a learned 'conditioning network' ResNet process. This narrows down the options for the next neural net process to work on. The second 'PixelCNN network' process looks to add detail to the pixelated image based upon similar source images with similar pixel locations. When both neural net processes have run, the image data is combined to provide the 32x32 samples in the main image above.
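As a rough illustration of how two network outputs might be combined for one pixel, the sketch below assumes each network produces a score (a logit) for every possible intensity, that the two sets of scores are added together, and that a softmax turns the sum into probabilities from which a value is drawn. That combination rule is an assumption for the purposes of this example, and the random numbers merely stand in for real outputs from the ResNet and PixelCNN networks; `combine_and_sample` is a hypothetical helper, not part of any Google code.

```python
import math
import random

NUM_LEVELS = 256  # possible intensities for one colour channel


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_and_sample(conditioning_logits, prior_logits, rng):
    """Add the two score vectors, normalise, and draw one intensity."""
    combined = [a + b for a, b in zip(conditioning_logits, prior_logits)]
    probs = softmax(combined)
    value = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return value, probs


rng = random.Random(0)

# Stand-ins for real network outputs (assumption: one logit per intensity).
conditioning_logits = [rng.gauss(0, 1) for _ in range(NUM_LEVELS)]
prior_logits = [rng.gauss(0, 1) for _ in range(NUM_LEVELS)]

value, probs = combine_and_sample(conditioning_logits, prior_logits, rng)
print("sampled intensity:", value)
```

In a full model this step would be repeated pixel by pixel, with the PixelCNN side seeing the pixels already generated, which is where the 'recursive' part of the name comes from.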

Currently, the 'Pixel Recursive Super Resolution' model has been trained to work on cropped celeb faces and hotel bedrooms.

Source: Google Brain research paper (PDF), via Engadget.



HEXUS Forums :: 12 Comments

So the CSI ‘computer enhance’ meme is starting to come true? That's something I never expected. Wonder if this will help with crimes - you couldn't use it as evidence but it could help with suspect wanted mug shots…
cheesemp
So the CSI ‘computer enhance’ meme is starting to come true? That's something I never expected. Wonder if this will help with crimes - you couldn't use it as evidence but it could help with suspect wanted mug shots…

No. It creates very “Hollywood” faces because it's creating believable composites from overlapping images in a database that match the angle. If the person looked like the image, it would be chance (and a relatively low chance, given that your average Joe/Jo is not photogenic).

In the same way, look at the images which involve windows. It's smart enough to track subtle gradation of light from right to left across an image, pick up a frame shape, and decide that this thing on the right is: a light source; a gentle enough light source to be a window; a size and shape consistent with a window. But in one of Google's shots, the window is a skylight cut into the slanting roof of an attic room. To replace the window, the software has to insert a composite window-ish element, and it has vertical frames like the ones in its database.
cheesemp
So the CSI ‘computer enhance’ meme is starting to come true? That's something I never expected. Wonder if this will help with crimes - you couldn't use it as evidence but it could help with suspect wanted mug shots…

No, the information in the fuzzed out images is gone and cannot ever be restored. This tech could make for a cracking TV upscaler, but not evidence.

The network was trained with images of celebs. Given a fuzzed input, it hallucinates a celeb based on the input. Given a picture of me, it would hallucinate the closest celeb-like face it could.

I look forward to TVs having a de-pixellation mode to make tamed TV footage rude again, it would work very well for that :D
DanceswithUnix
The network was trained with images of celebs. Given a fuzzed input, it hallucinates a celeb based on the input. Given a picture of me, it would hallucinate the closest celeb-like face it could.
So cameo appearances are going to soar, and celebs themselves will begin to wonder if they had accidentally wandered onto the set of some film without knowing it?
DanceswithUnix
No, the information in the fuzzed out images is gone and cannot ever be restored. This tech could make for a cracking TV upscaler, but not evidence.

The network was trained with images of celebs. Given a fuzzed input, it hallucinates a celeb based on the input. Given a picture of me, it would hallucinate the closest celeb-like face it could.

I look forward to TVs having a de-pixellation mode to make tamed TV footage rude again, it would work very well for that :D

Um - that's why I stated not for evidence but for mug shots. Sure, it might be just a very generic face in the future, but it's better than some of the useless CCTV footage the police often show now when trying to find a wanted person.

Edit: Also don't forget this is version 1 of this tech. I'm sure it will improve to support more generic faces.