Google Has Another Machine Vision Breakthrough?

via I Programmer

Google Research has just released details of a machine vision technique that might bring high-powered visual recognition to ordinary desktop and even mobile computers. It claims to be able to recognize 100,000 different types of object within a photo in a few minutes, and no deep neural network (DNN) is mentioned.

There has always been a basic split in machine vision work. The engineering approach tries to solve the problem by treating it as a signal detection task using standard engineering techniques. The “softer” approach tries to build systems that work more the way humans do. Recently the human-inspired approach has seemed to be on top, with DNNs learning to recognize important features from sample videos. This is impressive and important, but, as is often the case, the engineering approach has a trick or two up its sleeve as well.

In this case the improvements are to the fairly standard technique of applying convolutional filters to an image to pick out objects of interest. The big problem with convolutional filters is that you need at least one per object type you are looking for: a cat filter, a dog filter, a human filter and so on. Because the time it takes to apply each filter doesn’t scale well with image size, most approaches that use this method are limited to a small number of object categories.
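To make the scaling problem concrete, here is a minimal Python sketch (not the paper’s code; the image size, patch size and filter bank are illustrative assumptions) of a naive bank of linear convolution filters, where the work is repeated once per category:

```python
# Illustrative sketch only: why one-filter-per-category gets expensive.
# The cost grows linearly with the number of filters, so scanning a photo
# for 100,000 categories this way is impractical.
import numpy as np

def filter_responses(image, filters, patch_size=8):
    """Slide every filter over every patch and record its response.
    `image` is a 2-D grayscale array; `filters` is a list of
    patch_size x patch_size arrays, one per object category (hypothetical)."""
    h, w = image.shape
    responses = np.zeros((len(filters), h - patch_size + 1, w - patch_size + 1))
    for k, f in enumerate(filters):                      # one full pass per category
        for y in range(h - patch_size + 1):
            for x in range(w - patch_size + 1):
                patch = image[y:y + patch_size, x:x + patch_size]
                responses[k, y, x] = np.sum(patch * f)   # linear convolution step
    return responses

img = np.random.rand(32, 32)
bank = [np.random.rand(8, 8) for _ in range(5)]          # imagine 100,000 of these
out = filter_responses(img, bank)                        # shape: (5, 25, 25)
```

Doubling the number of filters doubles the runtime, which is what keeps naive filter banks down to a handful of object classes.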

This year’s winner of the CVPR Best Paper Award, co-authored by Googlers Tom Dean, Mark Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan and Jay Yagnik, describes technology that speeds things up so that many thousands of object categories can be used and the results can be produced in a few minutes on a standard computer.

The technique is complicated, but in essence it uses hashing to avoid recomputing everything each time. Locality-sensitive hashing is used to look up the result of each step of the convolution: that is, instead of applying a mask to the pixels and summing the result, the pixels are hashed and the hash is used as a key into a table of precomputed results. A rank-ordering method is also used to indicate which filters are likely to be the best matches and therefore worth full evaluation. The use of ordinal convolution in place of linear convolution seems to be as important as the use of hashing.
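To illustrate the flavour of the idea, the sketch below uses a winner-take-all style ordinal hash, one rank-order-based form of locality-sensitive hashing, to index a toy bank of filters and shortlist candidates before any exact convolution is run. The hash scheme, table layout and every parameter value here are illustrative assumptions rather than the paper’s implementation:

```python
# Illustrative sketch only: replace per-filter dot products with hash lookups.
import numpy as np
from collections import defaultdict

def wta_hash(vec, perms, k=4):
    """Ordinal hash: for each random permutation keep only the index of the
    largest of the first k permuted values. Vectors with similar rank order
    tend to collide, which gives the locality-sensitive property."""
    return tuple(int(np.argmax(vec[p[:k]])) for p in perms)

rng = np.random.default_rng(0)
dim, n_filters, n_perms = 64, 1000, 16
filters = rng.standard_normal((n_filters, dim))
perms = [rng.permutation(dim) for _ in range(n_perms)]

# Offline: hash every filter once and index it by each band of its code.
table = defaultdict(set)
for f_id, f in enumerate(filters):
    for band, val in enumerate(wta_hash(f, perms)):
        table[(band, val)].add(f_id)

# Online: hash a patch descriptor and count band collisions instead of
# evaluating all 1000 filters; only the top-voted candidates get the full
# exact dot product afterwards.
patch = rng.standard_normal(dim)
votes = defaultdict(int)
for band, val in enumerate(wta_hash(patch, perms)):
    for f_id in table[(band, val)]:
        votes[f_id] += 1
candidates = sorted(votes, key=votes.get, reverse=True)[:10]
exact_scores = {f_id: float(filters[f_id] @ patch) for f_id in candidates}
```

The point of such a scheme is that most filters are never evaluated exactly: cheap table lookups and vote counts stand in for the vast majority of the dot products.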

The result of the change to the basic algorithm is a speed-up of around 20,000 times, which is astounding.
