Computer graphics creates an image from a 3D model. Computer vision works in the opposite direction: a model of the object of interest is built and then used to identify objects in an image, in order to understand what is happening in it.
Computer vision has many applications. Carmakers are beginning to integrate it into their products as they move toward autonomous vehicles. Parts manufacturers are using it to facilitate robot gripping and to detect parts with defects. Computer vision is also widely used to analyze medical and satellite images. In fact, it can be found in virtually any field.
Computer vision systems are often built using machine learning methods such as neural networks. These artificial intelligence methods do not rely on hand-programmed rules (e.g., if-then-else rules) to perform tasks. Instead, they use examples to provide the information a model needs (e.g., colour, weight, shape, size). This way, you don’t need a domain expert to develop a computer vision system. Given enough data, the model itself can extract the information needed to perform the task.
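To make the contrast between rules and examples concrete, here is a minimal sketch in Python. It is not a neural network: a nearest-centroid classifier stands in as the simplest possible learner, and the part names, features (weight in grams, length in millimetres) and values are all invented for illustration.

```python
def fit_centroids(examples):
    """Learn one average feature vector (centroid) per label from labelled examples."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical training examples: (weight in g, length in mm) -> part type.
EXAMPLES = [
    ((20.0, 50.0), "bolt"), ((22.0, 55.0), "bolt"), ((19.0, 48.0), "bolt"),
    ((5.0, 12.0), "washer"), ((6.0, 14.0), "washer"), ((4.0, 10.0), "washer"),
]

if __name__ == "__main__":
    model = fit_centroids(EXAMPLES)
    print(predict(model, (21.0, 52.0)))
    print(predict(model, (5.0, 11.0)))
```

No threshold such as "heavier than 10 g means bolt" is written anywhere in the code. The decision boundary comes entirely from the labelled examples, so adding a new part type only requires new examples, not new rules.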
Costly Computer Vision Techniques
Modern computer vision techniques are based on deep learning, using networks composed of several layers of representation. These techniques are complex and not easily accessible. Training a model requires thousands or even millions of example images (training data) showing the object to be recognized. And that’s not all: a human expert must first annotate the object of interest in these images to help the computer interpret them correctly, which is a fairly costly process. Yet many companies with limited resources, such as SMEs, now need machine vision to remain competitive in the international market.
At the ÉTS Imaging, Vision and Artificial Intelligence Laboratory (LIVIA), I’m working with my students to reduce the complexity of computer vision methods and make them more accessible to SMEs, in an effort to democratize them. Specifically, we are developing learning algorithms that require less data and fewer annotations for training, and models that cost less to compute during training and deployment.
Learning Based on Importance Sampling
To train a computer vision model and also learn its configuration (e.g., the number of neurons or layers in an artificial neural network), we use learning techniques based on importance sampling. This method can train a better model, faster. It consists of repeating the following series of operations:
Select a sample data set according to the learned distribution;
Update the model based on the sample taken;
Update the distribution based on the model’s performance.
By sampling both the data and the model configuration, this method quickly finds the best possible model configurations.
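The three-step loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the laboratory's actual method: the model is a one-feature logistic regression rather than a deep network, only the data are sampled (sampling of model configurations is left out), the distribution is simply made proportional to each example's most recent loss, and every name and parameter (`train`, `smoothing`, `make_toy_data`, and so on) is hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def example_loss(w, b, x, y):
    """Cross-entropy loss of the model on a single example."""
    p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def make_toy_data(n, seed=0):
    """Toy 1D data set: the label is 1 when x is positive, 0 otherwise."""
    rng = random.Random(seed)
    return [(x, 1 if x > 0 else 0) for x in (rng.uniform(-1, 1) for _ in range(n))]


def normalise(scores):
    total = sum(scores)
    return [s / total for s in scores]


def train(data, steps=2000, batch_size=8, lr=0.2, smoothing=0.1, seed=0):
    rng = random.Random(seed)
    n = len(data)
    w, b = 0.0, 0.0
    scores = [1.0] * n  # unnormalised sampling scores; uniform at the start
    for _ in range(steps):
        probs = normalise(scores)
        # Step 1: select a sample according to the learned distribution.
        batch = rng.choices(range(n), weights=probs, k=batch_size)
        # Step 2: update the model on the sample. The importance weight
        # 1 / (n * p_i) keeps the gradient an unbiased estimate of the
        # gradient of the average loss over the whole data set.
        grad_w = grad_b = 0.0
        for i in batch:
            x, y = data[i]
            err = sigmoid(w * x + b) - y
            weight = 1.0 / (n * probs[i])
            grad_w += weight * err * x
            grad_b += weight * err
        w -= lr * grad_w / batch_size
        b -= lr * grad_b / batch_size
        # Step 3: update the distribution from the model's performance.
        # Examples with a high loss are drawn more often; `smoothing`
        # keeps every probability strictly positive.
        for i in set(batch):
            x, y = data[i]
            scores[i] = smoothing + example_loss(w, b, x, y)
    return w, b, normalise(scores)


def accuracy(w, b, data):
    return sum((sigmoid(w * x + b) > 0.5) == (y == 1) for x, y in data) / len(data)


if __name__ == "__main__":
    data = make_toy_data(200, seed=1)
    w, b, probs = train(data)
    print("accuracy:", accuracy(w, b, data))
    hardest = max(range(len(data)), key=probs.__getitem__)
    print("most frequently sampled x:", data[hardest][0])
```

In this sketch, examples close to the decision boundary keep a high loss and are therefore drawn more often, while examples the model already handles well are revisited less. Extending the same idea to model configurations would mean keeping a second distribution over candidate configurations and updating it from the performance of the models it produces.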
Conclusion
Some of society’s biggest challenges, such as the environment, communications and labour shortages, will require cutting-edge technologies like computer vision. It’s up to us to make the best possible use of them to improve our society.