Demonstrator of bio-inspired visual attention

Bio-inspired visual attention demonstrator

Although there are several models of visual saliency, in terms of contrast and cognition, there is no hybrid model integrating both mechanisms of attention: the visual aspect and the cognitive aspect. In this work, we proposed a combined visual attention model including bottom-up and top-down directions of attention, as well as decision module which completes the combined attention module.
We developed the bottom-up model based on several visual saliency concepts, and we used genetic algorithm-based approach to tune this model. And we used several state-of-art object recognition techniques, for the top-down attention. The simulations and experimentation have shown the emergence of human-like attention (eye fixation).

Let us assume a complex visual attention problem: The robot has to find fire extinguisher, which is situated somewhere in the building. In our experiement, the room contains several objects, both known and not known to the robot, which are not of the interest for it in this particular task. If the robot (pepper robot) doesn’t see the exact item of interest (FireExtinguisher), it turns around for 60 degrees, and starts new iteration. When the robot sees the already seen scene, is starts to search for evacuation door in order to get out of the room where it has already seen everything. The most important results of the experiment are shown in the figure below. First column shows the original input images from the most important iterations – robot sees non-interesting but known objects, robot sees human, robot sees evacuation door, robot sees desired object. Second column represents the input images with the bounding rectangles, depicting the different recognition results. Orange rectangles show recognition of known objects by keypoint-based BRISK algorithm, and blue rectangles show recognition of faces by pattern-based Viola-Jones framework. Third column shows the final saliency map, mixed already from bottom-up saliency map and top-down importance map where the red point represents the most important object (eye fixation).