Curiosity-driven Learning for Visual Understanding
Curiosity guides humans to learn efficiently. It incentivizes us to spend more time and energy examining new, unexpected things and to disregard those we already fully understand. Much of today's visual learning is passive: learning systems are exposed to large amounts of training data and learn from each sample multiple times, regardless of how well they can already recognize it. This makes the process slow, especially given the growing number of samples in datasets. In this project, our goal is to give learning systems a form of curiosity based on how well they already understand the world around them. This curiosity would make learning faster and more efficient.
For example, in the classic case of "passive" computer vision, systems could choose the order in which data is presented, or disregard data they have already understood, to reduce the time it takes to learn. Taking this active approach one step further, curiosity can also encourage intelligent systems to make predictions about the visual world and validate them through action. Within the scope of this project, this could be done initially in one of the available action simulators (such as Gibson-env), for tasks like recognition, segmentation, or depth and motion estimation. We expect that, compared with the current learning strategy (choosing data at random and being exposed to it passively), the curiosity-driven approach would make learning more effective and efficient.
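As a rough illustration of the first idea (choosing which data to study and skipping what is already understood), the sketch below ranks samples by their current loss. It is only one possible reading: using per-sample loss as a stand-in for curiosity is an assumption, and the function name `select_curious_batch`, the `understood_threshold` parameter, and the loss values are hypothetical, not part of the project.

```python
from typing import List, Sequence


def select_curious_batch(
    losses: Sequence[float],
    batch_size: int,
    understood_threshold: float = 0.05,
) -> List[int]:
    """Return indices of the samples the model should study next.

    Assumption: a sample's current loss is used as a proxy for how
    "surprising" it is. Samples with loss at or below
    `understood_threshold` are treated as already understood and skipped;
    the rest are ranked from highest to lowest loss.
    """
    candidates = [i for i, loss in enumerate(losses) if loss > understood_threshold]
    candidates.sort(key=lambda i: losses[i], reverse=True)
    return candidates[:batch_size]


# Hypothetical per-sample losses from the model's latest evaluation pass.
current_losses = [0.01, 0.90, 0.30, 0.02, 0.75]
batch = select_curious_batch(current_losses, batch_size=2)
print(batch)  # indices of the two least-understood samples
```

In a real training loop the losses would be refreshed periodically, since a sample that is well understood now may not remain so as the model changes. Other signals, such as prediction error of a learned forward model, could replace the loss here.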