Cut Deep: Cutting Deep Network Layers for Efficient Representation Learning

Deep learning has revolutionized several areas of research, including computer vision and robotics. Over the last decade, much work has developed improved deep architectures that better learn internal representations tuned to the end task, for example, contrastive representation learning for visual place recognition. This progress has also been driven by the ever-increasing computational power of GPUs. However, robots, as embodied agents, can only leverage GPU power in a limited manner due to constraints of power, size, weight, and cost. To mitigate this, several methods have been developed that enable deep learning architectures to run on CPUs or smaller GPUs, including network pruning, weight quantization, and knowledge distillation, to name a few. Distinct from and complementary to these methods, this PhD project will explore how deep architectures themselves can be made smaller. This can be achieved by reducing the total number of layers, but only after first understanding what function additional layers perform beyond expanding the receptive field to learn long-range features.
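To make the contrast concrete, one of the compression approaches mentioned above, magnitude-based weight pruning, can be sketched in a few lines. This is an illustrative toy implementation, not code from the project or any specific library; the function name and interface are assumptions:

```python
import numpy as np

def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Illustrative sketch of magnitude pruning: unlike layer removal (this
    project's focus), the network keeps its depth and only individual
    weights are zeroed.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune half the weights of a random 8x8 matrix
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
pruned = prune_by_magnitude(w, sparsity=0.5)
print(f"fraction zeroed: {np.mean(pruned == 0):.2f}")  # fraction zeroed: 0.50
```

Note that pruning leaves the layer count, and hence the sequential depth of the computation, unchanged, which is precisely why shrinking the architecture itself is a distinct direction.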

The PhD student will have an excellent opportunity to dive deep into deep learning, building a stronger understanding of it and devising novel architectures and corresponding learning techniques, with an initial focus on visual representation learning.

Chief Investigators